Searching a log text (.txt) file for keywords, ignoring case

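I have a log file of a conversation. I want to search the file for a set of keywords I have defined, but the log may contain those keywords in uppercase, lowercase, or title case.

I can extract the lines that contain the lowercase form of a keyword, but I cannot get the lines with the uppercase or title-case forms. How can I solve this?

I have tried using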

if (words.title() and words.lower()) in line:
     print (searchInLines[i])

but this doesn't seem to work.

keywords=['bimbo', 'qualified', 'tornadoes', 'alteryx', 'excel', 'manchester']

with open("recognition_log.txt", "r", encoding="utf8") as f:
    searchInLines = f.readlines()
    f.close()
for words in keywords:
    for i, line in enumerate(searchInLines):
        if (words.title() and words.lower()) in line:
            print (searchInLines[i])

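For example, the log file contains the following sentence:

"Manchester United played Barcelona yesterday, however, manchester united lost"

I have 'manchester' in my keywords, so it picks up the second occurrence but not the first.

How can I get it to recognise both?

Thanks in advance!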

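I'm not entirely sure what you are trying to do, but I take it you want to filter out the messages (lines) that contain one of the words in keywords. Here is a simple way: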

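keywords=['bimbo', 'qualified', 'tornadoes', 'alteryx', 'excel', 'manchester']
with open("recognition_log.txt", "r", encoding="utf8") as f:
    searchInLines = f.readlines()  # the with block closes the file on exit
for line in searchInLines:
    for keyword in keywords:
        # the keywords are already lowercase, so lowercase the line before comparing
        if keyword in line.lower():
            print(line)
            break  # print each matching line only once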
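
The same idea can be wrapped in a small function, which makes it easy to try on a few lines without a log file. This is only a sketch: the name `lines_with_keywords` and the sample lines are made up for illustration, and it swaps `lower()` for `str.casefold()`, which behaves the same for plain ASCII keywords like these.

```python
def lines_with_keywords(lines, keywords):
    """Return the lines that contain at least one keyword, ignoring case."""
    folded_keywords = [k.casefold() for k in keywords]
    matches = []
    for line in lines:
        folded_line = line.casefold()
        # any() stops at the first keyword found, so a line is kept only once
        if any(k in folded_line for k in folded_keywords):
            matches.append(line)
    return matches


# Hypothetical sample lines standing in for the contents of the log file
sample = [
    "Manchester United played Barcelona yesterday\n",
    "however, manchester united lost\n",
    "nothing relevant here\n",
]
result = lines_with_keywords(sample, ["manchester", "excel"])
```

With the sample above, `result` holds the first two lines and skips the third.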

Use a regular expression.

Ex:

import re
keywords=['bimbo', 'qualified', 'tornadoes', 'alteryx', 'excel', 'manchester']

with open("recognition_log.txt", "r", encoding="utf8") as f:
    searchInLines = f.readlines()
#pattern = re.compile("(" + "|".join(keywords) + ")", flags=re.IGNORECASE)
# \b restricts each keyword to whole-word matches
pattern = re.compile("(" + "|".join(r"\b{}\b".format(i) for i in keywords) + ")", flags=re.IGNORECASE)
for line in searchInLines:
    if pattern.search(line):
        print(line)
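
To see what the word boundaries buy you, here is a minimal sketch under a couple of assumptions: `build_pattern` is a hypothetical helper name, the test strings are invented, and `re.escape` is added as a precaution in case a keyword ever contains regex metacharacters.

```python
import re


def build_pattern(keywords):
    """Compile one case-insensitive pattern that matches any keyword as a whole word."""
    escaped = (re.escape(k) for k in keywords)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", flags=re.IGNORECASE)


pattern = build_pattern(["excel", "manchester"])

# A keyword standing on its own matches regardless of case
whole_word = bool(pattern.search("EXCEL sheet attached"))

# A keyword embedded in a longer word is rejected by the \b boundaries
inside_word = bool(pattern.search("an excellent result"))
```

Here `whole_word` is `True` and `inside_word` is `False`. Without `\b`, as in the commented-out pattern, "excellent" would also count as a hit for "excel".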

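First of all, you don't need f.close() when you are using a context manager.

As for the solution, I would suggest using a regular expression in this case: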

import re
keywords=['bimbo', 'qualified', 'tornadoes', 'alteryx', 'excel', 'manchester']
# Compile a regex pattern from the keyword list
pattern = re.compile('|'.join(keywords))
with open("recognition_log.txt", "r", encoding="utf8") as f:
    searchInLines = f.readlines()
for line in searchInLines:
    # Lowercase the line so it can match the lowercase keywords
    if pattern.search(line.lower()):
        print(line)

You can convert both the line and the keyword to uppercase (or lowercase) and compare them.

keywords = ['bimbo', 'qualified', 'tornadoes', 'alteryx', 'excel', 'manchester']
with open("test.txt", "r", encoding="utf8") as f:
    searchInLines = f.readlines()
for words in keywords:
    for i, line in enumerate(searchInLines):
        # Compare both sides in uppercase so the original casing no longer matters
        if words.upper() in line.upper():
            print(searchInLines[i])

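(1) Well, your words are already lowercase, so "words.lower()" has no effect. (2) If the line doesn't contain both "Manchester" and "manchester", your example sentence won't be found, because you are using "and" logic. (3) I believe what you want is: "if words in line.lower():"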
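
A short sketch of what the condition from the question evaluates to may help here. Strictly speaking, in Python `x and y` returns `y` whenever `x` is a non-empty string, so the test ends up checking only the lowercase form. The sample line is made up for illustration.

```python
words = "manchester"
line = "Manchester United played Barcelona yesterday"

# `and` returns its second operand when the first is truthy,
# so this is simply the lowercase keyword
combined = words.title() and words.lower()

# The question's test: a case-sensitive search for the lowercase form only
original_test = combined in line

# The suggested fix: lowercase the line before searching
fixed_test = words in line.lower()
```

Here `combined` is `"manchester"`, `original_test` is `False`, and `fixed_test` is `True`.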
