Checking whether different word combinations occur in a text using Python



I want to write a function that looks for certain word combinations in a text and tells me which list they belong to. Example:

my_list1 = ["Peter Parker", "Eddie Brock"]
my_list2 = ["Harry Potter", "Severus Snape", "Dumbledore"]
Example input: "Harry Potter was very sad"
Example output: my_list2

One approach is to count how many names from each list appear in the string, then assign the string to whichever list has the most matches:

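my_list1 = ["Peter Parker", "Eddie Brock"]
my_list2 = ["Harry Potter", "Severus Snape", "Dumbledore"]

to_check = "Harry Potter was very sad"

def which_list(to_check):
    # Count how many names from each list occur as substrings of the text.
    # (Iterating over the string itself would yield single characters,
    # which can never equal a full name.)
    belong_l1 = sum(name in to_check for name in my_list1)
    belong_l2 = sum(name in to_check for name in my_list2)
    if belong_l1 > belong_l2:
        print("string belongs to list 1")
    elif belong_l1 < belong_l2:
        print("string belongs to list 2")
    else:
        print("belonging couldn't be determined")

which_list(to_check)  # prints "string belongs to list 2"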
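
The same counting idea can be sketched as a function that returns a result instead of printing it, and that accepts any number of lists. The name `best_list` and the dictionary argument are hypothetical choices for this illustration, not part of the answer above; a tie or no match at all returns `None`.

```python
def best_list(text, lists):
    """Return the name of the list with the most names found in text.

    Returns None when nothing matches or when several lists tie.
    """
    # Count substring matches for each named list.
    counts = {
        list_name: sum(name in text for name in names)
        for list_name, names in lists.items()
    }
    top = max(counts.values(), default=0)
    winners = [list_name for list_name, count in counts.items() if count == top]
    # No match at all, or several lists tied for first place: undetermined.
    if top == 0 or len(winners) != 1:
        return None
    return winners[0]


example_lists = {
    "my_list1": ["Peter Parker", "Eddie Brock"],
    "my_list2": ["Harry Potter", "Severus Snape", "Dumbledore"],
}
print(best_list("Harry Potter was very sad", example_lists))
```

Returning a value rather than printing keeps the decision separate from how it is reported.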

First, I would collect the names into a single structure of labeled groups.

lst = (
    ("list1", ("Peter Parker", "Eddie Brock")),
    ("list2", ("Harry Potter", "Severus Snape", "Dumbledore")),
    ("list3", ("Harry Potter",)),
)
while True:
    txt = input("Enter some text: ")
    if len(txt) == 0:
        break
    for names in lst:
        for name in names[1]:
            if name in txt:
                print(f"'{name}' found in {names[0]}.")

And the result:

Enter some text: Harry Potter was here
'Harry Potter' found in list2.
'Harry Potter' found in list3.
Enter some text: Harry Potter and Dumbledore were here
'Harry Potter' found in list2.
'Dumbledore' found in list2.
'Harry Potter' found in list3.
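
Note that `name in txt` is a plain, case-sensitive substring test, so a name would also match inside a longer word. If that is not wanted, one possible variation uses the standard `re` module with word boundaries. This is a sketch under that assumption and not part of the answer above; the helper name `find_names` is hypothetical.

```python
import re


def find_names(txt, groups):
    """Return (name, label) pairs for names that occur in txt as whole words, ignoring case."""
    found = []
    for label, names in groups:
        for name in names:
            # re.escape guards against regex metacharacters in a name;
            # \b requires a word boundary on both sides of the name.
            pattern = r"\b" + re.escape(name) + r"\b"
            if re.search(pattern, txt, flags=re.IGNORECASE):
                found.append((name, label))
    return found


groups = (
    ("list1", ("Peter Parker", "Eddie Brock")),
    ("list2", ("Harry Potter", "Severus Snape", "Dumbledore")),
    ("list3", ("Harry Potter",)),
)
print(find_names("harry potter was here", groups))
```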

Assuming the following data structure:

lists = {
    'my_list1': ["Peter Parker", "Eddie Brock"],
    'my_list2': ["Harry Potter", "Severus Snape", "Dumbledore"]
}

and this argument:

arg = "Harry Potter was very sad"

this comprehension returns the names of all lists that have at least one keyword occurring in the argument:

list_names = [
    list_name
    for list_name, keywords in lists.items()
    if any(kw in arg for kw in keywords)
]
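
For reuse, the comprehension could be wrapped in a small function. The name `lists_containing` is hypothetical; the logic is the same substring test as above.

```python
def lists_containing(arg, lists):
    """Return the names of all lists with at least one keyword occurring in arg."""
    return [
        list_name
        for list_name, keywords in lists.items()
        if any(kw in arg for kw in keywords)
    ]


lists = {
    "my_list1": ["Peter Parker", "Eddie Brock"],
    "my_list2": ["Harry Potter", "Severus Snape", "Dumbledore"],
}
print(lists_containing("Harry Potter was very sad", lists))  # ['my_list2']
```

Because `any` stops at the first matching keyword, each list is scanned only as far as needed.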