How to filter two lists of dictionaries in Python 3.8



Forgive me if this topic already exists, but I could not find it…

I have three lists of dictionaries:

list_1 = [
    {'name': "Leonardo Di Caprio", 'films': ["The revenant", "Titanic", "The wold of Wall Street"]},
    {'name': "Will Smith", 'films': ["I am a legend", "The pursuit of happyness"]},
    {'name': "Robert De Niro", 'films': ["Taxi driver", "The godfather"]}
]
list_2 = [
    {'name': "Leonardo Di Caprio", 'films': ["Titanic", "The revenant", "The wold of Wall Street"]},
    {'name': "Will Smith", 'films': ["I am a legend", "The pursuit of happyness", "Aladdin"]},
    {'name': "Robert De Niro", 'films': ["Taxi driver", "The godfather"]}
]
list_final = [
    {'name': "Tom Hanks", 'films': ["Forest Gump", "Cast Away", "Greyhound"]},
    {'name': "Will Smith", 'films': ["I am a legend", "The pursuit of happyness"]},
    {'name': "Tom Cruise", 'films': ["Top Gun", "Mission impossible"]},
    {'name': "Robert De Niro", 'films': ["Taxi driver", "The godfather"]},
    {'name': "Leonardo Di Caprio", 'films': ["Titanic", "The revenant", "The wold of Wall Street"]},
    {'name': "Harrison Ford", 'films': ["Blade Runner", "Indiana Jones"]},
    {'name': "Morgan Freeman", 'films': ["Seven"]}
]

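I want to write a function that takes two lists of dictionaries as arguments and returns a boolean. The goal is to check whether list_1 is contained in list_final.
By "is contained" I mean:

  • every actor name in list_1 must appear in list_final (in any order)
  • every film listed for a given actor in list_1 must also be listed for that actor in list_final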

I have working code:

from typing import Dict, List

def isContained(l1: List[Dict[str, List]], l_final: List[Dict[str, List]]) -> bool:
    for elem in l1:
        findOccurence = False
        # Scan the whole final list for an entry with the same name and all the films
        for element in l_final:
            if elem['name'] == element['name'] and all(item in element['films'] for item in elem['films']):
                findOccurence = True
        if not findOccurence:
            return False
    return True

print(isContained(list_1, list_final)) # True
print(isContained(list_2, list_final)) # False
print(isContained(list_1, list_2)) # True
print(isContained(list_2, list_1)) # False

Output:

root@root:/tmp/TEST_PYTHON$ python3 main.py
True
False
True
False

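So it works, but I am sure there is a more efficient way to write it. What bothers me is that the whole final list is iterated once for every element of list_1. Any suggestions?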
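One possible direction that keeps the list-of-dictionaries format is sketched below: build a name-to-films lookup from the final list once, then check each entry of the first list against it. The names `is_contained_indexed` and `films_by_name` are hypothetical, and the sketch assumes each actor name appears at most once in the final list.

```python
from typing import Dict, List, Set


def is_contained_indexed(l1: List[Dict], l_final: List[Dict]) -> bool:
    # Build the lookup once: actor name -> set of films.
    # Assumption: each name appears at most once in l_final.
    films_by_name: Dict[str, Set[str]] = {
        entry['name']: set(entry['films']) for entry in l_final
    }
    # Each actor of l1 must be known, and all of their films must be present.
    return all(
        entry['name'] in films_by_name
        and set(entry['films']) <= films_by_name[entry['name']]
        for entry in l1
    )


# Small illustrative data, not the full lists from the question.
small_final = [
    {'name': "Will Smith", 'films': ["I am a legend", "The pursuit of happyness"]},
    {'name': "Morgan Freeman", 'films': ["Seven"]},
]
subset = [{'name': "Will Smith", 'films': ["I am a legend"]}]
extra_film = [{'name': "Will Smith", 'films': ["I am a legend", "Aladdin"]}]
unknown = [{'name': "Tom Hanks", 'films': ["Cast Away"]}]

print(is_contained_indexed(subset, small_final))      # True
print(is_contained_indexed(extra_film, small_final))  # False
print(is_contained_indexed(unknown, small_final))     # False
```

With this layout the final list is traversed once to build the lookup, and each name check afterwards is a dictionary access rather than a scan.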

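After tweaking your data structure a little to make things more efficient…

This can be done with the intersection operator & and the issubset method on sets.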
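For readers less familiar with these two set operations, here is a minimal illustration on made-up film sets (the variable names `a`, `b` and `common` are only for this example):

```python
a = {"Titanic", "The revenant"}
b = {"Titanic", "The revenant", "Aladdin"}

common = a & b            # intersection: elements present in both sets
print(common == a)        # True, because every element of a is also in b
print(a.issubset(b))      # True
print(b.issubset(a))      # False, "Aladdin" is missing from a
print(a <= b)             # True, <= is the operator form of issubset
```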

list_1 = {
    "Leonardo Di Caprio": {"The revenant", "Titanic", "The wold of Wall Street"},
    "Will Smith": {"I am a legend", "The pursuit of happyness"},
    "Robert De Niro": {"Taxi driver", "The godfather"}
}
list_2 = {
    "Leonardo Di Caprio": {"Titanic", "The revenant", "The wold of Wall Street"},
    "Will Smith": {"I am a legend", "The pursuit of happyness", "Aladdin"},
    "Robert De Niro": {"Taxi driver", "The godfather"}
}
list_final = {
    "Tom Hanks": {"Forest Gump", "Cast Away", "Greyhound"},
    "Will Smith": {"I am a legend", "The pursuit of happyness"},
    "Tom Cruise": {"Top Gun", "Mission impossible"},
    "Robert De Niro": {"Taxi driver", "The godfather"},
    "Leonardo Di Caprio": {"Titanic", "The revenant", "The wold of Wall Street"},
    "Harrison Ford": {"Blade Runner", "Indiana Jones"},
    "Morgan Freeman": {"Seven"}
}
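If the data already exists in the list-of-dictionaries form from the question, a conversion along these lines could produce the dictionary-of-sets layout shown above. The helper name `to_film_sets` is hypothetical, and merging the films of duplicate names is an assumption of this sketch, not something the original post specifies.

```python
from typing import Dict, List, Set


def to_film_sets(actors: List[Dict]) -> Dict[str, Set[str]]:
    film_sets: Dict[str, Set[str]] = {}
    for entry in actors:
        # setdefault creates an empty set on first sight of a name;
        # if a name appears again, its films are merged into the same set.
        film_sets.setdefault(entry['name'], set()).update(entry['films'])
    return film_sets


# Small illustrative data, not the full lists from the question.
as_list = [
    {'name': "Robert De Niro", 'films': ["Taxi driver", "The godfather"]},
    {'name': "Morgan Freeman", 'films': ["Seven"]},
]
as_dict = to_film_sets(as_list)
print(as_dict["Morgan Freeman"] == {"Seven"})  # True
print(sorted(as_dict))  # ['Morgan Freeman', 'Robert De Niro']
```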

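def isContained(l1, l_final) -> bool:
    # First version, kept for reference. Note the bug: l1's keys are compared
    # with themselves, so this condition is always true.
    if set(l1.keys()).issubset(set(l1.keys())):
        for key in set(l1.keys()) & set(l_final.keys()):
            if not l1[key].issubset(l_final[key]):
                return False
    return True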

Following @Stef's comment, the solution has been fixed and now reads:

def isContained(l1, l_final) -> bool:
    # Every name in l1 must also be a key of l_final
    if set(l1.keys()).issubset(set(l_final.keys())):
        for key in set(l1.keys()) & set(l_final.keys()):
            # Every film of this actor in l1 must be present in l_final
            if not l1[key].issubset(l_final[key]):
                return False
    else:
        return False
    return True

An extra test case is needed to confirm that the first condition is actually enforced…

list_3 = {"Sigourney Weaver": {"Aliens"}}

and the check:

print(isContained(list_3, list_final)) # False
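As a side note, the same check can probably be written more compactly with all(). This is only a sketch of an equivalent formulation; the name `is_contained_compact` is hypothetical, and the data below is a reduced version of the dictionaries above.

```python
def is_contained_compact(l1, l_final) -> bool:
    # For each actor in l1: the name must be a key of l_final,
    # and the actor's films must be a subset of the films stored there.
    return all(
        name in l_final and films <= l_final[name]
        for name, films in l1.items()
    )


# Reduced illustrative data in the dictionary-of-sets layout.
final = {
    "Will Smith": {"I am a legend", "The pursuit of happyness"},
    "Morgan Freeman": {"Seven"},
}
known = {"Will Smith": {"I am a legend"}}
extra_film = {"Will Smith": {"I am a legend", "Aladdin"}}
unknown = {"Sigourney Weaver": {"Aliens"}}

print(is_contained_compact(known, final))       # True
print(is_contained_compact(extra_film, final))  # False
print(is_contained_compact(unknown, final))     # False
```

Because `and` short-circuits, `l_final[name]` is only evaluated when the name is present, so an unknown actor yields False rather than a KeyError.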
