如何在2个相关列表中找到最常见的名字



我想向社区寻求帮助。我这里有两个相关的列表:

names = ['alan_grant', 'alan_grant', 'alan_grant', 'alan_grant', 'alan_grant', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'john_hammond', 'john_hammond', 'john_hammond', 'john_hammond', 'john_hammond', 'owen_grady', 'owen_grady', 'owen_grady', 'owen_grady', 'owen_grady']
votes = [True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, True, True, True]

votes列表是人脸识别算法匹配相应的names列表的结果。然后我将把每个True投票与相应的名字联系起来,并找到出现频率最高的名字作为最终的"获胜者"。

我试过两种方法:

characters = {}
for name, vote in list(zip(names, votes)):
if vote == True:
characters[name] = characters.get(name, 0) + 1
#print(characters)
print(max(characters, key=characters.get))

输出为'owen_grady'

from collections import Counter
characters = [name for name, vote in list(zip(names, votes)) if vote == True]
#print(characters)
print(Counter(characters).most_common()[0][0])

输出也是"owen_grady"。哪一种方式更有效率:字典?还是带有计数器的列表推导?

我的终极问题:有没有其他方法(最有效的))才能得到结果?我希望输出只是'owen_grady'

您可以使用itertools.compress()过滤所有false条目。选项与Counter应该是最有效的,只需使用n参数在.most_common()让它返回一个单对。代码:

from itertools import compress
from collections import Counter
names = ['alan_grant', 'alan_grant', 'alan_grant', 'alan_grant', 'alan_grant', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'john_hammond', 'john_hammond', 'john_hammond', 'john_hammond', 'john_hammond', 'owen_grady', 'owen_grady', 'owen_grady', 'owen_grady', 'owen_grady']
votes = [True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, True, True, True]
most_common = Counter(compress(names, votes)).most_common(1)[0][0]
# Or with some syntax sugar:
# [(most_common, _)] = Counter(compress(names, votes)).most_common(1)

乌利希期刊指南。我做了一些基准测试,似乎对于这个特殊情况,稍微优化的第一种方法表现出更好的性能:

from itertools import compress
names = ['alan_grant', 'alan_grant', 'alan_grant', 'alan_grant', 'alan_grant', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'john_hammond', 'john_hammond', 'john_hammond', 'john_hammond', 'john_hammond', 'owen_grady', 'owen_grady', 'owen_grady', 'owen_grady', 'owen_grady']
votes = [True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, True, True, True]
characters = list(compress(names, votes))
most_common = max(set(characters), key=characters.count)

除了zip()功能之外,您可以尝试使用Counter模块解决方案:

names = ['alan_grant', 'alan_grant', 'alan_grant', 'alan_grant', 'alan_grant', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'claire_dearing', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ellie_sattler', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'ian_malcolm', 'john_hammond', 'john_hammond', 'john_hammond', 'john_hammond', 'john_hammond', 'owen_grady', 'owen_grady', 'owen_grady', 'owen_grady', 'owen_grady']
votes = [True, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, False, True, True, True, True]
from collections import Counter
import itertools
r = Counter(zip(names, votes))
for i in list(r.keys()):
if i[1] == False:
del r[i]
print(r)

输出:

Counter({('owen_grady', True): 4, ('alan_grant', True): 1})

最新更新