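I have a list containing several lists of integers, and I want to find the lists that have the most elements in common.
I tried using intersection, but it returns an empty set, because the intersection is taken over all the lists in my list at once. I want my code to show me the lists that share the number of common integers I ask for. For example, if I ask for lists that have 3 integers in common, it should show me the lists in question. I have searched a lot online, but I could only find explanations of how to tell whether two lists are identical.
Here is the code for the intersection: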
import string
list = [[3,5,9], [4,6,6], [4,7], [2,7], [2,1,4,5], [1,2,4,6], [3,3], [3,3], [3,2,1], [3,2]]
result = set.intersection(*map(set,list))
print(result)
The result is:
set()
But what I want is:
[2,1,4,5],[1,2,4,6]
First, incorrect answer (I misunderstood the requirement):
data = [[3, 5, 9], [4, 6, 6], [4, 7], [2, 7], [2, 1, 4, 5], [1, 2, 4, 6], [3, 3], [3, 3], [3, 2, 1], [3, 2]]
max_unique_elements = 0
holding = []
for data_list in data:
    unique_elements = len(set(data_list))
    if unique_elements > max_unique_elements:
        holding = [data_list]
        max_unique_elements = unique_elements
    elif unique_elements == max_unique_elements:
        holding.append(data_list)
print(holding)
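Second (I believe) correct answer. Note that it is not optimal and, as mentioned in the comments, it gives an incorrect answer if two or more groups of sets share the maximum intersection (the largest number of common elements). Also, because it uses sets, each element appears only once and order is not preserved; for example, [2,3,3,4,6] may come out as [3,2,4,6]. I will fix these issues as soon as I can, but I am on holiday at the moment, and this should give you the gist of how to solve the problem.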
data = [[3, 5, 9], [4, 6, 6], [4, 7], [2, 7], [2, 1, 4, 5], [1, 2, 4, 6], [3, 3], [3, 3], [3, 2, 1], [3, 2]]
max_intersection = 0
sets_with_max_intersection = []
# sets remove any duplicates, as duplicates only count once
# (e.g. [4, 6, 6] and [6, 2, 6] only have one element in common)
# this makes processing easier
data_sets = [set(data_list) for data_list in data]
# compare every pair of sets exactly once
for index, data_set_1 in enumerate(data_sets):
    for data_set_2 in data_sets[index + 1:]:
        common = data_set_1.intersection(data_set_2)
        # new greatest intersection found
        if len(common) > max_intersection:
            max_intersection = len(common)
            sets_with_max_intersection = [data_set_1, data_set_2]
        # equal to the current maximum: assume part of the same group and add
        # note: will give an erroneous result if two or more groups of sets
        # have the same number of elements in common
        elif len(common) == max_intersection:
            if data_set_1 not in sets_with_max_intersection:
                sets_with_max_intersection.append(data_set_1)
            if data_set_2 not in sets_with_max_intersection:
                sets_with_max_intersection.append(data_set_2)
print(max_intersection)
print(sets_with_max_intersection)
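One possible way to work around the two limitations mentioned above (ties between separate groups, and losing the original lists) is to keep each matching pair separate and report the original lists instead of the sets. This is only a minimal sketch; `pairs_with_max_intersection` is a hypothetical helper name, not something from the original answer:

```python
from itertools import combinations


def pairs_with_max_intersection(lists):
    """Return (size, pairs): the largest number of distinct elements shared
    by any two lists, and every pair of original lists that reaches it."""
    best_size = 0
    best_pairs = []
    for a, b in combinations(lists, 2):
        size = len(set(a) & set(b))  # duplicates count once
        if size > best_size:
            # strictly better: start a new collection of pairs
            best_size = size
            best_pairs = [(a, b)]
        elif size == best_size and size > 0:
            # tie with the current best: keep this pair as a separate entry
            best_pairs.append((a, b))
    return best_size, best_pairs


data = [[3, 5, 9], [4, 6, 6], [4, 7], [2, 7], [2, 1, 4, 5], [1, 2, 4, 6], [3, 3], [3, 3], [3, 2, 1], [3, 2]]
size, pairs = pairs_with_max_intersection(data)
print(size)   # 3
print(pairs)  # [([2, 1, 4, 5], [1, 2, 4, 6])]
```

Because tied pairs are stored as separate tuples, two unrelated groups with the same intersection size are not merged into one list. It still compares every pair, so it is quadratic in the number of lists.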
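What you want is to filter pairs of lists on the condition that their intersection contains 3 items.
You can get all the pairs with itertools.combinations and filter them with a list comprehension: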
from itertools import combinations
list_ = [[3,5,9], [4,6,6], [4,7], [2,7], [2,1,4,5], [1,2,4,6], [3,3], [3,3], [3,2,1], [3,2]]
print([c for c in combinations(list_, r=2) if len(set(c[0]) & set(c[1])) == 3])
The output is as required:
[([2, 1, 4, 5], [1, 2, 4, 6])]
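If the number of common integers should be chosen by the caller rather than fixed at 3, the same idea could be wrapped in a small function. The sketch below is an assumption about how that might look; the name `pairs_sharing` and its `exact` parameter are hypothetical:

```python
from itertools import combinations


def pairs_sharing(lists, n, exact=True):
    """Return pairs of lists whose sets share exactly n elements,
    or at least n elements when exact is False."""
    result = []
    for a, b in combinations(lists, 2):
        shared = len(set(a) & set(b))  # duplicates count once
        if shared == n or (not exact and shared > n):
            result.append((a, b))
    return result


list_ = [[3,5,9], [4,6,6], [4,7], [2,7], [2,1,4,5], [1,2,4,6], [3,3], [3,3], [3,2,1], [3,2]]
print(pairs_sharing(list_, 3))               # [([2, 1, 4, 5], [1, 2, 4, 6])]
print(len(pairs_sharing(list_, 2, exact=False)))  # 5
```

Note that converting to sets means repeated values count once, so [3,3] and [3,3] share one element, not two.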