Generate a random sample from a population so that it has no elements in common with another list



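I want to sample elements from a list so that none of the sampled elements appears in another given list. I want to keep generating new samples until I get one that does not intersect. The code below is what I came up with, but whenever the initial sample intersects, it does not work: it goes into an infinite loop, and the prints show that all generated samples are the same.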

import random 
unique_entities=['100','1001','10001','100001','11111']
pde_fin= ['2151', '2146', '2153', '2135', '2158', '2160', '2137', '2169', '2147', '2015', '2022', '2173', '2028', '2014', '2018', '2009', '1140', '1085', '1136', '1132', '1007', '1080', '1078', '1131', '1106', '1164', '1092', '1108', '1118', '1045', '1051', '1006','1001']
random_entities=random.sample(unique_entities,3) #choses 5 unique entities 
while(not(set(random_entities).isdisjoint(pde_fin))):
    random_entites=random.sample(unique_entities,5)
    print(random_entities,"random_entites")
print(unique_entities)

Can you help me understand what is going wrong?

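The line random_entites=random.sample(unique_entities,5) has two problems:

  • First, there is a typo: you wrote random_entites instead of random_entities. The loop therefore assigns to a new variable and never updates random_entities, so the loop condition never changes and the same sample is printed every time.
  • Second, you draw a sample of 5 elements from unique_entities, which contains only 5 elements in total. The sample therefore always contains '1001', which is also in pde_fin.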
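
To see the second point concretely, here is a small check that uses only the list from the question. Drawing 5 elements out of 5 just shuffles the list, so every such sample contains '1001'. The names samples, always_contains_1001 and same_elements_every_time are made up for this illustration.

```python
import random

unique_entities = ['100', '1001', '10001', '100001', '11111']

# Sampling 5 of 5 elements only reorders the list; its contents never change.
samples = [random.sample(unique_entities, 5) for _ in range(1000)]

always_contains_1001 = all('1001' in s for s in samples)
same_elements_every_time = all(sorted(s) == sorted(unique_entities) for s in samples)

print(always_contains_1001, same_elements_every_time)  # True True
```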

Here is a working version of the program, which includes a few other tweaks:

import random
unique_entities = ['100', '1001', '10001', '100001', '11111']
pde_fin = ['2151', '2146', '2153', '2135', '2158', '2160', '2137', '2169', '2147', '2015', '2022', '2173', '2028',
           '2014', '2018', '2009', '1140', '1085', '1136', '1132', '1007', '1080', '1078', '1131', '1106', '1164',
           '1092', '1108', '1118', '1045', '1051', '1006', '1001']
sample_size = 3
random_entities = set(random.sample(unique_entities, sample_size))
print(f"{random_entities=}")
while not random_entities.isdisjoint(pde_fin):
    random_entities = set(random.sample(unique_entities, sample_size))
    print(f"{random_entities=}")
print(f"Result: {random_entities}")
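
One caveat with a retry loop like this: if no disjoint sample exists at all (for example, when the sample size is larger than the number of elements outside pde_fin), it never terminates. A possible safeguard, as a rough sketch only, is to cap the number of retries. The function name sample_disjoint and the max_tries parameter are hypothetical and not part of the original code, and pde_fin is shortened here for brevity.

```python
import random


def sample_disjoint(population, excluded, k, max_tries=1000):
    """Redraw samples of size k until one shares no element with `excluded`."""
    excluded = set(excluded)
    for _ in range(max_tries):
        candidate = random.sample(population, k)
        if excluded.isdisjoint(candidate):
            return candidate
    # Give up instead of looping forever when a disjoint sample is unlikely or impossible.
    raise RuntimeError(f"no disjoint sample found in {max_tries} tries")


unique_entities = ['100', '1001', '10001', '100001', '11111']
pde_fin = ['2151', '1006', '1001']  # shortened list, for illustration only

result = sample_disjoint(unique_entities, pde_fin, 3)
print(result)
```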

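Alternatively, you can filter unique_entities before sampling. Mathematically, filtering before sampling and rejecting intersecting samples afterwards are equivalent in terms of randomness.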

import random

unique_entities=['100','1001','10001','100001','11111']
pde_fin= ['2151', '2146', '2153', '2135', '2158', '2160', '2137', '2169', '2147', '2015', '2022', '2173', '2028', '2014', '2018', '2009', '1140', '1085', '1136', '1132', '1007', '1080', '1078', '1131', '1106', '1164', '1092', '1108', '1118', '1045', '1051', '1006','1001']
# Keep only the entities that do not appear in pde_fin
unique_entities_unique = [i for i in unique_entities if i not in pde_fin]
random_entities=random.sample(unique_entities_unique,3)
print(random_entities,"random_entities")
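
If you go with filtering, keep in mind that random.sample raises ValueError when the requested sample is larger than the population, which can happen once elements have been filtered out. Below is a minimal sketch that wraps the filtering approach in a function and checks the size explicitly. The name sample_excluding is hypothetical, and pde_fin is shortened for brevity.

```python
import random


def sample_excluding(population, excluded, k):
    """Sample k elements from `population`, skipping anything in `excluded`."""
    excluded = set(excluded)  # a set makes each membership test fast
    allowed = [x for x in population if x not in excluded]
    if k > len(allowed):
        raise ValueError(
            f"need {k} elements but only {len(allowed)} remain after filtering"
        )
    return random.sample(allowed, k)


unique_entities = ['100', '1001', '10001', '100001', '11111']
pde_fin = ['2151', '1006', '1001']  # shortened list, for illustration only

picked = sample_excluding(unique_entities, pde_fin, 3)
print(picked)
```

Unlike the retry loop, this always finishes after a single draw, since every element it can pick is already known to be outside pde_fin.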
