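I have a binary array... What I want is to be able to pick a specific percentage of the ones from each row. Say each row has 100 ones: I want to randomly return 20% from the first row, 10% from the second, 40% from the third, and 30% from the fourth (100% in total, of course).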
0| 00000000001000000010000000000000000000001000000100000000000000000000000000000001 ...
1| 00000000000000010000000000001000000000000100000000000000000000000000000000000000 ...
2| 00000000000000000000000000000010010000000000000000000000000000010000100000000000 ...
3| 01000000000000100000000000000000000000001000100000000000000010000000000000000000 ...
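That part is easy: just do random.choice(one_idxs, %) on each row. The problem is that the result must also contain exactly 100 bits. That is, if some bits overlap between rows and both get picked, the total will be fewer than 100 distinct bits.
Also, on each row, it should try to pick bits that have not been picked before, at least as an option!
Any ideas?
For example, here is the code I use for the simple case (it does not check whether selected indices repeat across rows, only within a single row):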
# probs holds the per-row fractions, e.g. [0.2, 0.1, 0.4, 0.3]
for i in range(len(probs)):
    ones_count = 100
    bits_cnt = int(ones_count * probs[i])
    idxs = array.get_row(i).one_idxs()
    selected = np.random.choice(idxs, size=bits_cnt, replace=False)
I only need to select the ones, which is why I work with indices.
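A side note on the per-row counts: int(ones_count * probs[i]) truncates, so for fractions that do not divide the target evenly the counts can sum to less than the target. Below is a minimal sketch of one way to make them sum exactly, using largest-remainder rounding. The helper name quotas is hypothetical, and the sketch assumes the fractions sum to 1.

```python
def quotas(probs, total):
    """Split `total` into integer counts proportional to `probs`.

    Hypothetical helper; assumes the values in `probs` sum to 1.
    """
    raw = [p * total for p in probs]
    counts = [int(r) for r in raw]        # truncate each share
    shortfall = total - sum(counts)       # units lost to truncation
    # Give the leftover units to the rows with the largest fractional parts.
    order = sorted(range(len(probs)),
                   key=lambda i: raw[i] - counts[i],
                   reverse=True)
    for i in order[:shortfall]:
        counts[i] += 1
    return counts


even = quotas([0.2, 0.1, 0.4, 0.3], 100)
thirds = quotas([1 / 3, 1 / 3, 1 / 3], 100)
print(even, sum(even))
print(thirds, sum(thirds))
```

With counts computed this way, the per-row sample sizes always add up to the target before any overlap handling is applied.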
For convenience, this uses a list of strings instead of a bit array and takes 4 samples...
In [39]: data = ['10000101',
...: '11110000',
...: '00011000']
In [40]: idxs = random.sample(range(len(data[0])), 4)
In [41]: # 20% row 1, 30% row 2, 50% row 3
In [42]: row_selections = random.choices(range(len(data)), [0.2, 0.3, 0.5], k=len(idxs))
In [43]: idxs
Out[43]: [7, 3, 1, 4]
In [44]: row_selections
Out[44]: [0, 2, 0, 1]
In [45]: picks = [ data[r][c] for (r, c) in zip(row_selections, idxs)]
In [46]: picks
Out[46]: ['1', '1', '0', '0']
OK, based on your comment, this should work better as an example of how to pick only the ones from each list/array, in proportion:
import random
a1= '10001010111110001101010101'
a2= '00101010001011010010100010'
a1 = [int(t) for t in a1]
a2 = [int(t) for t in a2]
a1_one_locations= [idx for idx, v in enumerate(a1) if v==1]
a2_one_locations= [idx for idx, v in enumerate(a2) if v==1]
# lists of indices where 1 exists in each list...
print(a1_one_locations)
print(a2_one_locations)
n_samples = 6 # total desired
# 40% from a1, remainder from a2
a1_samples = int(n_samples * 0.4)
a2_samples = n_samples - a1_samples
a1_picks = random.sample(a1_one_locations, a1_samples)
a2_picks = random.sample(a2_one_locations, a2_samples)
# print results
print('indices from a1: ', a1_picks)
print('indices from a2: ', a2_picks)
Output:
[0, 4, 6, 8, 9, 10, 11, 12, 16, 17, 19, 21, 23, 25]
[2, 4, 6, 10, 12, 13, 15, 18, 20, 24]
indices from a1: [6, 21]
indices from a2: [10, 15, 4, 20]
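The example above samples each list independently, so the same index can still be picked from both lists (4, 6, 10 and 12 are ones in both). A rough sketch of one way to prefer indices that earlier rows have not used yet is shown below. The function name pick_unique is hypothetical, and the fallback (reusing already-picked indices when a row runs out of fresh ones) is an assumption about the desired behaviour, not something the question spells out.

```python
import random


def pick_unique(rows, counts):
    """Pick counts[r] one-indices from rows[r], preferring unused indices.

    rows:   list of lists, each holding the indices of the ones in a row
    counts: number of indices to take from each row
    Hypothetical helper; raises ValueError if a row has fewer ones than
    its count.
    """
    used = set()
    picks = []
    for one_idxs, k in zip(rows, counts):
        fresh = [i for i in one_idxs if i not in used]
        if len(fresh) >= k:
            chosen = random.sample(fresh, k)
        else:
            # Not enough unused indices: take every fresh one, then fill
            # the rest from indices already picked by earlier rows.
            stale = [i for i in one_idxs if i in used]
            chosen = fresh + random.sample(stale, k - len(fresh))
        used.update(chosen)
        picks.append(chosen)
    return picks


a1 = '10001010111110001101010101'
a2 = '00101010001011010010100010'
a1_ones = [idx for idx, v in enumerate(a1) if v == '1']
a2_ones = [idx for idx, v in enumerate(a2) if v == '1']

picks = pick_unique([a1_ones, a2_ones], [2, 4])
print('indices from a1: ', picks[0])
print('indices from a2: ', picks[1])
```

As long as every row has enough unused ones, the picks are distinct across rows and the total equals the sum of the counts. When the fallback branch is taken, some indices repeat, so the number of distinct bits drops below the target.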