I want to sample m integers between 1 and n in Matlab, without replacement, where
m=10^6;
p=13^5;
n=p*(p-1)/2;
I have tried using randsample as follows:
random_indices_pairs=randsample(n,m);
However, I run into a memory problem:
Error using zeros
Requested 1x68929060278 (513.6GB) array exceeds maximum array size preference. Creation of arrays greater than this
limit may take a long time and cause MATLAB to become unresponsive. See array size limit or preference panel for more
information.
Error in randsample (line 149)
x = zeros(1,n); % flags
Is there a way to avoid this? The problem here is that n is huge.
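For reference, the size in the error message can be reproduced with a few lines of arithmetic. This is only a sketch of the calculation, assuming the flag array is stored as doubles (8 bytes per element), which is what zeros returns by default; the variable names bytesNeeded and gibNeeded are made up for illustration.

```matlab
p = 13^5;                    % 371293
n = p*(p-1)/2;               % 68929060278, the array length in the error
bytesNeeded = 8 * n;         % assumption: one double (8 bytes) per element
gibNeeded = bytesNeeded / 2^30;
fprintf('n = %.0f, about %.1f GB needed\n', n, gibNeeded);
```

So any approach that allocates one element per candidate integer in 1..n is not practical here, regardless of how small m is.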
The two-input version of randperm is equivalent to randsample without replacement, and it does not have the memory problem:
random_indices_pairs = randperm(n, m);
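A minimal sketch of how this could be checked with the values from the question. The assertions are only there for illustration and are not part of the original answer. Note that randperm(n, m) returns a 1-by-m row vector.

```matlab
m = 10^6;
p = 13^5;
n = p*(p-1)/2;

% Draw m unique integers from 1..n without building an n-element array
random_indices_pairs = randperm(n, m);

% Sanity checks: correct count, no repeats, all values inside 1..n
assert(numel(random_indices_pairs) == m);
assert(numel(unique(random_indices_pairs)) == m);
assert(all(random_indices_pairs >= 1 & random_indices_pairs <= n));
```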
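The script below should do what you need.
- It first picks m random integers in the range 1 to n
- It then checks whether there are any duplicate entries
  - If there are none, the script stops
  - If there are duplicates:
    - It goes through each of them
    - It finds another random number between 1 and n
    - It checks whether the new random number already exists in the integer array
      - If it does, it finds another random number
      - If not, it replaces the duplicate in the array and moves on to the next duplicate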
%% Initialize
clearvars;
clc;
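m = 10^6; % note: 10e6 would be 10^7, not the 10^6 from the question
p = 13^5; % note: 13e5 would be 1300000, not 13^5 = 371293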
n = p*(p-1)/2;
%% Create m random integers between 1 and n
randomInt = randi(n, m, 1);
%% Find indices where duplicate random integers are
% Find indices of unique values, take the index of the first occurrence
[~, I] = unique(randomInt, 'first');
% Generate an array of all indices
dupIdx = 1:length(randomInt);
% Drop indices which point to the first occurrence of the duplicate
% This leaves indices that point to the duplicate
dupIdx(I) = [];
% Free up some memory
clear I;
if isempty(dupIdx)
    disp('Done!')
else
    % For those indices find another random number, not yet in randomInt
    disp('Found duplicates, finding new random numbers for those')
    counter = 0;
    for ii = dupIdx
        counter = counter + 1;
        disp(strcat("Resolving duplicate ", num2str(counter), "/", num2str(length(dupIdx))))
        dupe = true;
        % While the replacement is already in the randomInt array, keep
        % looking for a replacement
        while dupe
            replacement = randi(n, 1);
            if ~ismember(replacement, randomInt)
                % When replacement is unique in randomInt
                % Put replacement in the randomInt array at the right index
                randomInt(ii) = replacement;
                dupe = false;
            end
        end
    end
end
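As a rough check on how much work the duplicate-resolving loop has to do, the expected number of colliding pairs among m independent uniform draws from 1..n is m(m-1)/(2n), the usual birthday-problem estimate. The sketch below evaluates it for the values from the question; the variable name expectedCollisions is made up for illustration.

```matlab
m = 10^6;
p = 13^5;
n = p*(p-1)/2;

% Expected number of pairs (i, j), i < j, that drew the same integer
expectedCollisions = m*(m-1) / (2*n);
fprintf('Expected colliding pairs: %.2f\n', expectedCollisions);
```

With these sizes only a handful of duplicates are expected, so the loop over dupIdx should run very few times.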
Based on one of the comments (which also suggested a possible improvement):
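% Draw m candidates, then keep only the first occurrence of each value
A = randi(n, m, 1);
[~, I] = unique(A, 'stable');
A = A(I);
m_to_add = m - size(A, 1);
% Top up with fresh draws until m unique values have been collected
while m_to_add > 0
    B = randi(n, m_to_add, 1);
    A = [A; B];
    [~, I] = unique(A, 'stable');
    A = A(I);
    m_to_add = m - size(A, 1);
end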