Looking for a faster way to deal with cell arrays and vector operations



I have a cell array in which each cell holds a different number of indices into a vector. For example,

C ={ [1 2 3] , [4 5],  [6], [1 8 9 12 20]}

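This is just an example. In the real case, C has 10^4 to 10^6 cells, and each cell holds a vector of 1 to 1000 elements. I need to use each cell as a set of indices to access the corresponding elements of a vector. Right now I use a loop to compute the mean of the vector elements selected by each cell: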

 for n=1:numel(C)   % numel, because C above is 1-by-N and size(C,1) would be 1
   x = mean(X(C{n}));
   % put x somewhere
 end

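Here X is a large vector of 10000 elements. The loop works, but is there a way to do the same thing without a loop? I ask because the code above has to run many times, and the loop is currently quite slow.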
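
For reference, the loop baseline can be written as a small self-contained script. This is only a sketch: the contents of X are made up, and the output name x_loop is an assumption, since the question does not say where the means are stored.

```matlab
% Made-up data for illustration; the real X has about 10000 elements.
X = (1:20) * 10;                              % X(k) = 10*k
C = {[1 2 3], [4 5], [6], [1 8 9 12 20]};     % index lists from the question

x_loop = zeros(1, numel(C));                  % preallocate one mean per cell
for n = 1:numel(C)
    x_loop(n) = mean(X(C{n}));                % mean of the elements picked by cell n
end
disp(x_loop)
```

Preallocating the output avoids growing the array inside the loop, but the loop still performs one indexing operation and one mean call per cell.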

Approach #1

C_num = char(C{:})-0; %// 2D numeric array from C, one row per cell; shorter cells
             %// are padded with 32, the ASCII code of the space character
mask = C_num==32; %// get mask for the spaces
             %// CAVEAT: a genuine index of 32 also matches this mask and is dropped
C_num(mask)=1; %// replace the padded entries with ones, so that we
                %// can index into X without throwing any out-of-range error
X_array = X(C_num); %// 2D array obtained after indexing into X with C_num
X_array(mask) = nan; %// set the padded positions to NaN
x = nanmean(X_array,2); %// mean of each row, ignoring the NaNs (Statistics Toolbox)
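
To make the padding step concrete, here is a sketch that runs the same steps on the example C from the question. The data in X is made up, the name C_pad is introduced only to keep the padded matrix visible, and mean(..., 'omitnan') is swapped in for nanmean on the assumption that a release from R2015a onward is available.

```matlab
X = (1:20) * 10;                              % made-up data, X(k) = 10*k
C = {[1 2 3], [4 5], [6], [1 8 9 12 20]};     % example from the question

C_pad = char(C{:}) - 0;   % one row per cell, short rows padded with 32 (space)
disp(C_pad)               % shows the 4-by-5 padded index matrix

mask = C_pad == 32;       % true at the padded positions
C_num = C_pad;
C_num(mask) = 1;          % any valid index works as a placeholder
X_array = X(C_num);       % same shape as C_num
X_array(mask) = NaN;      % padded positions must not contribute to the mean
x = mean(X_array, 2, 'omitnan');   % assumption: stands in for nanmean(X_array,2)
disp(x.')
```

Each row of the padded matrix corresponds to one cell, which is why the mean is taken along dimension 2.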

Approach #2

lens = cellfun('length',C); %// Lengths of each cell in C
maxlens = max(lens); %// max of those lengths
%// Create a mask array with no. of rows as maxlens and columns as no. of cells. 
%// In each column, we would put numbers from each cell starting from top until
%// the number of elements in that cell. The ones(true) in this mask would be the 
%// ones where those numbers are to be put and zeros(false) otherwise.
mask = bsxfun(@le,[1:maxlens]',lens) ; %//'
C_num = ones(maxlens,numel(lens)); %// An array where the numbers from C are to be put
C_num(mask) = [C{:}]; %// Put those numbers from C in C_num.
  %// NOTE: For performance you can also try out: double(sprintf('%s',C{:}))
X_array = X(C_num); %// Get the corresponding X elements
X_array(mask==0) = nan; %// Set the invalid locations to be NaNs
x = nanmean(X_array,1); %// Mean of each column, one per cell; the explicit dimension
                        %// keeps this correct even when maxlens is 1
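
The mask construction is the core of this approach, so here is a sketch of what it produces for the example C from the question. It assumes C is a 1-by-N cell array of row vectors, as in that example, so that lens is a row vector and [C{:}] concatenates cleanly.

```matlab
C = {[1 2 3], [4 5], [6], [1 8 9 12 20]};     % example from the question

lens = cellfun('length', C);                  % [3 2 1 5]
maxlens = max(lens);                          % 5
mask = bsxfun(@le, (1:maxlens).', lens);      % maxlens-by-N, true where an index exists
C_num = ones(maxlens, numel(lens));           % 1 is a safe placeholder index
C_num(mask) = [C{:}];                         % column-major fill: one column per cell
disp(mask)
disp(C_num)
```

Because logical indexing assigns in column-major order, the concatenated indices land in the right columns without any explicit bookkeeping.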

Approach #3

This is almost the same as approach #2, with a small change at the end to avoid nanmean.

So, replace the last two lines of approach #2 with these:

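X_array(mask==0) = 0; %// zero the padded positions so they do not affect the sum
x = sum(X_array,1)./lens; %// column sums divided by the true number of elements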
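
A quick way to gain confidence in the sum-and-divide variant is to compare it with the plain loop on random data. The following is a sketch under stated assumptions: the data is random, the cell count N and the names x_vec, x_loop and max_diff are hypothetical, and cells are limited to 50 indices to keep the example small.

```matlab
rng(0);                                       % reproducible made-up data
X = rand(1, 10000);
N = 200;                                      % hypothetical number of cells
C = arrayfun(@(k) randi(numel(X), 1, randi(50)), 1:N, 'UniformOutput', false);

% Padded-matrix version: zero the padding, then divide by the true lengths.
lens = cellfun('length', C);
mask = bsxfun(@le, (1:max(lens)).', lens);
C_num = ones(max(lens), N);
C_num(mask) = [C{:}];
X_array = X(C_num);
X_array(~mask) = 0;
x_vec = sum(X_array, 1) ./ lens;

% Loop version for comparison.
x_loop = zeros(1, N);
for n = 1:N
    x_loop(n) = mean(X(C{n}));
end
max_diff = max(abs(x_vec - x_loop));
fprintf('largest difference: %g\n', max_diff);
```

The two results should agree up to floating-point rounding. Note that the padded matrix has max(lens)-by-N entries, so memory use grows with the longest cell rather than with the total number of indices.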
