Building a sparse matrix from the values of a smaller matrix, mapped by a logical array mask (MATLAB)?



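I am working with large sparse matrices. I have a large sparse matrix of values that needs to be included in an even larger sparse matrix. I have an array of logicals that indicates which rows and which columns are to be filled with the smaller matrix's values. In this application, the smaller of the two matrices is a graph stored as an adjacency matrix, and the logicals mark the positions of the node IDs in the larger matrix.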

A small toy example to demonstrate what I am currently doing:

zz = sparse(4,4);  %create a sparse matrix of the final size desired
rr = rand(3,3);   %a matrix of values
logical_inds = logical([1 0 1 1]); %a logical array to index with the dimension size of zz
zz(logical_inds,logical_inds) = rr(:,:) %'rr' values are mapped into the subset of 'zz'

I observe that the 2nd column of zz is all zeros and that the 2nd row is all zeros too. This is the desired output.

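In my program, I get the warning that this "sparse indexing is likely to be slow", and it is. Sometimes, when the matrices are very large, the program terminates on this line.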

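How do I create this matrix (zz) with the sparse function? I am unsure how to create the row/column indices from the logical mask I have, and how to turn the values of rr into an array suited to this new indexing.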

** In general rr is very sparse, although the logical mask addresses the whole matrix.

I think this problem is mainly due to implicit resizing during the assignment. Here is why I think so:

%# test parameters
N  = 5000;               %# Size of 1 dimension of the square sparse
L  = rand(1,N) > 0.95;   %# 5% of rows/cols will be non-zero values
M  = sum(L);            
rr = rand(M);            %# the "data" to fill the sparse with 

%# Method 1: direct logical indexing
%# (your original method)
zz1 = sparse(N,N);    
tic    
    zz1(L,L) = rr;
toc
%# Method 2: test whether the conversion to logical col/row indices matters 
zz2  = sparse(N,N);    
inds = zz1~=0;    
tic        
    zz2(inds) = rr;
toc
%# Method 3: test whether the conversion to linear indices matters 
zz3 = sparse(N,N);
inds = find(inds);
tic        
    zz3(inds) = rr;
toc
%# Method 4: test whether implicit resizing matters    
zz4 = spalloc(N,N, M*M);
tic        
    zz4(inds) = rr;
toc

Results:

Elapsed time is 3.988558 seconds. %# meh   M1 (original)
Elapsed time is 3.916462 seconds. %# meh   M2 (expanded logicals)
Elapsed time is 4.003222 seconds. %# meh   M3 (converted row/col indices)
Elapsed time is 0.139986 seconds. %# WOW!  M4 (pre-allocated memory)

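So apparently (and surprisingly), it seems MATLAB does not grow the existing sparse matrix once before the assignment (as you would expect), but instead loops through the row/col indices and grows the sparse matrix during the iteration. It therefore seems one has to "help" MATLAB a bit: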

%# Create initial sparse
zz1 = sparse(N,N);    
%# ...
%# Do any further operations until you can create rr: 
%# ...
rr = rand(M);            %# the "data" to fill the sparse with 

%# Now that the size of the data is known, re-allocate space for the sparse:
tic
    [i,j] = find(zz1); %# indices
    [m,n] = size(zz1); %# Sparse size (you can also use N of course)
    zz1 = sparse(i,j,nonzeros(zz1), m,n, M*M);
    zz1(L,L) = rr; %# logical or integer indices, doesn't really matter 
toc

Results (for the same N, L and rr):

Elapsed time is 0.034950 seconds.  %# order of magnitude faster than even M4!
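
One way to see the allocation difference that these timings are attributed to is to compare nnz (stored nonzeros) with nzmax (allocated storage) for the two ways of creating an empty sparse matrix. This is only a sketch with illustrative sizes (N and M here are not the values from the benchmark above), and it shows the allocated capacity, not the internal growth strategy, which remains the hypothesis of this answer:

```matlab
N = 1000;                  % illustrative size of the large matrix
M = 50;                    % illustrative size of the dense block

A = sparse(N, N);          % all-zero sparse, no extra storage requested
B = spalloc(N, N, M*M);    % all-zero sparse with room for M*M nonzeros

% Both matrices hold no nonzeros, but their allocated capacity differs
fprintf('nnz(A) = %d, nzmax(A) = %d\n', nnz(A), nzmax(A));
fprintf('nnz(B) = %d, nzmax(B) = %d\n', nnz(B), nzmax(B));
```

If nzmax is already at least as large as the number of values about to be assigned, the assignment should not need to enlarge the storage.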

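To create this matrix with the sparse function, the logical indexing will need to be converted to row and column indices, so this may end up being slower...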

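Here the locations of the 1s in the logical vector are found, and then a matrix is created containing the row and column indices of the non-zeros in the sparse matrix.
Finally, the sparse function is used to create a sparse matrix with the elements of rr at these locations (rr(:) is used to convert it to a column vector).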

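ind_locs = find(logical_inds);       % positions of the 1s in the mask
ind = combvec(ind_locs,ind_locs);    % all (row;col) pairs; combvec ships with the Deep Learning Toolbox
n = numel(logical_inds);             % pass the size explicitly so trailing zero rows/cols are kept
zz = sparse(ind(1,:),ind(2,:),rr(:),n,n)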
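
Since rr is usually very sparse, a variant of the same idea can avoid enumerating every (row, col) pair and needs no toolbox: take only the nonzero triplets of rr with find and map their local indices through the positions of the 1s in the mask. The following is a minimal sketch under assumptions: the toy values in rr and the names pos, r, c, v and zz_ref are made up for illustration, and the large matrix is assumed to start out empty.

```matlab
% Toy data (hypothetical values, same mask as in the question)
logical_inds = logical([1 0 1 1]);
rr = sparse([0 2 0; 0 0 5; 7 0 0]);   % small, mostly-zero block

n   = numel(logical_inds);            % size of the large matrix
pos = find(logical_inds(:));          % positions in the large matrix, as a column
[r, c, v] = find(rr);                 % row, column and value of each nonzero of rr

% Map the local indices of rr to positions in the large matrix
zz = sparse(pos(r), pos(c), v, n, n);

% Reference result via direct logical indexing, for comparison only
zz_ref = sparse(n, n);
zz_ref(logical_inds, logical_inds) = rr;
disp(isequal(zz, zz_ref));
```

This builds zz in one call instead of assigning into an existing sparse matrix, so the work scales with nnz(rr) rather than with the full size of the block. If the large matrix already holds other entries, they would have to be merged in separately.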
