Indexing a multidimensional array

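I know that multidimensional numpy arrays can be indexed with other arrays, but I can't figure out how the following works:

I would like to take the items of raster, a 3D numpy array, based on indx, a 3D index array: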

import numpy as np
raster = np.random.rand(5, 10, 50)
indx = np.random.randint(0, high=50, size=(5, 10, 3))

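What I want is another array with the same shape as indx that holds the values of raster selected by the indices in indx.

For the indices to resolve correctly during broadcasting, we need two arrays a and b such that raster[a[i,j,k], b[i,j,k], indx[i,j,k]] equals raster[i, j, indx[i,j,k]] for i, j, k in the corresponding ranges of indx's axes. The simplest solution is: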

x,y,z = indx.shape
a,b,_ = np.ogrid[:x,:y,:z]
raster[a,b,indx]

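where np.ogrid[...] creates three arrays with shapes (x,1,1), (1,y,1) and (1,1,z). We don't need the last one, so we throw it away. When the other two are broadcast against indx, they behave exactly the way we need: a supplies the index i along the first axis and b supplies the index j along the second.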
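As a rough check of the broadcasting argument above, the sketch below compares the np.ogrid result with an explicit triple loop. The seeded generator and the names shapes and expected are only there for illustration and are not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only to make the run repeatable
raster = rng.random((5, 10, 50))
indx = rng.integers(0, 50, size=(5, 10, 3))

x, y, z = indx.shape
a, b, _ = np.ogrid[:x, :y, :z]
shapes = (a.shape, b.shape)  # open grids along the first and second axes

# a and b broadcast against indx, so the result takes the shape of indx.
out = raster[a, b, indx]

# Reference result built element by element.
expected = np.empty(indx.shape)
for i in range(x):
    for j in range(y):
        for k in range(z):
            expected[i, j, k] = raster[i, j, indx[i, j, k]]

print(shapes)
print(np.array_equal(out, expected))
```

If the argument holds, the two shapes are (5, 1, 1) and (1, 10, 1) and the comparison is True.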

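If I understood the question correctly, for each row of indx you are trying to index into the corresponding row of raster, with the column number varying according to the actual values in indx. Under that assumption, you can use a vectorized approach based on linear indexing, like so: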

M,N,R = raster.shape
linear_indx = R*np.arange(M*N)[:,None] + indx.reshape(M*N,-1)
out = raster.ravel()[linear_indx].reshape(indx.shape)
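To make the linear-index arithmetic easier to follow, here is a small sketch on a hand-made array. The toy raster is built with np.arange so that every element equals its own position in the flattened array; that choice, and the name row_start, are assumptions made for illustration only.

```python
import numpy as np

# Toy data: raster[i, j, k] == 12*i + 4*j + k, i.e. its own flat position.
raster = np.arange(2 * 3 * 4).reshape(2, 3, 4)
indx = np.array([[[0, 3], [1, 1], [2, 0]],
                 [[3, 3], [0, 2], [1, 0]]])

M, N, R = raster.shape
# Flat offset at which each (i, j) row of length R starts: 0, 4, 8, ...
row_start = R * np.arange(M * N)[:, None]
# Adding the column numbers gives positions in the flattened raster.
linear_indx = row_start + indx.reshape(M * N, -1)
out = raster.ravel()[linear_indx].reshape(indx.shape)

print(out)
```

Because each value is its own flat position, out shows the computed linear indices directly: the row (i, j) = (0, 1) starts at offset 4, so columns [1, 1] become [5, 5].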

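I'm assuming you want to get 3 random values from each of the 3rd-dimension arrays.

You can do this with a list comprehension, thanks to advanced indexing.

Here is an example using fewer values and integers so that the output is easier to read: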

import numpy as np
raster=np.random.randint(0, high=1000, size=(2,3,10))
indices=np.random.randint(0, high=10, size=(2,3,3))
results = np.array([
    np.array([column[col_indices] for (column, col_indices) in zip(row, row_indices)])
    for (row, row_indices) in zip(raster, indices)
])
print("Raster:")
print(raster)
print("Indices:")
print(indices)
print("Results:")
print(results)

Output:

Raster:
[[[864 353  11  69 973 475 962 181 246 385]
  [ 54 735 871 218 143 651 159 259 785 383]
  [532 476 113 888 554 587 786 172 798 232]]
 [[891 263  24 310 652 955 305 470 665 893]
  [260 649 466 712 229 474   1 382 269 502]
  [323 513  16 236 594 347 129  94 256 478]]]
Indices:
[[[0 1 2]
  [7 5 1]
  [7 8 9]]
 [[4 0 2]
  [6 1 4]
  [3 9 2]]]
Results:
[[[864 353  11]
  [259 651 735]
  [172 798 232]]
 [[652 891  24]
  [  1 649 229]
  [236 478  16]]]

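It iterates over the corresponding 3rd-dimension arrays of raster and indices at the same time, and uses advanced indexing to pull the desired elements out of raster.

Here is a more verbose version that does exactly the same thing: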

results = []
for i in range(len(raster)):
    row = raster[i]
    row_indices = indices[i]
    row_results = []
    for j in range(len(row)):
        column = row[j]
        column_indices = row_indices[j]
        column_results = column[column_indices]
        row_results.append(column_results)
    results.append(np.array(row_results))
results = np.array(results)
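For comparison, the same selection can probably also be written as a single call to np.take_along_axis, which is available in NumPy 1.15 and later. This is not part of the original answer; the sketch below only checks it against the loop-based approach on random data, with a seed chosen arbitrarily.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, for repeatability only
raster = rng.integers(0, 1000, size=(2, 3, 10))
indices = rng.integers(0, 10, size=(2, 3, 3))

# Loop-based version: pick the requested columns row by row.
looped = np.array([
    [column[col_indices] for column, col_indices in zip(row, row_indices)]
    for row, row_indices in zip(raster, indices)
])

# Single call: for every (i, j), take raster[i, j, indices[i, j, k]].
vectorized = np.take_along_axis(raster, indices, axis=2)

print(np.array_equal(looped, vectorized))
```

The loop version is easier to step through when debugging, while the single call avoids Python-level iteration.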
