How to index a multidimensional numpy array using many 1-dimensional boolean arrays?



Suppose I have an n-dimensional numpy array A, which may be very large, and suppose I have k 1-dimensional boolean masks M1, ..., Mk, where k <= n.

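I would like to extract from A an n-dimensional array B that contains all the elements of A located at the indices where the "outer AND" of all the masks is True.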

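...but I would like to do this without first forming the (possibly very large) "outer AND" of all the masks, and without having to extract the specified elements one axis at a time, which would create (possibly many) intermediate copies in the process.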
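To make the term "outer AND" concrete, here is a minimal 2-D sketch. The array and mask values are made up purely for illustration and are not part of the original question; the point is only that the combined mask has the same number of elements as A, which is what becomes expensive for large arrays.

```python
import numpy as np

# Hypothetical small inputs, chosen only for illustration.
A = np.arange(12).reshape(3, 4)
M1 = np.array([True, False, True])          # mask for axis 0
M2 = np.array([False, True, True, False])   # mask for axis 1

# "Outer AND": outer[i, j] is True exactly when M1[i] and M2[j] are both True.
# Note that outer has the same shape (and size) as A.
outer = M1[:, None] & M2[None, :]

# Boolean indexing with the full mask returns a flat array, so it is reshaped
# to (number of True entries in M1, number of True entries in M2).
B = A[outer].reshape(np.count_nonzero(M1), np.count_nonzero(M2))
```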

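The example below demonstrates the two ways of extracting the elements from A described above; a third, using np.ix_, was added in an edit: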

from functools import reduce
import numpy as np

m = 100
for _ in range(m):
    n = np.random.randint(0, 10)      # number of dimensions of A
    k = np.random.randint(0, n + 1)   # number of masks, k <= n
    A_shape = tuple(np.random.randint(0, 10, n))
    A = np.random.uniform(-1, 1, A_shape)
    # one boolean mask for each of the first k axes of A
    M_lst = [np.random.randint(0, 2, dim).astype(bool) for dim in A_shape[:k]]
    # creating shape of B:
    B_shape = tuple(map(np.count_nonzero, M_lst)) + A_shape[len(M_lst):]
    # size of B:
    B_size = np.prod(B_shape)

    # --- USING "OUTER-AND" OF ALL MASKS --- #
    # creating "outer-AND" of all masks:
    M = reduce(np.bitwise_and, (np.expand_dims(M, tuple(np.r_[:i, i+1:k])) for i, M in enumerate(M_lst)), True)
    # extracting elements from A and reshaping to the correct shape:
    B1 = A[M].reshape(B_shape)
    # checking that the correct number of elements was extracted
    assert B1.size == B_size
    # THE PROBLEM WITH THIS METHOD IS THE POSSIBLY VERY LARGE OUTER-AND OF ALL THE MASKS!

    # --- USING ONE MASK AT A TIME --- #
    B2 = A
    for i, M in enumerate(M_lst):
        B2 = B2[tuple(slice(None) for _ in range(i)) + (M,)]
    assert B2.size == np.prod(B_shape)
    assert B2.shape == B_shape
    # THE PROBLEM WITH THIS METHOD IS THE POSSIBLY LARGE NUMBER OF POSSIBLY LARGE INTERMEDIATE COPIES!
    assert np.all(B1 == B2)

    # EDIT 1:
    # USING np.ix_ AS SUGGESTED BY Chrysophylaxs
    ix = np.ix_(*M_lst)
    B3 = A[ix]
    assert B3.shape == B_shape
    assert B3.size == B_size
    assert np.prod(list(map(np.size, ix))) == B_size

print(f'All three methods worked all {m} times')

Is there a smarter (more efficient) way to do this, possibly using an existing numpy function?

IIUC, you're looking for np.ix_; an example:

import numpy as np
arr = np.arange(60).reshape(3, 4, 5)
x = [True, False, True]
y = [False, True, True, False]
z = [False, True, False, True, False]
out = arr[np.ix_(x, y, z)]

Output:

array([[[ 6,  8],
        [11, 13]],

       [[46, 48],
        [51, 53]]])
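
As a rough sketch of why this avoids the large "outer AND": given boolean inputs, np.ix_ replaces each mask with the integer positions of its True entries and reshapes them so they broadcast against each other. The index arrays therefore stay small, while the full mask (built here only for comparison) has as many elements as the array itself. The variable names idx, idx_shapes, idx_total, outer and same are introduced for this illustration only.

```python
import numpy as np

arr = np.arange(60).reshape(3, 4, 5)
x = np.array([True, False, True])
y = np.array([False, True, True, False])
z = np.array([False, True, False, True, False])

# np.ix_ converts each boolean mask to the integer positions of its True
# entries and reshapes them so that they broadcast against each other.
idx = np.ix_(x, y, z)
idx_shapes = [a.shape for a in idx]      # [(2, 1, 1), (1, 2, 1), (1, 1, 2)]
idx_total = sum(a.size for a in idx)     # 6 stored indices in total

# For comparison only: the full "outer AND" mask has as many elements as arr.
outer = x[:, None, None] & y[None, :, None] & z[None, None, :]

out = arr[idx]
# Both approaches select the same elements in the same order.
same = np.array_equal(out, arr[outer].reshape(out.shape))
```

Here the index arrays hold 6 integers in total, whereas the comparison mask holds 60 booleans. The result itself is still a new array, as with any advanced indexing.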