I have this matrix:

matrix = [[1, 1, 1, 0], 
[0, 5, 0, 1], 
[2, 1, 3, 10]]

I want to set every element that lies below a 0 (in the same column) to 0.

The resulting matrix would be:

matrix = [[1, 1, 1, 0], 
[0, 5, 0, 0], 
[0, 1, 0, 0]]

This is what I have tried so far. It returns nothing:

import numpy as np

def transform(matrix):
    newmatrix = np.asarray(matrix)
    i = 0
    j = 0
    for j in range(0,len(matrix[0])-1):
        while i < int(len(matrix))-1 and j < int(len(matrix[0]))-1:
            if newmatrix[i][j] == 0:
                np.put(newmatrix,newmatrix[i+1][j], 0 )
            i +=1
    return print (newmatrix)

Method 1 (original)

import numpy as np

def transform(matrix):
    mat = np.asarray(matrix)
    mat[np.logical_not(np.not_equal(mat, 0).cumprod(axis=0))] = 0
    # Alternatively (builtin bool is used because the np.bool alias
    # is not available in every NumPy release):
    # mat[~(mat != 0).cumprod(axis=0, dtype=bool)] = 0
    # or,
    # mat[~((mat != 0).cumprod(axis=0, dtype=bool))] = 0
    return mat

With your example data, I get the following mat:

In [195]: matrix = [[1, 1, 1, 0],
     ...:           [0, 5, 0, 1],
     ...:           [2, 1, 3, 10]]
In [196]: transform(matrix)
Out[196]: 
array([[1, 1, 1, 0],
       [0, 5, 0, 0],
       [0, 1, 0, 0]])
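
One caveat that may be worth checking: np.asarray does not copy when its input is already an ndarray, so transform zeroes the caller's array in place (with a plain list, as in the session above, a new array is created). The following is only a minimal sketch of that behaviour. The name transform_copy is hypothetical and not part of the original answer; it simply swaps np.asarray for np.array, which copies by default.

```python
import numpy as np

def transform(matrix):
    mat = np.asarray(matrix)   # no copy if `matrix` is already an ndarray
    mat[np.logical_not(np.not_equal(mat, 0).cumprod(axis=0))] = 0
    return mat

def transform_copy(matrix):
    # Hypothetical variant: np.array copies by default, so the input survives.
    mat = np.array(matrix)
    mat[np.logical_not(np.not_equal(mat, 0).cumprod(axis=0))] = 0
    return mat

original = np.array([[1, 1, 1, 0],
                     [0, 5, 0, 1],
                     [2, 1, 3, 10]])

safe = transform_copy(original)                # `original` is still intact here
untouched = original.copy()
same_object = transform(original) is original  # True: modified in place
```

If the in-place behaviour is what you want (it saves memory on large arrays), keep np.asarray; otherwise pass a copy.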

Method 2 (further optimized)

def transform2(matrix):
    mat = np.asarray(matrix)
    # Multiplying by the boolean mask zeroes everything at and below the first 0.
    mat *= (mat != 0).cumprod(axis=0, dtype=bool)
    return mat

Method 3 (even more optimized)

def transform3(matrix):
    mat = np.asarray(matrix)
    # dtype=bool casts the values to booleans first, so no explicit `!= 0` is needed.
    mat *= mat.cumprod(axis=0, dtype=bool)
    return mat
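
As a quick sanity check, here is a small sketch that runs the three variants on the same random integer matrix and confirms they agree. The random data, the seed, and the use of np.array (to copy the input so each function sees identical data) are my own choices for illustration, not part of the methods above.

```python
import numpy as np

def transform1(matrix):
    mat = np.array(matrix)  # copy so each method sees the same input
    mat[~(mat != 0).cumprod(axis=0, dtype=bool)] = 0
    return mat

def transform2(matrix):
    mat = np.array(matrix)
    mat *= (mat != 0).cumprod(axis=0, dtype=bool)
    return mat

def transform3(matrix):
    mat = np.array(matrix)
    mat *= mat.cumprod(axis=0, dtype=bool)
    return mat

rng = np.random.default_rng(0)
data = rng.integers(0, 5, size=(50, 40))

results = [f(data) for f in (transform1, transform2, transform3)]
all_agree = all(np.array_equal(results[0], r) for r in results[1:])
print(all_agree)
```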

Explanation

Let's look at the main statement (in Method 1):

mat[np.logical_not(np.not_equal(mat, 0).cumprod(axis=0))] = 0

We can split it into several "elementary" operations:

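1. Create a boolean mask that is False (numerically 0) where the elements of mat are 0 and True (numerically 1) where they are non-zero:
```
mask1 = np.not_equal(mat, 0)
```
2. Use the fact that False is numerically 0 and apply the cumprod() function (a good explanation can be found here: https://www.mathworks.com/help/matlab/ref/cumprod.html):
```
mask2 = mask1.cumprod(axis=0)
```
Since 1*1==1 and 0*0 or 0*1 is 0, all elements of this "mask" are either 0 or 1. They are 0 only at the locations where mask1 is zero and below (!), because of the "cumulative nature" of the product along the columns (hence axis=0).
3. Now we want to set to 0 those elements of mat that correspond to 0 in mask2. To do this, we create a boolean mask that is True where mask2 is 0 and False elsewhere. This is easily achieved by applying a logical (or binary) NOT to mask2:
```
mask3 = np.logical_not(mask2)
```
Using the "logical" NOT here creates a boolean array, so we avoid an explicit type conversion.
4. Finally, we use boolean indexing to select the elements of mat that need to be set to 0 and set them to 0:
```
mat[mask3] = 0
```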
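
To make the steps concrete, here is a small sketch that applies them to the example matrix from the question and prints each intermediate mask. The expected values in the comments were worked out by hand from the definitions above.

```python
import numpy as np

mat = np.array([[1, 1, 1, 0],
                [0, 5, 0, 1],
                [2, 1, 3, 10]])

mask1 = np.not_equal(mat, 0)    # True where mat is non-zero
mask2 = mask1.cumprod(axis=0)   # running product down each column
mask3 = np.logical_not(mask2)   # True at and below the first zero of a column
mat[mask3] = 0

print(mask1.astype(int))  # [[1 1 1 0], [0 1 0 1], [1 1 1 1]]
print(mask2)              # [[1 1 1 0], [0 1 0 0], [0 1 0 0]]
print(mask3.astype(int))  # [[0 0 0 1], [1 0 1 1], [1 0 1 1]]
print(mat)                # [[1 1 1 0], [0 5 0 0], [0 1 0 0]]
```

Note how the 1 in row 2 of mask1 (first column) becomes 0 in mask2: the zero above it has already "poisoned" the running product for that column.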

Optional optimization

If you think about it, we can get rid of steps 3 and 4 if we do the following:

mask2 = mask1.cumprod(axis=0, dtype=bool)  # convert result to boolean type
mat *= mask2  # combined steps 3 and 4

For a complete implementation, see the "Method 2" section above.

Performance

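Several other answers use numpy.ufunc.accumulate(). Fundamentally, all of these methods revolve around the idea that 0 is a "special" value, in the sense that 0*anything==0 or, in the case of @DSM's answer, that False=0<True=1, and they let numpy perform a "cumulative" operation on arrays.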
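
A short sketch of that shared idea: on a boolean "is non-zero" array, a cumulative product, a cumulative minimum, and a cumulative logical AND should all yield the same keep-mask. The variable names here are mine, chosen only for illustration.

```python
import numpy as np

m = np.array([[1, 1, 1, 0],
              [0, 5, 0, 1],
              [2, 1, 3, 10]])
nonzero = m != 0

# Three cumulative operations down each column (axis=0) on the same mask.
keep_prod = np.multiply.accumulate(nonzero, axis=0, dtype=bool)  # what cumprod does
keep_min = np.minimum.accumulate(nonzero, axis=0)                # False < True
keep_and = np.logical_and.accumulate(nonzero, axis=0)

print(np.array_equal(keep_prod, keep_min) and np.array_equal(keep_min, keep_and))
```

Once a False appears in a column, each of the three operations keeps producing False below it, which is exactly the "zero everything underneath" behaviour.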

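There are some variations in performance, but most of them are minimal, except that my Method 1 is slower than the other methods.

Here are some timing tests for more functions. NOTE: to perform the tests correctly, we need to use large arrays. Tests on small arrays would mostly measure overhead, caching, and so on.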

In [1]: import sys
...: import numpy as np
...: 
In [2]: print(sys.version)
...: 
3.6.2 |Continuum Analytics, Inc.| (default, Jul 20 2017, 13:14:59) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]
In [3]: print(np.__version__)
...: 
1.12.1
In [4]: # Method 1 (Original)
...: def transform1(matrix):
...:     mat = np.asarray(matrix)
...:     mat[np.logical_not(np.not_equal(mat, 0).cumprod(axis=0))] = 0
...:     return mat
...: 
In [5]: # Method 2:
...: def transform2(matrix):
...:     mat = np.asarray(matrix)
...:     mat *= (mat != 0).cumprod(axis=0, dtype=np.bool)
...:     return mat
...: 
In [6]: # @DSM method:
...: def transform_DSM(matrix):
...:     mat = np.asarray(matrix)
...:     mat *= np.minimum.accumulate(mat != 0)
...:     return mat
...: 
In [7]: # @DanielF method:
...: def transform_DanielF(matrix):
...:     mat = np.asarray(matrix)
...:     mat[~np.logical_and.accumulate(mat, axis = 0)] = 0
...:     return mat
...: 
In [8]: # Optimized @DanielF method:
...: def transform_DanielF_optimized(matrix):
...:     mat = np.asarray(matrix)
...:     mat *= np.logical_and.accumulate(mat, dtype=np.bool)
...:     return mat
...: 
In [9]: matrix = np.random.randint(0, 20000, (20000, 20000))
In [10]: %timeit -n1 transform1(matrix)
22.1 s ± 241 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [11]: %timeit -n1 transform2(matrix)
9.29 s ± 185 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [12]: %timeit -n1 transform3(matrix)
9.23 s ± 180 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [13]: %timeit -n1 transform_DSM(matrix)
9.24 s ± 195 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [14]: %timeit -n1 transform_DanielF(matrix)
10.3 s ± 219 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [15]: %timeit -n1 transform_DanielF_optimized(matrix)
9.27 s ± 187 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

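My initial solution (Method 1) is the slowest, while the other methods are considerably faster. The original @DanielF method is somewhat slower because it uses boolean indexing (but the optimized variant is as fast as the other optimized methods).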
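
If you want to rerun a comparison like this outside IPython, the following is a rough sketch using the standard timeit module. The function names, the array size (much smaller than the 20000 x 20000 array above, so it finishes quickly), and the per-call copy are my own assumptions. The copy is there because the functions work in place, so without it later runs would see an already-zeroed array; it also adds the same constant overhead to each measurement.

```python
import timeit
import numpy as np

def with_indexing(mat):
    # Boolean-indexing variant (as in Method 1).
    mat[~(mat != 0).cumprod(axis=0, dtype=bool)] = 0
    return mat

def with_multiply(mat):
    # In-place multiplication variant (as in Method 2).
    mat *= (mat != 0).cumprod(axis=0, dtype=bool)
    return mat

rng = np.random.default_rng(0)
base = rng.integers(0, 20000, size=(1000, 1000))

timings = {}
for func in (with_indexing, with_multiply):
    # Copy per call so every run starts from the same, untouched input.
    runs = timeit.repeat(lambda: func(base.copy()), number=1, repeat=3)
    timings[func.__name__] = min(runs)

for name, seconds in timings.items():
    print(f"{name}: {seconds:.4f} s")
```

The absolute numbers will depend on your machine and NumPy version, so treat them as relative indications only.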

Here is a simple (though unoptimized) algorithm:

import numpy as np
from numba import jit

m = np.array([[1, 1, 1, 0],
              [0, 5, 0, 1],
              [2, 1, 3, 10]])

@jit(nopython=True)
def zeroer(m):
    a, b = m.shape
    for j in range(b):
        for i in range(a):
            if m[i, j] == 0:
                m[i:, j] = 0
                break
    return m

zeroer(m)
# [[1 1 1 0]
#  [0 5 0 0]
#  [0 1 0 0]]
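
If numba is not installed, the same "find the first zero in each column, then zero everything from there down" idea can be sketched with plain NumPy. The function name zeroer_numpy is hypothetical, and unlike zeroer above it works on a copy rather than in place; that is a design choice of this sketch, not of the original answer.

```python
import numpy as np

def zeroer_numpy(m):
    # Hypothetical numba-free counterpart of zeroer; returns a modified copy.
    out = np.array(m)
    is_zero = out == 0
    has_zero = is_zero.any(axis=0)        # columns that contain at least one zero
    first_zero = is_zero.argmax(axis=0)   # row index of the first zero (0 if none)
    for j in np.flatnonzero(has_zero):
        out[first_zero[j]:, j] = 0
    return out

m = np.array([[1, 1, 1, 0],
              [0, 5, 0, 1],
              [2, 1, 3, 10]])
print(zeroer_numpy(m))
# [[1 1 1 0]
#  [0 5 0 0]
#  [0 1 0 0]]
```

The has_zero check matters because argmax returns 0 both when the first zero is in row 0 and when the column has no zero at all.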

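One variant of the cumprod approach is to use a cumulative minimum (or maximum). I slightly prefer this, because it lets you avoid any arithmetic operations beyond comparison if you want to, although it's hard to get worked up about it: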

In [37]: m
Out[37]: 
array([[ 1,  1,  1,  0],
       [ 0,  5,  0,  1],
       [ 2,  1,  3, 10]])
In [38]: m * np.minimum.accumulate(m != 0)
Out[38]: 
array([[1, 1, 1, 0],
       [0, 5, 0, 0],
       [0, 1, 0, 0]])
In [39]: np.where(np.minimum.accumulate(m != 0), m, 0)
Out[39]: 
array([[1, 1, 1, 0],
       [0, 5, 0, 0],
       [0, 1, 0, 0]])
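
The session above shows only the cumulative-minimum form. For completeness, here is a sketch of what the "or maximum" variant might look like: instead of tracking "still non-zero so far", it tracks "a zero has been seen so far". The name seen_zero is mine.

```python
import numpy as np

m = np.array([[1, 1, 1, 0],
              [0, 5, 0, 1],
              [2, 1, 3, 10]])

# True from the first zero of each column downward (True > False).
seen_zero = np.maximum.accumulate(m == 0, axis=0)
result = np.where(seen_zero, 0, m)
print(result)
# [[1 1 1 0]
#  [0 5 0 0]
#  [0 1 0 0]]
```

Like the np.where form in the session, this uses only comparisons and selection, with no multiplication.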

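A more optimized version of @AGNGazer's solution, using np.logical_and.accumulate and the implicit boolean cast of integers (which does not require a lot of multiplication):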

def transform(matrix):
    mat = np.asarray(matrix)
    mat[~np.logical_and.accumulate(mat, axis = 0)] = 0
    return mat

transform(m)
Out:
array([[1, 1, 1, 0],
       [0, 5, 0, 0],
       [0, 1, 0, 0]])

Timings:

%timeit transform2(m) # AGN's solution
The slowest run took 44.73 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 9.93 µs per loop
%timeit transform(m)
The slowest run took 9.00 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 3: 7.99 µs per loop
m = np.random.randint(0,5,(100,100))
%timeit transform(m)
The slowest run took 6.03 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 43.9 µs per loop
%timeit transform2(m)
The slowest run took 4.09 times longer than the fastest. This could mean that an intermediate result is being cached.
10000 loops, best of 3: 50.4 µs per loop

Looks like about a 15% speedup.
