I have a numeric array a:
rnd = np.random.default_rng(12345)
a = rnd.uniform(0, -50, 5)
# array([-11.36680112, -15.83791699, -39.86827287, -33.81273354,
#        -19.55547753])
I want to find the difference between the array and every element of that same array. Example output would be:
[array([ 0. , 4.47111586, 28.50147174, 22.44593241, 8.18867641]),
array([-4.47111586, 0. , 24.03035588, 17.97481655, 3.71756054]),
array([-28.50147174, -24.03035588, 0. , -6.05553933,
-20.31279534]),
array([-22.44593241, -17.97481655, 6.05553933, 0. ,
-14.25725601]),
array([-8.18867641, -3.71756054, 20.31279534, 14.25725601, 0. ])]
My first approach was a list comprehension: [i - a for i in a]. However, my actual array a is very large, and I have thousands of such arrays that need the same operation, so the whole process becomes so slow and memory-hungry that the Jupyter kernel dies.
Is there any way to speed this up?
The simplest way is to use broadcasting:
import numpy as np
rnd = np.random.default_rng(12345)
a = rnd.uniform(0, -50, 5)
a[:, None] - a
Output:
array([[ 0. , 4.47111586, 28.50147174, 22.44593241,
8.18867641],
[ -4.47111586, 0. , 24.03035588, 17.97481655,
3.71756054],
[-28.50147174, -24.03035588, 0. , -6.05553933,
-20.31279534],
[-22.44593241, -17.97481655, 6.05553933, 0. ,
-14.25725601],
[ -8.18867641, -3.71756054, 20.31279534, 14.25725601,
0. ]])
There are two ways to do this:
- Pure NumPy with broadcasting: a[:, None] - a. This is memory-inefficient, but if the array is small, NumPy is faster here.
- Numba + NumPy: Numba compiles through LLVM, so it can work magic on speed, and the parallel=True option can multiply your throughput further. For very large arrays this (or C++) should be the go-to.
For size 40000 this finishes in about 3 seconds without parallelism, and in about 0.6 seconds with parallelism on my 12-core machine:
import numpy as np
import numba as nb

rnd = np.random.default_rng(12345)
a = rnd.uniform(0, -50, 5)

# return type: nb.float64[:, :]
# input argument type: nb.float64[:]
# Specifying these enables eager compilation instead of lazy.
# You can also add parallel=True, cache=True;
# if you are using Python threading, nogil=True.
# You can do lots of stuff.
# Numba has SIMD vectorization, which means it should not lose
# to NumPy on performance grounds if coded properly.
@nb.njit(nb.float64[:, :](nb.float64[:]))
def speed(a):
    # np.empty to prevent unnecessary initialization
    b = np.empty((a.shape[0], a.shape[0]), dtype=a.dtype)
    # nb.prange tells Numba this loop can be parallelized
    # (it only takes effect when parallel=True is set)
    for i in nb.prange(a.shape[0]):
        for j in range(a.shape[0]):
            b[i][j] = a[i] - a[j]
    return b

speed(a)
import numpy as np
import numba as nb
import sys
import time

@nb.njit(nb.float64[:, :](nb.float64[:]))
def f1(a):
    b = np.empty((a.shape[0], a.shape[0]), dtype=a.dtype)
    for i in nb.prange(a.shape[0]):
        for j in range(a.shape[0]):
            b[i][j] = a[i] - a[j]
    return b

@nb.njit(nb.float64[:, :](nb.float64[:]), parallel=True, cache=True)
def f2(a):
    b = np.empty((a.shape[0], a.shape[0]), dtype=a.dtype)
    for i in nb.prange(a.shape[0]):
        for j in range(a.shape[0]):
            b[i][j] = a[i] - a[j]
    return b

def f3(a):
    return a[:, None] - a

if __name__ == '__main__':
    s0 = time.time()
    rnd = np.random.default_rng(12345)
    a = rnd.uniform(0, -50, int(sys.argv[2]))
    b = eval(sys.argv[1] + '(a)')
    print(time.time() - s0)
(base) xxx:~$ python test.py f1 40000
3.0324509143829346
(base) xxx:~$ python test.py f2 40000
0.6196465492248535
(base) xxx:~$ python test.py f3 40000
2.4126882553100586
I faced a similar constraint and needed something fast. By tackling the memory usage and using Numba, I got roughly a 50x speedup even without parallelism. Why are NumPy ufunc methods like np.subtract.outer so fast?