Python numpy sum function with axis behavior



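I am new to Python and numpy, so I have been running sample code and tweaking it to understand it. I came across some code that calls numpy.sum with an axis parameter, but I could not get it to run. After a while (reading the scipy documentation and experimenting), I got it to run by using axis=(1,2,3) instead of axis=1.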

The thing is, everywhere I search, people only write axis=1 to make it work.

I am using Python 3.5.3 with numpy 1.12.1. Is there a numpy/Python version whose behavior differs significantly here? Or have I just misconfigured something?

import numpy as np

# sample data
X = np.arange(1, 4*4*3*5+1).reshape(5, 4, 4, 3)
Y = np.arange(5, 4*4*3*8+5).reshape(8, 4, 4, 3)
Xlen = X.shape[0]
Ylen = Y.shape[0]
# allocate some space for whatever calculation
rs = np.zeros((Xlen, Ylen))
rs1 = np.zeros((Xlen, Ylen))
# calculate the result with 2 loops
for i in range(Xlen):
    for j in range(Ylen):
        rs[i, j] = np.sum(X[i] + Y[j])
# calculate the result with one loop only
for i in range(Xlen):
    rs1[i, :] = np.sum(Y + X[i], axis=(1,2,3))
print(rs1 == rs) # same result
# also with one loop, as everywhere on the internet:
for i in range(Xlen):
    rs1[i, :] = np.sum(Y + X[i], axis=1)
    # ValueError: could not broadcast input array from shape (8,4,3) into shape (8)

The np.sum documentation describes the axis parameter as follows:

axis : None or int or tuple of ints, optional
    ...
    If axis is a tuple of ints, a sum is performed on all of the axes
    specified in the tuple instead of a single axis or all the axes as
    before.

The ability to pass a tuple is a later addition (v1.7, 2013). I haven't used it much; when I needed this in MATLAB, I used repeated sums, e.g.

In [149]: arr = np.arange(24).reshape(2,3,4)
In [150]: arr.sum(axis=(1,2))
Out[150]: array([ 66, 210])
In [151]: arr.sum(axis=2).sum(axis=1)
Out[151]: array([ 66, 210])

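When summing sequentially, you need to remember that the number of axes changes after each sum (unless you use keepdims, which is itself a newer parameter).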
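To make the axis shift concrete, here is a minimal sketch that reuses the same arr as above. The variable names step, no_keep and keep are only illustrative.

```python
import numpy as np

arr = np.arange(24).reshape(2, 3, 4)

# Without keepdims, each sum drops an axis, so the remaining axes are renumbered.
step = arr.sum(axis=2)        # shape (2, 3): old axis 2 is gone
no_keep = step.sum(axis=1)    # shape (2,)

# With keepdims=True, each summed axis stays as a length-1 axis,
# so the original axis numbers remain valid for the next sum.
keep = arr.sum(axis=2, keepdims=True).sum(axis=1, keepdims=True)

print(no_keep)         # [ 66 210]
print(keep.shape)      # (2, 1, 1)
print(keep.ravel())    # [ 66 210]
```

Both routes give the same totals; only the shape of the result differs.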


Your X, Y sum:

In [160]: rs = np.zeros((Xlen, Ylen),int)
     ...: rs1 = np.zeros((Xlen, Ylen),int)
     ...: 
     ...: # calculate the result with 2 loops
     ...: for i in range(Xlen):
     ...:   for j in range(Ylen):
     ...:     rs[i,j] = np.sum(X[i] + Y[j])
     ...: 
In [161]: rs
Out[161]: 
array([[ 2544,  4848,  7152,  9456, 11760, 14064, 16368, 18672],
       [ 4848,  7152,  9456, 11760, 14064, 16368, 18672, 20976],
       [ 7152,  9456, 11760, 14064, 16368, 18672, 20976, 23280],
       [ 9456, 11760, 14064, 16368, 18672, 20976, 23280, 25584],
       [11760, 14064, 16368, 18672, 20976, 23280, 25584, 27888]])

This can be reproduced without any loops.

In [162]: X.sum((1,2,3))
Out[162]: array([ 1176,  3480,  5784,  8088, 10392])
In [163]: Y.sum((1,2,3))
Out[163]: array([ 1368,  3672,  5976,  8280, 10584, 12888, 15192, 17496])
In [164]: X.sum((1,2,3))[:,None] + Y.sum((1,2,3))
Out[164]: 
array([[ 2544,  4848,  7152,  9456, 11760, 14064, 16368, 18672],
       [ 4848,  7152,  9456, 11760, 14064, 16368, 18672, 20976],
       [ 7152,  9456, 11760, 14064, 16368, 18672, 20976, 23280],
       [ 9456, 11760, 14064, 16368, 18672, 20976, 23280, 25584],
       [11760, 14064, 16368, 18672, 20976, 23280, 25584, 27888]])

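np.sum(X[i] + Y[j]) => np.sum(X[i]) + np.sum(Y[j]). This works because X[i] and Y[j] have the same shape, so the sum of their elementwise sum splits into two separate sums. sum(X[i]) sums all elements of X[i] (axis=None). That is the same as summing over every axis except the first and then picking element i: X.sum(axis=(1,2,3))[i].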

In [165]: X[0].sum()
Out[165]: 1176
In [166]: X.sum((1,2,3))[0]
Out[166]: 1176
In [167]: X.sum(1).sum(1).sum(1)[0]
Out[167]: 1176
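As a sanity check, the loop-free expression can be compared against the double loop directly. This is a small self-contained sketch that rebuilds the sample data from the question. The names loop, x_tot, y_tot and vectorized are illustrative.

```python
import numpy as np

X = np.arange(1, 4*4*3*5 + 1).reshape(5, 4, 4, 3)
Y = np.arange(5, 4*4*3*8 + 5).reshape(8, 4, 4, 3)

# Reference result: explicit double loop over the first axes.
loop = np.zeros((X.shape[0], Y.shape[0]), dtype=int)
for i in range(X.shape[0]):
    for j in range(Y.shape[0]):
        loop[i, j] = np.sum(X[i] + Y[j])

# Loop-free: per-sample totals, combined by broadcasting (5, 1) + (8,) -> (5, 8).
x_tot = X.sum(axis=(1, 2, 3))
y_tot = Y.sum(axis=(1, 2, 3))
vectorized = x_tot[:, None] + y_tot

print(np.array_equal(loop, vectorized))   # True
```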

As for the broadcasting error, look at the pieces:

In [168]: rs1[i,:]
Out[168]: array([0, 0, 0, 0, 0, 0, 0, 0])   # shape (8,)
In [169]: (Y+X[i]).shape    # (8,4,4,3) + (4,4,3)
Out[169]: (8, 4, 4, 3)
In [170]: (Y+X[i]).sum(1).shape    # sums axis 1, ie one of the 4's
Out[170]: (8, 4, 3)
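The mismatch can be reproduced in isolation. In this sketch, x0 stands in for X[0] and row stands in for rs1[i, :]. Both names are illustrative, and the exact wording of the error message may differ between numpy versions.

```python
import numpy as np

Y = np.arange(5, 4*4*3*8 + 5).reshape(8, 4, 4, 3)
x0 = np.arange(1, 4*4*3 + 1).reshape(4, 4, 3)   # same values as X[0]
row = np.zeros(8, dtype=int)                     # same shape as rs1[i, :]

partial = (Y + x0).sum(axis=1)         # removes only one axis
full = (Y + x0).sum(axis=(1, 2, 3))    # removes every axis except the first
print(partial.shape, full.shape)       # (8, 4, 3) (8,)

row[:] = full                          # shapes match, so the assignment works
try:
    row[:] = partial                   # (8, 4, 3) does not fit into (8,)
    failed = False
except ValueError as exc:
    failed = True
    print("ValueError:", exc)
```

So axis=1 alone only works when the array being summed has exactly two axes.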

To get the same result while writing only axis=1, we can reshape the data beforehand so that each sample becomes a single flat row.

# flatten every axis except the first: (5, 4, 4, 3) -> (5, 48), (8, 4, 4, 3) -> (8, 48)
X = np.reshape(X, (X.shape[0], -1))
Y = np.reshape(Y, (Y.shape[0], -1))
for i in range(Xlen):
    rs[i, :] = np.sum(Y + X[i], axis=1)
print(rs)

Result:

[[  2544.   4848.   7152.   9456.  11760.  14064.  16368.  18672.]
 [  4848.   7152.   9456.  11760.  14064.  16368.  18672.  20976.]
 [  7152.   9456.  11760.  14064.  16368.  18672.  20976.  23280.]
 [  9456.  11760.  14064.  16368.  18672.  20976.  23280.  25584.]
 [ 11760.  14064.  16368.  18672.  20976.  23280.  25584.  27888.]]
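To confirm that the reshape trick and the tuple form agree, the following sketch runs both on the sample data from the question. The names X2, Y2, rs_flat and rs_tuple are illustrative, and the result arrays use an integer dtype here, so the values print without the trailing dots.

```python
import numpy as np

X = np.arange(1, 4*4*3*5 + 1).reshape(5, 4, 4, 3)
Y = np.arange(5, 4*4*3*8 + 5).reshape(8, 4, 4, 3)

# Flatten every axis except the first, leaving one row per sample.
X2 = X.reshape(X.shape[0], -1)   # (5, 48)
Y2 = Y.reshape(Y.shape[0], -1)   # (8, 48)

rs_flat = np.zeros((5, 8), dtype=int)
rs_tuple = np.zeros((5, 8), dtype=int)
for i in range(X.shape[0]):
    rs_flat[i, :] = np.sum(Y2 + X2[i], axis=1)           # 2-D data, single axis
    rs_tuple[i, :] = np.sum(Y + X[i], axis=(1, 2, 3))    # 4-D data, tuple of axes

print(np.array_equal(rs_flat, rs_tuple))   # True
```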
