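I have a dictionary containing 2D arrays. I am trying to compute the mean this way, but it does not work because the arrays also contain NaN values. Is there a simpler way to compute the mean?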
All = np.zeros(385000).reshape(550,700)
for i in dic.keys():
    a = dic[i]['data']
    avg = (All+a)/len(dic.keys())
The other answer is certainly fine, but np.dstack((a,b)) may not look very direct or intuitive. We can also use np.stack(), which makes the layout easier to read.
import numpy as np
a=np.array([[2,np.nan],[5,4]])
b=np.array([[np.nan,3],[7,2]])
c=np.stack((a,b),axis=0)
print(a)
print('='*50)
print(b)
print('='*50)
print(c)
print('='*50)
print(np.nanmean(c,axis=0))
Output:
[[ 2. nan]
 [ 5.  4.]]
==================================================
[[nan  3.]
 [ 7.  2.]]
==================================================
[[[ 2. nan]
  [ 5.  4.]]

 [[nan  3.]
  [ 7.  2.]]]
==================================================
[[2. 3.]
 [6. 3.]]
The difference between np.dstack() and np.stack() can be seen in the example I wrote below.
dr1=np.array([[1,2,3],[4,5,6],[7,8,9]])
print(dr1)
dr2=np.array([[9,8,7],[6,5,4],[3,2,1]])
print(dr2)
print('='*50)
dr3=np.dstack((dr1,dr2))
print(dr3.shape)
print(dr3)
print(np.sum(dr3,axis=2)) # dr3 is (row,col,time); each printed block is one row, laid out as (col,time). The 1 from dr2 is at 1-based [3,3,2], i.e. dr3[2,2,1]
print('='*50)
dr4=np.stack((dr1,dr2),axis=0) # dr4 is (time,row,col); each printed block is one time step, laid out as (row,col). The 1 from dr2 is at 1-based [2,3,3], i.e. dr4[1,2,2]
print(dr4.shape)
print(dr4)
print(np.sum(dr4,axis=0))
Output:
[[1 2 3]
 [4 5 6]
 [7 8 9]]
[[9 8 7]
 [6 5 4]
 [3 2 1]]
==================================================
(3, 3, 2)
[[[1 9]
  [2 8]
  [3 7]]

 [[4 6]
  [5 5]
  [6 4]]

 [[7 3]
  [8 2]
  [9 1]]]
[[10 10 10]
 [10 10 10]
 [10 10 10]]
==================================================
(2, 3, 3)
[[[1 2 3]
  [4 5 6]
  [7 8 9]]

 [[9 8 7]
  [6 5 4]
  [3 2 1]]]
[[10 10 10]
 [10 10 10]
 [10 10 10]]
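As a quick check on the two layouts compared above, here is a minimal sketch showing that they hold the same data with the stacking axis in a different position, so np.nanmean gives the same result as long as it is taken along that axis. The variable names (depth_last, time_first and so on) are made up for this illustration.

```python
import numpy as np

a = np.array([[2, np.nan], [5, 4]])
b = np.array([[np.nan, 3], [7, 2]])

depth_last = np.dstack((a, b))         # shape (2, 2, 2), laid out as (row, col, time)
time_first = np.stack((a, b), axis=0)  # shape (2, 2, 2), laid out as (time, row, col)

# Moving the last axis to the front turns the dstack layout into the stack layout.
same_layout = np.allclose(np.moveaxis(depth_last, -1, 0), time_first, equal_nan=True)

# Average along whichever axis holds the stacked arrays.
mean_dstack = np.nanmean(depth_last, axis=2)
mean_stack = np.nanmean(time_first, axis=0)

print(same_layout)
print(np.allclose(mean_dstack, mean_stack))
print(mean_stack)
```

Which one to use is mostly a matter of which printed layout you find easier to read.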
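It seems you are trying to find the elementwise mean of two inputs a and b while ignoring NaNs. One way to do that is to stack the two arrays with np.dstack, which stacks a and b along the third axis, and then simply use np.nanmean along that same axis. That gives a simple implementation: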
np.nanmean(np.dstack((a,b)),axis=2)
Sample run:
In [28]: a
Out[28]:
array([[ 2., nan],
       [ 5.,  4.]])
In [29]: b
Out[29]:
array([[ nan,  3.],
       [ 7.,  2.]])
In [30]: np.nanmean(np.dstack((a,b)),axis=2)
Out[30]:
array([[ 2.,  3.],
       [ 6.,  3.]])
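To make it clearer what np.nanmean is doing along the stacking axis, here is a rough sketch that reproduces the sample run by hand: sum the non-NaN values at each position and divide by how many there were. The names valid, total, count and manual are introduced only for this illustration, and the sketch assumes every position has at least one non-NaN value.

```python
import numpy as np

a = np.array([[2, np.nan], [5, 4]])
b = np.array([[np.nan, 3], [7, 2]])

stacked = np.dstack((a, b))      # shape (2, 2, 2), a and b along axis 2

valid = ~np.isnan(stacked)       # True where a real value is present
total = np.where(valid, stacked, 0.0).sum(axis=2)  # NaNs contribute 0 to the sum
count = valid.sum(axis=2)        # number of real values at each position

manual = total / count           # assumes count > 0 everywhere
print(manual)
print(np.allclose(manual, np.nanmean(stacked, axis=2)))
```

In practice np.nanmean is the simpler choice; the manual version is only meant to show why the NaNs do not drag the average down.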
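For the case where the 2D arrays come from a dictionary, as in the code posted in the question, you can use a list comprehension to gather the arrays, stack them into a 3D array with np.dstack, and finally use np.nanmean along the last axis, like this: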
np.nanmean(np.dstack([d['data'] for d in dic.values()]),axis=2)
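For a self-contained version of the dictionary case, the sketch below builds a small stand-in dictionary and averages over it. The keys "t0", "t1", "t2" and the array values are invented for the example; the only thing taken from the question is that each entry keeps its 2D array under 'data'.

```python
import numpy as np

# Hypothetical stand-in for the question's dictionary (the real arrays are 550x700).
dic = {
    "t0": {"data": np.array([[2.0, np.nan], [5.0, 4.0]])},
    "t1": {"data": np.array([[np.nan, 3.0], [7.0, 2.0]])},
    "t2": {"data": np.array([[4.0, 9.0], [np.nan, 6.0]])},
}

# Iterate over the values (the inner dicts), not the keys.
stacked = np.dstack([entry["data"] for entry in dic.values()])  # shape (2, 2, 3)

# Mean over the stacking axis, skipping NaNs at each position.
avg = np.nanmean(stacked, axis=2)

print(stacked.shape)
print(avg)
```

If some position is NaN in every array, np.nanmean returns NaN there and emits a RuntimeWarning, so that case may need separate handling.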