Strange numpy divide behaviour for scalars

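I've been trying to upgrade a library that has a set of geometric operations for scalars so that they also work with numpy arrays. While doing this I noticed some strange behaviour with numpy's divide.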

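The original code computes a normalised difference between two variables, but only if neither variable is zero. Ported to numpy, it ended up looking like this: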

import numpy as np
a = np.array([0, 1, 2, 3, 4])
b = np.array([1, 2, 3, 0, 4])
o = np.zeros(len(a))
o = np.divide(np.subtract(a, b), b, out=o, where=np.logical_and(a != 0, b != 0))
print(f'First implementation: {o}')

For the entries that cannot be computed, I pass in an output buffer initialised to zero. This returns:

First implementation: [ 0.         -0.5        -0.33333333  0.          0.        ]

Since out requires an array, I had to modify this slightly for scalars, but it seemed to work fine.

a = 0
b = 4
o = None if np.isscalar(a) else np.zeros(len(a))
o = np.divide(np.subtract(a, b), b, out=o, where=np.logical_and(b != 0, a != 0))
print(f'Modified out for scalar: {o}')

Modified out for scalar: 0.0.

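I then ran this through some test functions and found that a lot of them failed. Digging into it, I found that the first time I call divide on scalars with where set to False, the function returns zero, but if I call it again, the second call returns something unpredictable.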

a = 0
b = 4
print(f'First divide: {np.divide(b, a, where=False)}')
print(f'Second divide: {np.divide(b, a, where=False)}')

First divide: 0.0
Second divide: 4.0

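Looking at the documentation, it says "locations within it where the condition is False will remain uninitialized", so I guess numpy has an internal buffer that is initially zero and later ends up carrying over an earlier intermediate value.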
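As a quick check of that reading, here is a minimal sketch. It assumes that a zero-dimensional array from np.zeros(()) is acceptable as out for scalar inputs; the names first and second are just for illustration. With an initialised buffer there is nothing left uninitialised, so repeated calls should agree:

```python
import numpy as np

a = 0
b = 4

# A zero-dimensional array gives np.divide an initialised place to write to.
out = np.zeros(())

# where=False means no division is performed, so out keeps its initial value.
first = np.divide(b, a, out=out, where=False)
second = np.divide(b, a, out=out, where=False)
print(f'First divide: {first}')
print(f'Second divide: {second}')
```

Note that the result is then a zero-dimensional array rather than a Python float, which may or may not matter to the calling code.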

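I'm struggling to see how I can use divide either with or without a where clause: if I use where I get unpredictable output, and if I don't I can't protect against division by zero. Am I missing something, or do I just need a different code path for these cases? I realise I'm already halfway to a separate code path with the out variable.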

Any advice would be much appreciated.

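This looks like a bug to me. But I think you would want to short-circuit the call to the ufunc in the scalar case anyway, for performance reasons, so it is a matter of keeping that from getting too messy. Since either a or b could be a scalar, you need to check both. Put that check into a function that conditionally returns an output array or None, and you could do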

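import numpy as np

def scalar_test_np_zeros(a, b):
    """Return np.zeros for the length of arguments unless both
    arguments are scalar, then None."""
    # Parenthesise the walrus so a_is records only whether a is a scalar.
    if (a_is := np.isscalar(a)) and np.isscalar(b):
        return None
    else:
        # Take the length from whichever argument is not a scalar.
        return np.zeros(len(b) if a_is else len(a))

a = 0
b = 4
# Parenthesise the walrus so o holds the buffer, not the result of "is None".
if (o := scalar_test_np_zeros(a, b)) is None:
    o = (a - b) / b if a and b else 0.0
else:
    np.divide(np.subtract(a, b), b, out=o,
              where=np.logical_and(b != 0, a != 0))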

The scalar test would also be useful in other code with similar problems.
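Another option, not part of the suggestion above, is to compute the quotient everywhere and mask afterwards with np.where, which avoids the where argument of np.divide altogether. A minimal sketch, assuming it is acceptable to convert the inputs to float arrays and to silence the divide-by-zero warnings for entries that get discarded (normalised_difference is a hypothetical name):

```python
import numpy as np

def normalised_difference(a, b):
    """Hypothetical helper: (a - b) / b where both inputs are non-zero, else 0.0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Silence the warnings from dividing by zero; those entries are discarded below.
    with np.errstate(divide="ignore", invalid="ignore"):
        quotient = (a - b) / b
    # Keep the quotient only where both inputs are non-zero.
    return np.where((a != 0) & (b != 0), quotient, 0.0)

print(normalised_difference(0, 4))
print(normalised_difference([0, 1, 2, 3, 4], [1, 2, 3, 0, 4]))
```

The trade-off is that the division is evaluated for every element, including the ones that are thrown away, so this is simpler rather than faster.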

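For what it's worth, and in case it helps anyone, I came to the conclusion that I need to wrap np.divide to use it safely in functions that can take both arrays and scalars. This is my wrapper function: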

import numpy as np

def divide_where(a, b, where, out=None, fill=0):
    """Wrap np.divide to safely handle the where clause for both arrays and scalars.

    - a: dividend array or scalar
    - b: divisor array or scalar
    - where: locations where a / b is computed and written
    - out: location where data is written to; if None, an output array is created and filled with fill
    - fill: value used wherever where is False; returned directly in the all-scalar case, otherwise used to initialise the output array when out is not set
    """
    # Scalars skip the ufunc entirely, so no uninitialised value can leak out.
    if (a_is_scalar := np.isscalar(a)) and np.isscalar(b):
        return fill if not where else a / b
    if out is None:
        # A float buffer, since np.divide cannot write its result into an integer array.
        out = np.full_like(b if a_is_scalar else a, fill, dtype=float)
    return np.divide(a, b, out=out, where=where)
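A possible variant, offered only as a sketch: size the output buffer from the broadcast shape of the inputs, so that scalars and arrays go through the same path. This assumes a zero-dimensional float array is an acceptable return value for scalar inputs; divide_where_broadcast is a hypothetical name, not something numpy provides.

```python
import numpy as np

def divide_where_broadcast(a, b, where, fill=0.0):
    """Hypothetical variant: a / b where `where` is True, `fill` elsewhere."""
    # np.broadcast accepts scalars too, in which case the shape is ().
    shape = np.broadcast(a, b, where).shape
    # Always start from an initialised float buffer so nothing is left undefined.
    out = np.full(shape, fill, dtype=float)
    return np.divide(a, b, out=out, where=where)

# Scalars: repeated calls give the same answer because out is always initialised.
print(divide_where_broadcast(4, 0, where=False))
print(divide_where_broadcast(4, 0, where=False))

# Arrays: only the entries selected by the mask are divided.
x = np.array([1, 2, 3])
y = np.array([1, 0, 3])
print(divide_where_broadcast(x, y, where=(y != 0)))
```

This costs one small allocation per call, so the explicit scalar short-circuit is probably still preferable where performance on scalars matters.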
