Are NumPy's basic operations vectorized, i.e. do they use SIMD operations?

I am doing some performance analysis, and I wonder whether NumPy vectorizes its standard array operations when the datatype is known (double).

import numpy as np
a, b = np.random.rand(2, 1_000_000)  # any two float64 (double) arrays
c = a + b  # Is this vectorized?

Edit: Is this operation vectorized, i.e. will the computation consist of SIMD operations?

Yes, they are.

/*
* This file is for the definitions of simd vectorized operations.
*
* Currently contains sse2 functions that are built on amd64, x32 or
* non-generic builds (CFLAGS=-march=...)
* In future it may contain other instruction sets like AVX or NEON detected
* at runtime in which case it needs to be included indirectly via a file
* compiled with special options (or use gcc target attributes) so the binary
* stays portable.
*/

Link: NumPy simd.inc.src on GitHub.
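To see what a particular build reports on your own machine, something like the sketch below may help. It relies on `__cpu_features__`, a private, undocumented dictionary that NumPy's own test suite reads; its module path has moved between releases and it could change or disappear, so treat this as an assumption rather than a supported API. The helper name `detected_simd_features` is made up for this example.

```python
import importlib


def detected_simd_features():
    """Return the CPU feature names NumPy reports as available, or None.

    Relies on the private attribute __cpu_features__ (name -> bool),
    which is not part of NumPy's public API.
    """
    # NumPy 2.x keeps the module under numpy._core, NumPy 1.x under numpy.core.
    for modname in ("numpy._core._multiarray_umath",
                    "numpy.core._multiarray_umath"):
        try:
            mod = importlib.import_module(modname)
        except ImportError:
            continue
        features = getattr(mod, "__cpu_features__", None)
        if features is not None:
            return sorted(name for name, ok in features.items() if ok)
    return None


features = detected_simd_features()
print(features)
```

On recent NumPy versions, the public `np.show_config()` also prints build information, which is the safer thing to rely on.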

I noticed a comment by Quazi Irfan on henrikstroem's answer saying that NumPy does not exploit vectorization, citing a blog whose author "proved" it by experiment.

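So I went through the blog and found a gap that may lead to a different conclusion: for NumPy arrays A and B, the arithmetic A*B is not the same as np.dot(A, B). The A*B that the blog author tested is just element-wise multiplication, not matrix multiplication (np.dot(A, B)), nor even a vector inner product. Yet the author still compared A*B against a run of np.dot(A, B). The complexity of the two operations is very different: for n-by-n arrays, A*B needs n^2 multiplications, while a textbook matrix product needs on the order of n^3.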
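A small illustration of that difference, using two made-up 2-by-2 arrays (the values are arbitrary and only serve the example):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])
B = np.array([[5.0, 6.0],
              [7.0, 8.0]])

# Element-wise product: each entry is A[i, j] * B[i, j].
elementwise = A * B
# Matrix product: each entry is a row of A dotted with a column of B.
matrix_product = np.dot(A, B)

print(elementwise)      # [[ 5. 12.] [21. 32.]]
print(matrix_product)   # [[19. 22.] [43. 50.]]
```

Timing one against the other therefore compares two different amounts of work, not two implementations of the same operation.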

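NumPy certainly takes advantage of SIMD and BLAS vectorization, as can be seen in its source code. The official NumPy distribution parallelizes a set of operations (such as np.dot), but not every function (such as np.where or np.mean). The blog author may simply have picked an unsuitable (non-vectorized) function for the comparison.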

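We can also see this in multi-core CPU usage: while numpy.dot() runs, all cores show high utilization. So numpy.dot() must be handed off to optimized native code (BLAS) that runs outside CPython's GIL; otherwise it could only use a single core. Strictly speaking this demonstrates multithreading rather than SIMD, but it confirms that the heavy lifting is done by an optimized library and not by the Python interpreter.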
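One rough way to reproduce the multi-core observation without a system monitor is to compare CPU time with wall-clock time: a ratio well above 1 suggests several threads were busy at once. This is only a sketch; the helper name `cpu_to_wall_ratio`, the 1000-by-1000 size, and the repeat count are arbitrary choices for the example, and the numbers depend on which BLAS your NumPy is linked against and how many threads it is allowed to use.

```python
import time

import numpy as np


def cpu_to_wall_ratio(func, repeats=5):
    """Run func several times; return process CPU time divided by wall time."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()   # CPU time summed over all threads
    for _ in range(repeats):
        func()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall if wall > 0 else float("nan")


rng = np.random.default_rng(0)
a = rng.random((1000, 1000))
b = rng.random((1000, 1000))

dot_ratio = cpu_to_wall_ratio(lambda: np.dot(a, b))   # goes through BLAS
add_ratio = cpu_to_wall_ratio(lambda: a + b)          # plain element-wise ufunc

print(f"np.dot  CPU/wall ratio: {dot_ratio:.2f}")
print(f"a + b   CPU/wall ratio: {add_ratio:.2f}")
```

A ratio near 1 for `a + b` does not mean it avoids SIMD; SIMD works inside a single core and does not show up in this measurement.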
