如何在不使用double for循环的情况下加快矩阵的构造



我想加速以下双for循环:

n = 10000
A = np.zeros((n, n))
twopi = 2.0 * np.pi
for i in range(0, n):
for j in range(0, n):
A[i,j] = twopi * i*j/n

我该怎么做?我的想法是:计算一个包含从0n-1的所有元素的向量v

v = np.arange(n) 

然后构造两个矩阵IJ,一个具有等于v的所有行,另一个具有与v相等的所有列。如果我没有错的话,那么我可以用代替双for循环

A = twopi * I*J/n

这是正确的吗?如何构建IJ?有更好的方法吗?

类似这样的东西:

from numpy import outer, pi, arange
n = 10
i = arange(n)
A = 2 * pi / n * outer(i, i)

outer产品满足您的要求。

编辑:测试性能的简单方法如下:

from time import time
n = 10000
t = time()
for _ in range(10):
i = arange(n)
A = 2 * pi / n * outer(i, i)
t = time() - t
print("Time used [s]", t)

对我来说,这10次重复需要5秒,rangearange之间没有显著差异。

答案

你确实可以使用numpy的广播能力。

import numpy as np
n = 10000
twopi = 2.0 * np.pi / n
A = np.arange(n) * np.arange(n).reshape(-1, 1) * twopi

第二个CCD_ 16的重塑推动了产品的传播。

这与V博士调用np.outer()非常相似,即使从时间性能的角度来看也是如此。

基准

配置:

import numpy as np
n = 10000
twopi = 2 * np.pi / n

次数:

# List comprehension
A = np.array([   
[twopi*i*j for j in range(n)] 
for i in range(n)])
# timeit > 10.2 s ± 129 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
# Outer
i = np.arange(n)
A = twopi * np.outer(i, i)
# timeit > 234 ms ± 6.87 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
# Product
A = np.arange(n) * np.arange(n).reshape(-1, 1) * twopi
# timeit > 208 ms ± 41.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

如前所述,外部方法和产品方法非常相似。但列表理解并不是因为它违背了numpy的目的,numpy是通过分布式子任务(通常还有低级优化(来向量化数组计算。

您可以使用列表综合:

A = np.array([   
[twopi*i*j/n for j in range(n)] 
for i in range(n)])

默认情况下,range从零开始,因此range(0,n)而不是range(n)是多余的。

In [118]: matrix = np.array([np.arange(n) for _ in range(n)])
In [119]: matrix
Out[119]:
array([[   0,    1,    2, ..., 9997, 9998, 9999],
[   0,    1,    2, ..., 9997, 9998, 9999],
[   0,    1,    2, ..., 9997, 9998, 9999],
...,
[   0,    1,    2, ..., 9997, 9998, 9999],
[   0,    1,    2, ..., 9997, 9998, 9999],
[   0,    1,    2, ..., 9997, 9998, 9999]])
In [120]: matrix [0,1]
Out[120]: 1
In [121]: for j in range(n): matrix[:,j] *= j
In [122]: matrix
Out[122]:
array([[       0,        1,        4, ..., 99940009, 99960004, 99980001],
[       0,        1,        4, ..., 99940009, 99960004, 99980001],
[       0,        1,        4, ..., 99940009, 99960004, 99980001],
...,
[       0,        1,        4, ..., 99940009, 99960004, 99980001],
[       0,        1,        4, ..., 99940009, 99960004, 99980001],
[       0,        1,        4, ..., 99940009, 99960004, 99980001]])
In [123]: matrix = matrix/n
In [124]: matrix = matrix * np.pi*2
In [125]: matrix
Out[125]:
array([[0.00000000e+00, 6.28318531e-04, 2.51327412e-03, ...,
6.27941596e+04, 6.28067228e+04, 6.28192873e+04],
[0.00000000e+00, 6.28318531e-04, 2.51327412e-03, ...,
6.27941596e+04, 6.28067228e+04, 6.28192873e+04],
[0.00000000e+00, 6.28318531e-04, 2.51327412e-03, ...,
6.27941596e+04, 6.28067228e+04, 6.28192873e+04],
...,
[0.00000000e+00, 6.28318531e-04, 2.51327412e-03, ...,
6.27941596e+04, 6.28067228e+04, 6.28192873e+04],
[0.00000000e+00, 6.28318531e-04, 2.51327412e-03, ...,
6.27941596e+04, 6.28067228e+04, 6.28192873e+04],
[0.00000000e+00, 6.28318531e-04, 2.51327412e-03, ...,
6.27941596e+04, 6.28067228e+04, 6.28192873e+04]])

与脚本相同:

matrix = np.array([np.arange(n) for _ in range(n)])
for j in range(n): matrix[:,j] *= j
matrix = matrix/n
matrix = matrix * np.pi*2

最新更新