How can I speed up a "for" loop that indexes with lists? (Python)



I am trying to speed up this code by using NumPy functions or vectorization instead of a for loop:

sommes = []
for j in range(vertices.shape[0]):
    terme = new_vertices[j] - new_vertices[vertex_neighbors[j]]
    somme_j = np.sum(terme)
    sommes.append(somme_j)
E_int = np.sum(sommes)

(This is part of an iterative algorithm with many vertices, so I think the for loop takes too long.)

For example, to compute terme when j = 0, I have:

In: new_vertices[0]
Out: array([ 10.2533888 , -42.32279717,  68.27230793])
In: vertex_neighbors[0]
Out: [1280, 2, 1511, 511, 1727, 1887, 759, 509, 1023]
In: new_vertices[vertex_neighbors[0]]
Out: array([[ 10.47121043, -42.00123956,  68.218715  ],
[ 10.2533888 , -43.26905874,  62.59473849],
[ 10.69773735, -41.26464083,  68.09594854],
[ 10.37030712, -42.16729601,  68.24639107],
[ 10.12158146, -42.46624547,  68.29621598],
[  9.81850836, -42.71158695,  68.33710623],
[  9.97615447, -42.59625943,  68.31788497],
[ 10.37030712, -43.11676015,  62.54960623],
[ 10.55512696, -41.82622703,  68.18954624]])
In: new_vertices[0] - new_vertices[vertex_neighbors[0]]
Out: array([[-0.21782162, -0.32155761,  0.05359293],
[ 0.        ,  0.94626157,  5.67756944],
[-0.44434855, -1.05815634,  0.17635939],
[-0.11691832, -0.15550116,  0.02591686],
[ 0.13180734,  0.1434483 , -0.02390805],
[ 0.43488044,  0.38878979, -0.0647983 ],
[ 0.27723434,  0.27346227, -0.04557704],
[-0.11691832,  0.79396298,  5.7227017 ],
[-0.30173816, -0.49657014,  0.08276169]])

The problem is that new_vertices[vertex_neighbors[j]] does not always have the same size. For example, when j = 7:

In: new_vertices[7]
Out: array([ 10.74106112, -63.88592276, -70.15593947])
In: vertex_neighbors[7]
Out: [1546, 655, 306, 1879, 920, 925]
In: new_vertices[vertex_neighbors[7]]
Out: array([[  9.71830698, -69.07323638, -83.10229623],
[ 10.71123017, -64.06983438, -70.09345104],
[  9.74836003, -68.88820555, -83.16187474],
[ 10.78982867, -63.70552665, -70.2169896 ],
[  9.74627177, -60.87823935, -60.13032811],
[  9.79419242, -60.69528267, -60.182843  ]])
In: new_vertices[7] - new_vertices[vertex_neighbors[7]]
Out: array([[  1.02275414,   5.18731363,  12.94635676],
[  0.02983095,   0.18391163,  -0.06248843],
[  0.99270108,   5.0022828 ,  13.00593527],
[ -0.04876756,  -0.18039611,   0.06105013],
[  0.99478934,  -3.00768341, -10.02561137],
[  0.94686869,  -3.19064009,  -9.97309648]])

Is it possible without a for loop? I am running out of ideas, so any help would be greatly appreciated!

Thank you.

Yes, it is possible. The idea is to use np.repeat to build an array in which each item is repeated a variable number of times. Here is the code:

# The two following lines can be done only once if the indices are constant between iterations (precomputation)
counts = np.array([len(e) for e in vertex_neighbors])
flatten_indices = np.concatenate(vertex_neighbors)
E_int = np.sum(np.repeat(new_vertices, counts, axis=0) - new_vertices[flatten_indices])
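To see why this lines up, here is a minimal sketch on toy data (hypothetical values, 1-D "vertices" for readability): each vertex is repeated once per neighbor, so the repeated array and the flat neighbor lookup have the same shape and can be subtracted element-wise.

```python
import numpy as np

# Toy data: 3 "vertices" in 1-D, with ragged neighbor lists (hypothetical values)
new_vertices = np.array([[10.0], [20.0], [30.0]])
vertex_neighbors = [[1, 2], [0], [0, 1, 2]]

counts = np.array([len(e) for e in vertex_neighbors])  # [2, 1, 3]
flatten_indices = np.concatenate(vertex_neighbors)     # [1, 2, 0, 0, 1, 2]

# Vertex j is repeated counts[j] times, matching its flattened neighbors
repeated = np.repeat(new_vertices, counts, axis=0)
diff = repeated - new_vertices[flatten_indices]
E_int = np.sum(diff)  # same value as the per-vertex loop would give
```

This reproduces exactly the per-j subtractions of the original loop, just stacked into one array.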

Here is a benchmark:

import numpy as np
from time import time

n = 32768
vertices = np.random.rand(n, 3)
indices = []
count = np.random.randint(1, 10, size=n)
for i in range(n):
    indices.append(np.random.randint(0, n, size=count[i]))

def initial_version(vertices, vertex_neighbors):
    sommes = []
    for j in range(vertices.shape[0]):
        terme = vertices[j] - vertices[vertex_neighbors[j]]
        somme_j = np.sum(terme)
        sommes.append(somme_j)
    return np.sum(sommes)

def optimized_version(vertices, vertex_neighbors):
    # The two following lines can be precomputed
    counts = np.array([len(e) for e in vertex_neighbors])
    flatten_indices = np.concatenate(vertex_neighbors)
    return np.sum(np.repeat(vertices, counts, axis=0) - vertices[flatten_indices])

def more_optimized_version(vertices, vertex_neighbors, counts, flatten_indices):
    return np.sum(np.repeat(vertices, counts, axis=0) - vertices[flatten_indices])

timesteps = 20

a = time()
for t in range(timesteps):
    res = initial_version(vertices, indices)
b = time()
print("V1: time:", b - a)
print("V1: result", res)

a = time()
for t in range(timesteps):
    res = optimized_version(vertices, indices)
b = time()
print("V2: time:", b - a)
print("V2: result", res)

a = time()
counts = np.array([len(e) for e in indices])
flatten_indices = np.concatenate(indices)
for t in range(timesteps):
    res = more_optimized_version(vertices, indices, counts, flatten_indices)
b = time()
print("V3: time:", b - a)
print("V3: result", res)

Here are the benchmark results on my machine:

V1: time: 3.656714916229248
V1: result -395.8416223057596
V2: time: 0.19800186157226562
V2: result -395.8416223057595
V3: time: 0.07983255386352539
V3: result -395.8416223057595

As you can see, the optimized version is 18 times faster than the reference implementation, and the one with precomputed indices is 46 times faster.

Note that the optimized version is expected to need more RAM (especially if the number of neighbors per vertex is large).
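If RAM becomes a concern and only the scalar E_int is needed, the double sum can also be factored algebraically: sum_j sum_k (v_j - v_{n_jk}) equals sum_j counts_j * v_j minus the sum over the flattened neighbor array, so the large repeated array never has to be materialized. A sketch of this reformulation (my own observation, not part of the answer above), on toy data:

```python
import numpy as np

# Toy data (hypothetical values); same ragged-neighbor layout as before
vertices = np.array([[10.0], [20.0], [30.0]])
vertex_neighbors = [[1, 2], [0], [0, 1, 2]]

counts = np.array([len(e) for e in vertex_neighbors])
flatten_indices = np.concatenate(vertex_neighbors)

# Each vertex contributes counts[j] copies of itself to the sum, so weight it
# by its neighbor count instead of physically repeating it with np.repeat
E_int = np.sum(vertices * counts[:, None]) - np.sum(vertices[flatten_indices])
```

This only works because the final reduction is a plain sum; if the per-vertex sums themselves are needed later in the algorithm, the np.repeat version above is the one to use.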
