Embarrassingly parallel problem in Python



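I have 634 *.npy files, each containing a 2D numpy array of shape (8194, 76). I want to run an STL decomposition on every column five times, each time with a different frequency. So what I want to do is: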

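import numpy as np
from statsmodels.tsa.seasonal import STL  # assuming the statsmodels STL

for file in files:
    data = np.load(file)  # array of shape (8194, 76)
    for column in range(data.shape[1]):
        for freq in frequencies:
            res = STL(data[:, column], period=freq).fit()
            # one column each for trend, seasonal and residual
            decomposed = np.vstack((res.trend, res.seasonal, res.resid)).T
            np.save(out_path, decomposed)  # out_path: output file name, not specified here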

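The final decomposed array for each file should have shape (8194, 1140), that is, 76 columns x 5 frequencies x 3 components. How can I parallelize this? A serial implementation would take more than 2 months to run.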

You can do it like this:

from concurrent.futures import ProcessPoolExecutor

FILES = ["a", "b", "c", "d", "e", "f", "g", "h"]

def simulate_cpu_bound(file):
    2 ** 100000000  # cpu heavy task
    # or just use time.sleep(n), where n - number of seconds
    return file

if __name__ == '__main__':
    with ProcessPoolExecutor(8) as f:
        res = f.map(simulate_cpu_bound, FILES)
        res = list(res)
    print(res)
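
To connect the pattern above to the actual question, here is a rough sketch with one task per file, which seems like a natural unit of work when there are 634 independent files. Several things here are my assumptions and not from the question: the period values in PERIODS, the data/ input folder, the decomposed/ output folder, and that STL is the one from statsmodels. The decompose argument is only there so the per-column step can be swapped out or tried with a cheap stand-in.

```python
import glob
import os
from concurrent.futures import ProcessPoolExecutor

import numpy as np

PERIODS = [7, 30, 91, 182, 365]  # hypothetical: the question does not list its five periods


def stl_components(series, period):
    """Return an (n, 3) array holding trend, seasonal and residual of one column."""
    # Assumes statsmodels is installed; imported inside the function so this
    # module still imports on machines where statsmodels is missing.
    from statsmodels.tsa.seasonal import STL
    res = STL(series, period=period).fit()
    return np.column_stack((res.trend, res.seasonal, res.resid))


def decompose_file(path, periods=PERIODS, decompose=stl_components,
                   out_dir="decomposed"):
    """Decompose every column of one .npy file and save the stacked result."""
    data = np.load(path)
    parts = [decompose(data[:, col], p)
             for col in range(data.shape[1])
             for p in periods]
    result = np.hstack(parts)  # shape: (rows, n_columns * len(periods) * 3)
    os.makedirs(out_dir, exist_ok=True)
    out_path = os.path.join(out_dir, os.path.basename(path))  # hypothetical naming
    np.save(out_path, result)
    return out_path


if __name__ == "__main__":
    files = sorted(glob.glob("data/*.npy"))  # hypothetical input folder
    with ProcessPoolExecutor() as pool:  # defaults to one worker per CPU
        for out_path in pool.map(decompose_file, files):
            print("saved", out_path)
```

Each worker loads, decomposes and saves its own file, so only file paths travel between processes. The best case is a speedup close to the number of CPU cores, so whether that is enough depends on how many cores you have.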
