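I am trying to write Python code that plots the number density of a list of numbers, i.e. shows how many of the input numbers fall into each interval of the number line. In the input file "states.dat", I have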
3.5
3.6
and this is the script I wrote:
import numpy as np
import re
import multiprocessing as mp
cpu_read=open('density.pbs', "r")
for line in cpu_read:
    if re.search("select", line):
        cpu = int(re.split('=|:',line)[5])
print("total number of cpu used for parallelization:", cpu)
step = 0.1
start = 3.5
end = 4.50
states = [line.rstrip('\n') for line in open('states.dat')]
states_array=np.array(states).astype(np.float)
nsteps = int((end - start) / step)
print("nsteps\n", nsteps)
final = np.zeros((nsteps+1 ,2), dtype=float)
for i in range (nsteps+1):
    final[(i,0)] = start + i*step
def final_DOS(i):
    print("iteration",i)
    final[(i,1)] = np.count_nonzero((states_array >= final[(i,0)]) & (states_array < final[(i+1,0)]))
    return final[(i,1)]
#for i in range(nsteps):
#    final_DOS(i)
pool = mp.Pool(cpu)
pool.map(final_DOS(i),[i for i in range (nsteps)])
pool.close()
pool.join()
print("completed final array\n", final)
np.savetxt('DOS.txt',final,fmt='%5.5f', delimiter=' ')
print("done!")
This gives me the correct output when I do not use multiprocessing, i.e. with
pool = mp.Pool(cpu)
pool.map(final_DOS(i),[i for i in range (nsteps)])
pool.close()
pool.join()
commented out and
#for i in range(nsteps):
# final_DOS(i)
uncommented. That gives the output
total number of cpu used for parallelization: 2
states_array
[3.5 3.6]
nsteps
10
iteration 0
iteration 1
iteration 2
iteration 3
iteration 4
iteration 5
iteration 6
iteration 7
iteration 8
iteration 9
completed final array
[[3.5 1. ]
[3.6 1. ]
[3.7 0. ]
[3.8 0. ]
[3.9 0. ]
[4. 0. ]
[4.1 0. ]
[4.2 0. ]
[4.3 0. ]
[4.4 0. ]
[4.5 0. ]]
done!
However, when I run the script with multiprocessing turned on, as in the full code shown here, I get the following output and error message:
total number of cpu used for parallelization: 24
nsteps
10000
tail: density.out: file truncated
total number of cpu used for parallelization: 2
states_array
[3.5 3.6]
nsteps
10
final array
[[3.5 0. ]
[3.6 0. ]
[3.7 0. ]
[3.8 0. ]
[3.9 0. ]
[4. 0. ]
[4.1 0. ]
[4.2 0. ]
[4.3 0. ]
[4.4 0. ]
[4.5 0. ]]
iteration 10
Traceback (most recent call last):
File "density.py", line 41, in <module>
pool.map(final_DOS(i),[i for i in range (nsteps)])
File "density.py", line 34, in final_DOS
final[(i,1)] = np.count_nonzero((states_array >= final[(i,0)]) & (states_array < final[(i+1,0)]))
IndexError: index 11 is out of bounds for axis 0 with size 11
I do not understand why, with multiprocessing enabled, I end up with i=10 when range(nsteps) has a maximum value of 9. Any idea why?
Aside from the minor point that the file is never closed (use a context manager):
with open('density.pbs', "r") as cpu_read:
    # your parsing
# Now the file is closed here.
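As a minimal sketch of that suggestion, the parsing loop from the question can be wrapped in a small function that opens the file in a `with` block. The helper name `read_cpu_count` and the sample PBS line are assumptions made for illustration; the real contents of density.pbs are not shown in the question.

```python
import os
import re
import tempfile


def read_cpu_count(path):
    """Return the integer parsed from the last line that contains 'select'."""
    cpu = None
    with open(path, "r") as cpu_read:
        for line in cpu_read:
            if re.search("select", line):
                # Same split as in the question: field 5 after splitting on '=' or ':'
                cpu = int(re.split("=|:", line)[5])
    # The file is closed here, even if parsing raised an exception.
    return cpu


if __name__ == "__main__":
    # Hypothetical PBS directive, written to a temporary file for the demo only.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "density.pbs")
        with open(path, "w") as fh:
            fh.write("#PBS -l select=1:ncpus=24:mpiprocs=24\n")
        print("cpu:", read_cpu_count(path))
```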
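You are not passing the function to map as an argument; you are calling final_DOS with the value i and passing its result instead. The i used there is the one left over from this loop: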
for i in range(nsteps + 1):
    final[(i, 0)] = start + i * step
The last value i takes in that loop is 10, so i is still 10 when you reach the following line:
pool.map(final_DOS(i), [i for i in range(nsteps)])
Inside final_DOS, the index i+1 then goes one step further, to 11, which is out of bounds for an array with 11 rows.
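The leftover loop variable can be checked in isolation. This is only an illustrative sketch; the name `leaked` is introduced here and does not appear in the original script.

```python
nsteps = 10

# Same loop header as in the question; the body is irrelevant here.
for i in range(nsteps + 1):
    pass

# A for loop does not create a new scope, so i keeps its last value.
leaked = i
print("i after the loop:", leaked)

# final_DOS(i) reads row i + 1, which is one past the last valid row (index 10).
print("row accessed by final_DOS(i):", leaked + 1)
```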
To use the map method, note that you have to pass in the function itself, uncalled. In this case that is final_DOS (note: no parentheses).
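A minimal sketch of the corrected call, under a few assumptions: the worker is renamed `count_in_bin` and takes the bin edges and data as arguments, `functools.partial` binds those arguments, the two sample values stand in for states.dat, and the worker count of 2 is arbitrary. The counts are collected from the return value of `pool.map`, because each worker process operates on its own copy of the data, so assignments to a global array inside a worker are not visible in the parent process.

```python
import multiprocessing as mp
from functools import partial

import numpy as np


def count_in_bin(i, edges, states):
    """Count the values in the half-open interval [edges[i], edges[i + 1])."""
    return int(np.count_nonzero((states >= edges[i]) & (states < edges[i + 1])))


if __name__ == "__main__":
    start, step, nsteps = 3.5, 0.1, 10
    states = np.array([3.5, 3.6])            # sample values from the question
    edges = start + step * np.arange(nsteps + 1)

    # Bind the shared arguments; the pool supplies only the bin index i.
    worker = partial(count_in_bin, edges=edges, states=states)

    with mp.Pool(2) as pool:                 # 2 workers, chosen arbitrarily
        # Pass the function object itself, not the result of calling it.
        counts = pool.map(worker, range(nsteps))

    # Same layout as the question: nsteps + 1 rows, last count left at 0.
    final = np.zeros((nsteps + 1, 2), dtype=float)
    final[:, 0] = edges
    final[:nsteps, 1] = counts
    print(final)
```

The `if __name__ == "__main__":` guard keeps the pool from being created again when worker processes import the module, which matters on platforms that start workers by spawning a fresh interpreter.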