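I am using Python threads. A thread spawns further threads, one for each file name found by os.walk(), and each of them runs a small utility on the file and collects its output. I tried to limit the number of threads with:
ThreadLimiter = threading.BoundedSemaphore(3)
and
ThreadLimiter.acquire()
at the start of the run method, and
ThreadLimiter.release()
at the end of the run method.
But when I run the Python program I still get the error messages below. Any suggestions for improvement?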
bash: fork: retry: Resource temporarily unavailable
bash: fork: retry: Resource temporarily unavailable
Using a thread pool will save you a lot of work! Here is one that runs md5sum on every file:
import os
import multiprocessing.pool
import subprocess as subp


def walker(path):
    """Walk the file system, yielding file names"""
    for dirpath, dirs, files in os.walk(path):
        for fn in files:
            yield os.path.join(dirpath, fn)


def worker(filename):
    """Get the md5 sum of a file"""
    p = subp.Popen(['md5sum', filename], stdin=subp.PIPE,
                   stdout=subp.PIPE, stderr=subp.PIPE)
    out, err = p.communicate()
    return filename, p.returncode, out, err


# Three worker threads, so at most three md5sum processes run at once.
pool = multiprocessing.pool.ThreadPool(3)
for filename, returncode, out, err in pool.imap(worker, walker('.'), chunksize=1):
    print(filename, out.strip())
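If you would rather keep the semaphore approach, here is a rough sketch of one way it could work. When acquire() is called inside run(), as the question describes, each thread has already been started by the time it blocks, so the number of threads still grows with the number of files, which is probably what triggers the fork error. The sketch acquires the semaphore in the spawning loop instead, before the thread is created. The names run_all and max_threads are made up for this example, and the command in the usage part is only a stand-in for the real utility.

```python
import subprocess
import sys
import threading


def run_all(commands, max_threads=3):
    """Run each command in its own thread, holding a semaphore slot per thread."""
    limiter = threading.BoundedSemaphore(max_threads)
    results = []
    results_lock = threading.Lock()

    def worker(argv):
        try:
            proc = subprocess.run(argv, stdout=subprocess.PIPE,
                                  stderr=subprocess.PIPE)
            with results_lock:
                results.append((argv, proc.returncode, proc.stdout))
        finally:
            limiter.release()  # free the slot even if the command fails

    threads = []
    for argv in commands:
        limiter.acquire()  # blocks here, BEFORE a new thread is created
        t = threading.Thread(target=worker, args=(argv,))
        try:
            t.start()
        except RuntimeError:
            limiter.release()  # the thread never ran, so give the slot back
            raise
        threads.append(t)
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Stand-in commands (an assumption): each one just prints its index.
    cmds = [[sys.executable, "-c", "print(%d)" % i] for i in range(10)]
    for argv, returncode, out in run_all(cmds):
        print(argv[-1], returncode, out.strip())
```

Results come back in completion order rather than submission order, and a worker that has just released its slot may overlap briefly with the next one starting up. The thread pool above handles both points for you, which is why I would still reach for it first.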