multiprocessing.Process problem in Python 2 vs. Python 3



I am trying to understand what changed between Python 2 and Python 3 in the multiprocessing module. On Python 2, this code runs like a charm:

import multiprocessing

def RunPrice(items, price):
    print("There is %s items, price is: %s" % (items, price))

def GetTargetItemsAndPrice(cursor):
    res = cursor.execute("SELECT DISTINCT items, price FROM SELLS")
    threads = []
    # Start one process per row, then wait for all of them to finish.
    for row in res.fetchall():
        p = multiprocessing.Process(target=RunPrice, args=(row[0], row[1]))
        threads.append(p)
        p.start()
    for proc in threads:
        proc.join()

Suppose there are 2000 entries in SELLS to process. On Python 2, this script runs and exits as expected. On Python 3, I get:

  File "/usr/lib/python3.8/multiprocessing/popen_fork.py", line 69, in _launch
    child_r, parent_w = os.pipe()
OSError: [Errno 24] Too many open files

Any idea what changed between Python 2 and Python 3?

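I assume that your actual RunPrice function is considerably more CPU-intensive than the one you show. Otherwise, this would not be a good candidate for multiprocessing. If RunPrice is very CPU-intensive and does not relinquish the CPU to wait for I/O to complete, then a processing pool with more processes than you have CPU cores will not be advantageous, considering that creating processes is not a particularly cheap operation (although certainly not as expensive as it would be on Windows). Either way, a pool keeps the number of processes alive at any one time bounded: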

from multiprocessing import Pool

def RunPrice(items, price):
    print("There is %s items, price is: %s" % (items, price))

def GetTargetItemsAndPrice(cursor):
    res = cursor.execute("SELECT DISTINCT items, price FROM SELLS")
    rows = res.fetchall()
    MAX_POOL_SIZE = 1024
    # if RunPrice is very CPU-intensive, it may not pay to have a pool size
    # greater than the number of CPU cores you have. In that case:
    #from multiprocessing import cpu_count
    #MAX_POOL_SIZE = cpu_count()
    pool_size = min(MAX_POOL_SIZE, len(rows))
    with Pool(pool_size) as pool:
        # return values from RunPrice:
        results = pool.starmap(RunPrice, [(row[0], row[1]) for row in rows])
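
For anyone who wants to try the pooled approach end to end, here is a minimal self-contained sketch. The traceback fails inside os.pipe() in popen_fork._launch, which suggests that every started Process costs the parent some pipe file descriptors, so 2000 simultaneously live processes can plausibly exceed the per-process open-file limit; a bounded pool sidesteps that. Everything below is an assumption made for illustration and not part of the original code: the in-memory sqlite3 table standing in for SELLS, the names run_price, choose_pool_size and process_rows, and the io_cap ceiling of 64, which is an arbitrary value rather than a recommendation.

```python
import multiprocessing
import os
import sqlite3


def run_price(items, price):
    # Hypothetical stand-in for the real RunPrice: returns the message
    # instead of printing it, so the parent can collect the results.
    return "There are %s items, price is: %s" % (items, price)


def choose_pool_size(task_count, cpu_bound=True, io_cap=64):
    # For CPU-bound work, cap the pool at the number of cores.
    # io_cap is an arbitrary illustrative ceiling for I/O-bound work.
    limit = (os.cpu_count() or 1) if cpu_bound else io_cap
    # Pool() rejects a size of 0, so never go below one worker.
    return max(1, min(limit, task_count))


def process_rows(rows, cpu_bound=True):
    # Nothing to do for an empty result set; avoid creating a pool at all.
    if not rows:
        return []
    with multiprocessing.Pool(choose_pool_size(len(rows), cpu_bound)) as pool:
        # starmap unpacks each (items, price) tuple into run_price's
        # arguments and returns the results in input order.
        return pool.starmap(run_price, rows)


if __name__ == "__main__":
    # Throwaway in-memory table standing in for the real SELLS data.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE SELLS (items INTEGER, price REAL)")
    conn.executemany(
        "INSERT INTO SELLS VALUES (?, ?)",
        [(i, i * 1.5) for i in range(2000)],
    )
    rows = conn.execute("SELECT DISTINCT items, price FROM SELLS").fetchall()
    results = process_rows(rows)
    print(len(results))
```

The `if __name__ == "__main__":` guard matters on platforms where worker processes re-import the main module instead of forking; without it, the pool creation would be re-executed in every worker.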
