How can I replace a loop with multiprocessing/multithreading in Python?



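I currently have the following problem. I have a method_A() that loops over a given set of strings, set A. For each string, I have to call another method_B(), which in turn returns a set of strings B*. All the returned sets B* should then be merged into one new set, results, because the sets B* can contain duplicates of certain strings.

I now want to make my method_A() faster by using multiprocessing instead of a loop. In other words, I want to run method_B() for all strings in set A at the same time.

Here is a sample of my current code: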

# Method A that takes in a set of strings and returns the merged set of all sets B*
def method_A(set_A):
    # Initialize empty set to store results
    results = set()

    # Loop over each string in set A
    for string in set_A:
        # Execute method B
        set_B = method_B(string)

        # Merge set B into results set
        results = results.union(set_B)

    # Return the final results set
    return results


# Method B that takes in a string and returns a set of strings
def method_B(string):
    # Perform some operations on the string to generate a set of strings
    set_B = ...  # Generated set of strings (elided in the original)

    # Return the generated set
    return set_B

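I have never used multiprocessing before, but by googling my problem I found it as a possible way to make my script faster. I tried to implement it with the help of ChatGPT, but I keep running into the problem that either my results set is empty or the multiprocessing does not work at all. Maybe multithreading is a better fit for this case, but I am not sure. Overall, I just want to make my method_A faster, and I am open to any solution that does that!

I would be glad for any help!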

You can replace your for loop like this:

Add import concurrent.futures at the top of your script, then use:

with concurrent.futures.ProcessPoolExecutor() as executor:
    for set_B in executor.map(method_B, set_A):
        results = results.union(set_B)

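This creates a pool of subprocesses, each running its own Python interpreter.

executor.map(method_B, set_A) means: for each element in set_A, call method_B.

method_B is executed in the subprocesses, and multiple calls to method_B run in parallel.

Passing the values to a subprocess and getting the return values back is handled transparently by the executor.

More details can be found in the Python documentation: concurrent.futures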
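Putting the pieces together, a complete script might look like the sketch below. The body of method_B here is a hypothetical stand-in (it returns every non-empty prefix of the string), since the real implementation is not shown in the question. The `if __name__ == "__main__":` guard is there because, on platforms where worker processes are started by re-importing the main module, code that creates the pool should not run at import time.

```python
import concurrent.futures


def method_B(string):
    # Hypothetical stand-in for the real method_B:
    # returns every non-empty prefix of the string.
    return {string[:i] for i in range(1, len(string) + 1)}


def method_A(set_A):
    results = set()
    # Each call to method_B runs in one of the pool's worker processes.
    with concurrent.futures.ProcessPoolExecutor() as executor:
        for set_B in executor.map(method_B, set_A):
            # Merge in the parent process; duplicates collapse automatically.
            results.update(set_B)
    return results


if __name__ == "__main__":
    # Create the pool only when the file is run as a script.
    print(sorted(method_A({"ab", "cd"})))  # ['a', 'ab', 'c', 'cd']
```

Whether this is actually faster depends on how expensive method_B is: starting processes and shipping arguments and results between them has a cost, so very cheap calls may end up slower than the plain loop.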

To solve this with threads, it would look something like this:

from threading import Thread


# Method A that takes in a set of strings and returns the merged set of all sets B*
def method_A(set_A):
    # Initialize empty set to store results
    results = set()
    threads = []

    # Start new thread for each string in set A
    for string in set_A:
        t = Thread(target=method_B, args=(string, results))
        t.start()
        threads.append(t)

    # Wait for all threads to finish
    for t in threads:
        t.join()

    # Return the final results set
    return results


# Method B that takes in a string and adds its generated strings to the shared results set
def method_B(string, results):
    # Perform some operations on the string to generate a set of strings
    set_B = ...  # Generated set of strings (elided in the original)

    # Update the shared results set in place. Rebinding the local name with
    # results = results.union(set_B) would leave the caller's set unchanged.
    results.update(set_B)

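Note that a thread does not hand back the return value of its target function, so you can pass your results set to each thread and update it in place there. Hope this helps!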
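If you would rather keep method_B returning its set, one alternative (not the approach shown above, but a swap to the thread pool from concurrent.futures) is ThreadPoolExecutor, whose map() collects the return values for you. This is only a sketch: the body of method_B is a hypothetical stand-in that returns the lower-case and upper-case forms of the string.

```python
from concurrent.futures import ThreadPoolExecutor


def method_B(string):
    # Hypothetical stand-in for the real method_B:
    # returns the lower-case and upper-case variants of the string.
    return {string.lower(), string.upper()}


def method_A(set_A):
    results = set()
    # The executor runs method_B in worker threads and yields each return value.
    with ThreadPoolExecutor() as executor:
        for set_B in executor.map(method_B, set_A):
            # Merging happens in the calling thread only, so no lock is needed here.
            results.update(set_B)
    return results


if __name__ == "__main__":
    print(sorted(method_A({"ab", "Cd"})))  # ['AB', 'CD', 'ab', 'cd']
```

Keep in mind that in standard CPython builds the global interpreter lock lets only one thread execute Python bytecode at a time, so threads mostly help when method_B waits on I/O (network, disk). For CPU-heavy work, the process-based version is usually the better bet.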
