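I'm currently facing the following problem. I have a method_A() that loops over a given set A of strings. For each string, I have to execute another method_B(), which in turn returns a set B* of strings. All returned sets B* and the set A should be merged into a new set called results, since the sets B* can contain duplicates of some strings.
I now want to make my method_A() faster by using multiprocessing instead of a loop. That is, I want to execute method_B() for all strings in set A at the same time.
Here is an example of my current code: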
# Method A that takes in a set of strings and returns the merged set of all sets B*
def method_A(set_A):
    # Initialize empty set to store results
    results = set()
    # Loop over each string in set A
    for string in set_A:
        # Execute method B
        set_B = method_B(string)
        # Merge set B into results set
        results = results.union(set_B)
    # Return the final results set
    return results

# Method B that takes in a string and returns a set of strings
def method_B(string):
    # Perform some operations on the string to generate a set of strings
    set_B = set()  # placeholder: generate the set of strings here
    # Return the generated set
    return set_B
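I have never used multiprocessing before, but by googling my problem I found it as a possible way to make my script faster. I tried to implement it with the help of ChatGPT, but I keep running into the problem that either my results set is empty or multiprocessing doesn't work at all. Maybe multithreading is a better fit for this case, but I'm not sure. Overall, I want to make my method_A faster, and I'm open to any solution that does so!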
You can replace your for loop like this (add import concurrent.futures at the top of your script):
with concurrent.futures.ProcessPoolExecutor() as executor:
    for set_B in executor.map(method_B, set_A):
        results = results.union(set_B)
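This creates a pool of subprocesses, each running its own Python interpreter.
executor.map(method_B, set_A) means: for each element in set_A, execute method_B. method_B is executed in a subprocess, and multiple calls to method_B run in parallel.
Passing values to the subprocesses and getting the return values back is handled transparently by the executor (the values only need to be picklable, which strings and sets of strings are).
More details can be found in Python's documentation: concurrent.futures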
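For reference, here is a minimal end-to-end sketch of the process pool approach described above. The body of method_B is a hypothetical stand-in, since the real string operations aren't shown in the question. The `if __name__ == "__main__":` guard is my addition: without it, process pools can fail on platforms that start workers by re-importing the main module, which may be one reason the multiprocessing attempt "didn't work at all".

```python
import concurrent.futures


def method_B(string):
    # Hypothetical stand-in for the real work: the string itself,
    # its upper-case form, and its reverse.
    return {string, string.upper(), string[::-1]}


def method_A(set_A):
    results = set()
    # Each call to method_B runs in a worker process; map() hands the
    # returned sets back to the parent process one by one.
    with concurrent.futures.ProcessPoolExecutor() as executor:
        for set_B in executor.map(method_B, set_A):
            results.update(set_B)
    return results


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block when
    # they import the main module (relevant for the "spawn" start method).
    print(method_A({"ab", "cd"}))
```

If set_A is large and each method_B call is short, passing a chunksize argument to executor.map (for example chunksize=100) sends items to the workers in batches and can reduce the inter-process overhead. Whether this pays off depends on how expensive method_B really is, so it's worth timing both variants.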
To solve this with threads, it would look like this:
from threading import Thread

# Method A that takes in a set of strings and returns the merged set of all sets B*
def method_A(set_A):
    # Initialize empty set to store results
    results = set()
    threads = []
    # Start new thread for each string in set A
    for string in set_A:
        t = Thread(target=method_B, args=(string, results))
        t.start()
        threads.append(t)
    # Wait for all threads to finish
    for t in threads:
        t.join()
    # Return the final results set
    return results

# Method B that takes in a string and adds its generated strings to the shared results set
def method_B(string, results):
    # Perform some operations on the string to generate a set of strings
    set_B = set()  # placeholder: generate the set of strings here
    # Update the shared results set in place; results = results.union(set_B)
    # would only rebind the local name and leave the caller's set empty
    results.update(set_B)
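Note that a thread does not return the value of its target function, so you can pass your results set to each thread and modify it there. The set has to be modified in place (for example with results.update(set_B)); reassigning results inside method_B only rebinds the local name, and the set in method_A stays empty. Hope this helps!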
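As an alternative to passing a shared set around, here is a sketch that uses ThreadPoolExecutor from concurrent.futures, which does collect the return values for you. It assumes method_B keeps its original one-argument signature from the question, and the body of method_B is again a hypothetical stand-in. Keep in mind that in standard CPython, threads mostly help when method_B spends its time waiting (network, disk); for CPU-heavy string processing a process pool is typically the better bet.

```python
from concurrent.futures import ThreadPoolExecutor


def method_B(string):
    # Hypothetical stand-in for the real work: the string itself,
    # its upper-case form, and its reverse.
    return {string, string.upper(), string[::-1]}


def method_A(set_A):
    with ThreadPoolExecutor() as executor:
        # map() runs method_B in worker threads and yields each returned
        # set; union() merges them all and drops duplicates.
        return set().union(*executor.map(method_B, set_A))
```

Calling method_A({"ab", "ba"}) with this stand-in merges the two overlapping sets into a single set of four distinct strings, and nothing has to be mutated from inside the threads.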