How to run code in parallel with ThreadPoolExecutor

Hi, I'm really new to threading and it's confusing me. How can I run this code in parallel?

import requests
from concurrent.futures import ThreadPoolExecutor

def search_posts(page):
    page_url = f'https://jsonplaceholder.typicode.com/posts/{page}'
    req = requests.get(page_url)
    res = req.json()

    title = res['title']

    return title

page = 1
while True:
    with ThreadPoolExecutor() as executer:
        t = executer.submit(search_posts, page)
        title = t.result()
        print(title)
    if page == 20:
        break
    page += 1

Another question: do I need to learn about operating systems to understand how threading works?

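The problem here is that you create a new ThreadPoolExecutor for every page. Because result() is called right after submit(), each request also has to finish before the next one starts, so nothing overlaps. To run the requests in parallel, create a single ThreadPoolExecutor and use its map method: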

import concurrent.futures as cf
import requests

def search_posts(page):
    page_url = f'https://jsonplaceholder.typicode.com/posts/{page}'
    res = requests.get(page_url).json()
    return res['title']

if __name__ == '__main__':
    with cf.ThreadPoolExecutor() as ex:
        results = ex.map(search_posts, range(1, 21))
    for r in results:
        print(r)

Note that using the if __name__ == '__main__' wrapper is a good habit that makes the code more portable.
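If you would rather keep submit() as in the question, a rough sketch is to submit every page first and only then collect the results. The helper name fetch_all and its fetch parameter are made up for illustration; the lambda at the bottom is a stand-in worker so the sketch runs without network access.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fetch_all(fetch, pages):
    """Submit every page first, then collect results as they complete."""
    titles = {}
    with ThreadPoolExecutor() as ex:
        # Map each future back to the page it was submitted for
        futures = {ex.submit(fetch, page): page for page in pages}
        # as_completed yields futures in completion order, not submission order
        for fut in as_completed(futures):
            titles[futures[fut]] = fut.result()
    return titles

if __name__ == '__main__':
    # Stand-in worker (hypothetical) instead of a real HTTP request
    print(fetch_all(lambda page: f"title {page}", range(1, 4)))
```

With the real worker this would be called as fetch_all(search_posts, range(1, 21)). Unlike map, the results arrive in whatever order the requests finish, which is why they are keyed by page.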

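One thing to keep in mind when using threads: if you are using CPython (the Python implementation from python.org, which is the most common one), threads don't actually run in parallel.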

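To reduce the complexity of memory management, only one thread at a time can execute Python bytecode in CPython. This is enforced by the Global Interpreter Lock ("GIL") in CPython.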

The good news is that fetching web pages with requests will spend most of its time on network I/O. In general, the GIL is released during I/O.
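To see the effect of waiting threads overlapping, here is a minimal sketch that uses time.sleep as a stand-in for a network request (an assumption for illustration; like blocking I/O, time.sleep releases the GIL while it waits). The names fake_fetch and timed_fetch_all are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_fetch(page):
    # Stand-in for requests.get: the thread waits without holding the GIL
    time.sleep(0.2)
    return f"title {page}"

def timed_fetch_all(pages, max_workers):
    """Run fake_fetch over pages in a thread pool and time the whole batch."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=max_workers) as ex:
        titles = list(ex.map(fake_fetch, pages))
    return titles, time.perf_counter() - start

if __name__ == '__main__':
    titles, elapsed = timed_fetch_all(range(1, 6), max_workers=5)
    print(titles)
    print(f"5 waits of 0.2 s each finished in {elapsed:.2f} s")
```

Run one after another, the five waits would take about one second in total; with five worker threads the elapsed time should be close to a single wait.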

However, if you are doing computation in your worker function (i.e. executing Python bytecode), you should use a ProcessPoolExecutor instead.

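If you use a ProcessPoolExecutor and you are running on MS Windows, then the if __name__ == '__main__' wrapper is required, because in that case Python has to be able to import your main program without side effects.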
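For the CPU-bound case, a sketch along these lines might look as follows. The worker count_primes and the list of limits are invented for illustration; any pure-Python computation would do.

```python
from concurrent.futures import ProcessPoolExecutor

def count_primes(limit):
    """CPU-bound work: count the primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        # n is prime if no d in 2..sqrt(n) divides it
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

if __name__ == '__main__':
    # The guard keeps worker processes from re-running this part on import
    limits = [20_000, 30_000, 40_000, 50_000]
    with ProcessPoolExecutor() as ex:
        for limit, total in zip(limits, ex.map(count_primes, limits)):
            print(limit, total)
```

The worker is defined at module level so that it can be pickled and sent to the worker processes, and the pool is only created inside the guard.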
