Multithreaded Python program does not execute the thread function



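I am inexperienced with threading in Python and am trying to write some simple multithreaded programs to get more practice. I am trying to send requests to a predefined list of URLs.

When I run the program, it finishes immediately and prints "End", without exiting early or raising an exception. The print call placed in thread_function is never executed, and no error is raised.

Any help would be much appreciated.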

import networking
import threading
import concurrent.futures


class concurrencyTest:
    def __init__(self, URLlist):
        self.URLlist = URLlist
        self.resourceDict = {}

        self._urlListLock = threading.Lock()
        self._resourceListLock = threading.Lock()

    def sendMultiThreadedRequests(self, threadNum=3):
        self.resourceDict = {}
        with concurrent.futures.ThreadPoolExecutor(max_workers=threadNum) as executor:
            results = executor.map(self.thread_function)

    def thread_function(self):
        print("You are are in the thread_function")
        while True:
            with self._urlListLock:
                numOfRemainingURL = len(self.URLlist)
                print(numOfRemainingURL)
                if numOfRemainingURL == 0:
                    return
                urlToRequest = self.URLlist.pop()
            webpage = networking.getWebpage(urlToRequest)
            ##parse webpage or resource

            with self._resourceListLock:
                self.resourceDict[urlToRequest] = webpage

    def sendRegularRequests(self):
        self.resourceDict = {}
        for url in self.URLlist:
            resource = networking.getWebpage(url)
            self.resourceDict[url] = resource

    def updateURLpool(self):
        return "Not currently coded"


def main():
    #The real urlList is a lot larger than just 3 URLs
    urlList = ["www.google.com","www.stackoverflow.com","www.reddit.com"]
    parTest = concurrencyTest(urlList)
    parTest.sendMultiThreadedRequests()

    print("End")

main()

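executor.map() maps values onto function calls. It expects the function as its first argument, followed by at least one iterable (for example a list) whose items are passed to that function one at a time. The code in the question passes no iterable at all, so map() has nothing to iterate over and thread_function() is never called.

For example:

executor.map(self.thread_function, self.URLlist)

calls thread_function() once for each value in self.URLlist.

Note that if several iterables are passed, they are consumed in parallel, as with the built-in map(): each call receives one item from every iterable. Passing the URLs as separate string arguments, as in executor.map(self.thread_function, url1, url2, url3, ..., urln), would therefore not call the function once per URL.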
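The root cause can be reproduced in isolation. The following is a minimal sketch, not the asker's code: the worker `work` and the list `calls` are made-up names used only for illustration. It shows that map() with no iterable schedules zero calls, while passing a list schedules one call per item.

```python
import concurrent.futures

calls = []  # records every argument the worker was called with


def work(item=None):
    # Hypothetical worker; it only records that it ran.
    calls.append(item)
    return item


with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    # No iterable: there is nothing to map over, so work() is never called.
    empty = list(executor.map(work))
    calls_without_iterable = len(calls)

    # One iterable: work() is called once per item; results come back in input order.
    mapped = list(executor.map(work, ["a", "b", "c"]))

print(empty, calls_without_iterable, mapped)
```

The first map() call returns an empty result and raises no error, which matches the silent behaviour described in the question.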

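This in turn means that thread_function() has to accept a parameter in order to receive a value from the list: thread_function(self, url). Since the function now receives only one value of URLlist at a time, the while loop inside it no longer makes sense, and you have to refactor the function to handle a single URL instead of a list:

def thread_function(self, url):
    webpage = networking.getWebpage(url)
    # parse webpage or resource

    with self._resourceListLock:
        self.resourceDict[url] = webpage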
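Put together, the map()-based version of the class might look like the sketch below. It makes two assumptions: `fake_get_webpage` is a hypothetical stand-in for the asker's own `networking.getWebpage()` (so the example runs without network access), and the names are rewritten in snake_case purely for readability.

```python
import concurrent.futures
import threading


def fake_get_webpage(url):
    # Hypothetical stand-in for networking.getWebpage(); no real I/O happens here.
    return f"<html>{url}</html>"


class ConcurrencyTest:
    def __init__(self, url_list):
        self.url_list = url_list
        self.resource_dict = {}
        self._resource_lock = threading.Lock()

    def thread_function(self, url):
        # Handles exactly one URL; the executor decides which thread runs it.
        webpage = fake_get_webpage(url)
        with self._resource_lock:
            self.resource_dict[url] = webpage

    def send_multithreaded_requests(self, thread_num=3):
        self.resource_dict = {}
        with concurrent.futures.ThreadPoolExecutor(max_workers=thread_num) as executor:
            # Consuming the iterator re-raises any exception from a worker here
            # instead of letting it pass silently.
            list(executor.map(self.thread_function, self.url_list))


test = ConcurrencyTest(["www.google.com", "www.stackoverflow.com", "www.reddit.com"])
test.send_multithreaded_requests()
print(sorted(test.resource_dict))
```

The URL list no longer needs its own lock, because the executor hands out the items and the worker threads never touch the list directly.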

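Alternatively, you can use submit() instead of map(); its only purpose is to execute a function asynchronously. That way thread_function() does not need to be modified: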

executor.submit(self.thread_function)
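One detail worth keeping in mind: a single submit() call runs the function once, in one worker thread. To have several threads drain the shared list, the function can be submitted once per worker. The sketch below assumes a hypothetical `fake_get_webpage` in place of `networking.getWebpage()` and uses module-level names instead of the asker's class to stay short.

```python
import concurrent.futures
import threading


def fake_get_webpage(url):
    # Hypothetical stand-in for networking.getWebpage(); no real I/O happens here.
    return url.upper()


url_list = ["www.google.com", "www.stackoverflow.com", "www.reddit.com"]
results = {}
url_lock = threading.Lock()
result_lock = threading.Lock()


def thread_function():
    # Same shape as the original: each worker pops URLs until the list is empty.
    while True:
        with url_lock:
            if not url_list:
                return
            url = url_list.pop()
        page = fake_get_webpage(url)
        with result_lock:
            results[url] = page


thread_num = 3
with concurrent.futures.ThreadPoolExecutor(max_workers=thread_num) as executor:
    # One submission per worker thread, so up to thread_num loops run concurrently.
    futures = [executor.submit(thread_function) for _ in range(thread_num)]
    for future in concurrent.futures.as_completed(futures):
        future.result()  # re-raises any exception raised inside the worker

print(results)
```

Calling result() on each future matters: an exception raised inside a submitted function is stored on the future and only surfaces when the result is requested.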

If you want to use concurrent.futures

You never pass any iterable to .map(), so nothing gets executed. To simplify your code (you don't need any locks either):

import concurrent.futures
import random
import time
import hashlib

def get_data(url):
    print(f"Starting to get {url}")
    # to pretend doing some work:
    time.sleep(random.uniform(0.5, 1))
    result = hashlib.sha1(url.encode("utf-8")).hexdigest()
    print(f"OK: {url}")
    return (url, result)

url_list = ["www.google.com", "www.stackoverflow.com", "www.reddit.com"]
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    results = {}
    for key, value in executor.map(get_data, url_list):
        results[key] = value
        print(f"Results acquired: {len(results)}")
    # or more simply
    # results = dict(executor.map(get_data, url_list))
    print(results)

This prints, for example (the timing is random):

Starting to get www.google.com
Starting to get www.stackoverflow.com
Starting to get www.reddit.com
OK: www.google.com
Results acquired: 1
OK: www.stackoverflow.com
Results acquired: 2
OK: www.reddit.com
Results acquired: 3
{'www.google.com': 'd8b99f68b208b5453b391cb0c6c3d6a9824f3c3a', 'www.stackoverflow.com': '3954ca3139369180fff4ea3ae984b9a7871b540d', 'www.reddit.com': 'f420470addba27b8577bb40e02229e90af568d69'}

If you want to use multiprocessing

(using the same get_data function as above)

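from multiprocessing.pool import ThreadPool, Pool

# (choose between threads or processes)
# With Pool (processes), put this block under `if __name__ == "__main__":`
# on platforms that start workers by spawning a new interpreter.
with ThreadPool(3) as p:
    results = dict(p.imap_unordered(get_data, url_list))
    print(results)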
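Because imap_unordered() yields results in completion order rather than input order, each result has to carry its own key, which is why get_data returns a (url, result) tuple. A small sketch of the difference, using a hypothetical `get_length` worker in place of get_data so that it runs instantly:

```python
from multiprocessing.pool import ThreadPool


def get_length(url):
    # Hypothetical stand-in for get_data(): returns (input, result) so that
    # each result can be matched to its URL regardless of arrival order.
    return (url, len(url))


url_list = ["www.google.com", "www.stackoverflow.com", "www.reddit.com"]

with ThreadPool(3) as pool:
    # map() blocks until everything is done and returns a list in input order.
    ordered = pool.map(get_length, url_list)
    # imap_unordered() yields each result as soon as it is ready.
    unordered = list(pool.imap_unordered(get_length, url_list))

print(ordered)
print(dict(unordered) == dict(ordered))
```

Building a dict from the tuples gives the same mapping either way, so the unordered variant is a reasonable choice when only the final dictionary matters.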