How to increment a variable's value when multithreading in Python



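I am trying to build a multithreaded web scraper to make it faster. I want a counter to be incremented on every execution, but sometimes a value gets skipped or repeated.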

import threading
num = 0
def scan():
    while True:
        global num
        num += 1
        print(num)
        open('logs.txt','a').write(str(f'{num}\n'))
for x in range(500):
    threading.Thread(target=scan).start()

Result:

2
2
5
5
7
8
10
10
12
13
13
13
16
17
19
19
22
23
24
25
26
28
29
29
31
32
33
34

Expected result:

1
2
3
4
5
6
7
8
9
10

Since the variable num is a shared resource, you need to protect it with a lock:

num_lock = threading.Lock()

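Every time you want to update the shared variable, your thread needs to acquire the lock first. Once the lock is acquired, only that thread is allowed to update the value of num, and no other thread can do so while the current thread holds the lock.

Make sure you use a try-finally block when doing this, so that the lock is released even if the current thread fails to update the shared variable.

Something like this: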

num_lock.acquire()
try:
    num += 1
finally:
    num_lock.release()

Or, using with:

with num_lock:
    num += 1
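To see the lock at work end to end, here is a minimal, self-contained sketch. The names counter, counter_lock, add_many and run are made up for this illustration and are not part of the question's code; the thread and iteration counts are arbitrary.

```python
import threading

counter = 0                      # shared between all threads (hypothetical name)
counter_lock = threading.Lock()  # guards every update of counter

def add_many(times):
    """Increment the shared counter `times` times, one locked step at a time."""
    global counter
    for _ in range(times):
        with counter_lock:       # released automatically, even if an exception is raised
            counter += 1

def run(n_threads=8, times=10_000):
    """Start n_threads workers, wait for all of them, and return the final count."""
    global counter
    counter = 0
    workers = [threading.Thread(target=add_many, args=(times,))
               for _ in range(n_threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return counter

if __name__ == "__main__":
    print(run())  # 8 threads * 10_000 increments each
```

Because every increment happens while the lock is held, run() should always return n_threads * times, no matter how the threads are scheduled.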

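This looks like a race condition. You can use a lock so that only one thread can get a particular number. It also makes sense to use a lock for writing to the output file.

Here is an example with locks. Of course, you cannot guarantee the order in which the output is written, but each item should be written exactly once. In this example I added a limit of 10000 so that you can more easily check in your test code that everything eventually gets written. Otherwise, whenever you interrupt the program, it is hard to tell whether a number was skipped or whether a thread was just waiting for a lock to write its output.

my_num is not shared, so once you have claimed it inside the with num_lock section, you are free to release that lock (which protects the shared num) and then continue to use my_num outside of the with block, while other threads can acquire the lock to claim their own value. This keeps the time the lock is held to a minimum.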

import threading
num = 0
num_lock = threading.Lock()
file_lock = threading.Lock()
def scan():
    global num_lock, file_lock, num

    while True:
        with num_lock:
            # test the limit while holding the lock, so no thread can overshoot it
            if num >= 10000:
                break
            num += 1
            my_num = num
        # do whatever you want here using my_num
        # but do not touch num
        with file_lock:
            # the inner with closes the file before file_lock is released
            with open('logs.txt','a') as f:
                f.write(f'{my_num}\n')

threads = [threading.Thread(target=scan) for _ in range(500)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

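One important call besides threading.Lock:

  • Use join to make the parent thread wait for the threads it started to finish
  • Without it, the parent thread still races with the workers: it may read num before all of them have updated it

Suppose I want to use num after the threads have finished: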

import threading
lock, num = threading.Lock(), 0

def operation():
    global num
    print("Operation has started")
    with lock:
        num += 1

threads = [threading.Thread(target=operation) for x in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(num)

Without join, the result is inconsistent (9 was printed on one run, 10 on another):

Operation has started
Operation has started
Operation has started
Operation has started
Operation has startedOperation has started
Operation has started
Operation has started
Operation has started
Operation has started9

With join, the result is consistent:

Operation has started
Operation has started
Operation has started
Operation has started
Operation has started
Operation has started
Operation has started
Operation has started
Operation has started
Operation has started
10
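The inconsistency without join depends on timing, so it can be hard to reproduce. The sketch below makes it deterministic by holding the workers back with a threading.Event; the gate named go and the variables before_join and after_join are additions for this illustration only.

```python
import threading

lock, num = threading.Lock(), 0
go = threading.Event()    # hypothetical gate that holds the workers back

def operation():
    global num
    go.wait()             # block until the main thread opens the gate
    with lock:
        num += 1

threads = [threading.Thread(target=operation) for _ in range(10)]
for t in threads:
    t.start()

before_join = num         # workers are still blocked, so nothing has been added yet
go.set()                  # let the workers run
for t in threads:
    t.join()              # wait until every worker has finished
after_join = num

print(before_join, after_join)  # prints: 0 10
```

Reading num before the workers are done gives a stale value even though every update is locked; the lock protects the increment, while join is what makes the final read safe.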
