How to lock a dict key while its value is being generated



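I have been working on a small "mini project" in which I want to be able to "lock" a dict key whose value is currently being generated.

This is what I have written so far: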

import random
import threading
import time

test_dict = {}

random_proxies = [
    "http://test.io:12345",
    "http://test.io:123456",
    "http://test.io:1234567",
    "http://test.io:12345678"
]


def random_challenge():
    print("LOCKING PROXY!!!")
    time.sleep(random.randint(5, 20))


def do_proxies():
    while True:
        proxy = random.choice(random_proxies)
        # Mock data that it is hitting challenge / If a proxy is locked then retry with another proxy in the meanwhile
        if bool(random.getrandbits(1)):
            # Lock that proxy key for not being accessible to other threads
            random_challenge()
            test_dict[proxy] = {
                "value1": "blabla",
                "value2": "blablabla"
            }
            # Unlock the proxy to be accessible again
        print(test_dict)
        time.sleep(1)


if __name__ == '__main__':
    for _ in range(3):
        threading.Thread(target=do_proxies).start()

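The idea of the script is that 3 threads run all the time, as you can see at the bottom. Each thread keeps picking a proxy and checks whether that proxy is "being challenged", which is mocked by if bool(random.getrandbits(1)):. When that is true, the proxy is in a challenge that sleeps for 5 to 20 seconds. The problem is that, as written, another thread can hit the same proxy while it is still in its challenge.

What I am trying to achieve is that while a proxy is in a challenge, no other thread can access it, and a thread that picks a proxy already in a challenge tries another proxy instead. But I don't know how to build that "locking" functionality, and this is where I am stuck.

My question is: how can I lock a specific proxy while it is in a challenge and then unlock it when the challenge is done?

My idea is simple; it goes something like this: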

from threading import Thread, Lock
import time

list_op_proxy = ['proxy_1', 'proxy_2', 'proxy_3']
proxy_dict = dict(zip(list_op_proxy, ['available'] * len(list_op_proxy)))
lock = Lock()


def handler(func):
    def wrapper(name, lock):
        i = 0
        # Cycle through the proxies (busy-waiting) until one is available
        while True:
            each_proxy = list_op_proxy[i]
            # Check and claim the proxy under the lock so two threads cannot both take it
            with lock:
                available = proxy_dict[each_proxy] == 'available'
                if available:
                    proxy_dict[each_proxy] = 'busy'
            if available:
                print(f'acquired {each_proxy} by {name}')
                break
            i = (i + 1) % len(list_op_proxy)
        try:
            func(name, lock)
        finally:
            # Mark the proxy as available again, even if func raised
            with lock:
                proxy_dict[each_proxy] = 'available'

    return wrapper


@handler
def process(name, lock):
    print('called', name)
    time.sleep(4)
    print(f'{name} finished')


Thread(target=process, args=('Thread 1', lock)).start()
Thread(target=process, args=('Thread 2', lock)).start()
Thread(target=process, args=('Thread 3', lock)).start()
Thread(target=process, args=('Thread 4', lock)).start()
Thread(target=process, args=('Thread 5', lock)).start()

Output (lines from different threads interleave):

calledcalled Thread 1
acquired proxy_1 by Thread 1
Thread 2
acquired proxy_2 by Thread 2
called Thread 3
acquired proxy_3 by Thread 3
calledcalled Thread 5
Thread 4
Thread 3 finished
Thread 1 finished
acquired proxy_3 by Thread 5Thread 2 finished
acquired proxy_2 by Thread 4
Thread 5 finishedThread 4 finished
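
The same idea can also be expressed with one lock per proxy instead of a status dict guarded by a single lock. The following is only a minimal sketch under some assumptions: the proxy names, the helper names try_challenge and worker, the results dict, and the short sleep durations are all made up for illustration. It relies on Lock.acquire(blocking=False), which returns False immediately when the lock is already held, so a thread can move on to another proxy instead of waiting.

```python
import random
import threading
import time

# Hypothetical proxy names, for illustration only
PROXIES = ["proxy_1", "proxy_2", "proxy_3"]

# One lock per proxy: holding a proxy's lock means it is in a challenge
proxy_locks = {proxy: threading.Lock() for proxy in PROXIES}

results = {}                     # proxy -> value produced during the challenge
results_lock = threading.Lock()  # protects `results`


def try_challenge(proxy, worker_name, duration=0.05):
    """Run the challenge on `proxy` unless another thread already holds it.

    Returns True if this call did the work, False if the proxy was locked.
    """
    lock = proxy_locks[proxy]
    # Non-blocking acquire: returns False right away if the lock is taken
    if not lock.acquire(blocking=False):
        return False
    try:
        time.sleep(duration)  # stand-in for the real challenge
        with results_lock:
            results[proxy] = {"solved_by": worker_name}
        return True
    finally:
        lock.release()  # unlock even if the challenge raises


def worker(name, rounds=5):
    for _ in range(rounds):
        # Try the proxies in random order and stop at the first free one
        for proxy in random.sample(PROXIES, len(PROXIES)):
            if try_challenge(proxy, name):
                break
        else:
            time.sleep(0.01)  # every proxy was busy; wait briefly and retry


if __name__ == "__main__":
    threads = [threading.Thread(target=worker, args=(f"Thread {i}",)) for i in range(3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(results)
```

Compared with the status dict, this keeps the "is it busy?" check and the "mark it busy" step in a single atomic call, and the try/finally makes sure a proxy is not left locked if the challenge fails.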

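If the number of threads and the number of proxies are not the same, you can shuffle the proxies first and then hand them out in turn, without storing their state: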

import random
import threading
import time
from itertools import cycle

random_proxies = [
    "http://test.io:12345",
    "http://test.io:123456",
    "http://test.io:1234567",
    "http://test.io:12345678"
]

lck = threading.Lock()
random.shuffle(random_proxies)
proxy_list = cycle(random_proxies)


def do_proxies(threadNum):
    # The lock ensures only one thread at a time advances the shared iterator
    with lck:
        proxy = next(proxy_list)
        print(f"Thread {threadNum} got proxy: {proxy}")
    time.sleep(1)


if __name__ == '__main__':
    THREADS = 3
    for x in range(99):
        print(f"ROUND: {x+1}")
        threads = [threading.Thread(target=do_proxies, args=(i, )) for i in range(THREADS)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

Output:

ROUND: 1
Thread 0 got proxy: http://test.io:1234567
Thread 1 got proxy: http://test.io:12345
Thread 2 got proxy: http://test.io:123456
ROUND: 2
Thread 0 got proxy: http://test.io:12345678
Thread 1 got proxy: http://test.io:1234567
Thread 2 got proxy: http://test.io:12345
ROUND: 3
Thread 0 got proxy: http://test.io:123456
Thread 1 got proxy: http://test.io:12345678
Thread 2 got proxy: http://test.io:1234567
...
