How do I read and write files in a multithreaded scraping process in Python? Getting WindowsError 32



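I am trying to scrape links using parameters read from a text file and write the results to a CSV file. When I try to do this with multiple threads, I get the following error: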

WindowsError: [Error 32] The process cannot access the file because it is being used by   another process:    
'c:\users\appdata\local\temp\tmpqseulj.webdriver.xpi\components\wdIStatus.xpt'

Please help me resolve this. The code is below.

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.support.ui import Select
from selenium.common.exceptions import NoSuchElementException
import unittest, time, re
from threading import Thread
import urlparse
import urllib2
import sys
import csv
import operator
reload(sys)
sys.setdefaultencoding("utf8")

# Keep the CSV file open for the whole run: the worker threads write to it
# later, so it must not be closed before they start.
f = open(r"C:\Test2.csv", "wb")
fieldnames = ("SearchQuery", "Title")
output = csv.writer(f, delimiter=",")
output.writerow(fieldnames)

def th(ur):
    # Each thread starts its own Firefox instance and runs one search query.
    driver = webdriver.Firefox()
    driver.get("https://www.google.com/search?q=" + ur)
    time.sleep(20)
    html_source = driver.page_source
    regex = '<span class="label">(.*?)</span>'
    pattern = re.compile(regex)
    Cluster = re.findall(pattern, html_source)
    Cluster = [H.replace("All Topics", "") for H in Cluster]
    Cluster = [H.replace("Other topics", "") for H in Cluster]
    # Drop entries that are empty after stripping whitespace.
    Cluster = filter(operator.methodcaller('strip'), Cluster)
    print ur, str(Cluster)
    # Assumption: the cleaned Cluster list is the value for the Title column.
    output.writerow([ur, Cluster])
    driver.close()

Symbolfile = open("Result.txt")
Symbollist = Symbolfile.read()
Symbolfile.close()
new = Symbollist.split("\n")                # one search query per line

threadlist = []
for u in new:                               # thread implementation
    t = Thread(target=th, args=(u,))
    t.start()
    threadlist.append(t)
for b in threadlist:
    b.join()
f.close()                                   # all threads are done writing

If multiple threads write to the same file, you need to use a lock so that only one thread writes at a time.
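
As a rough illustration of that advice, here is a minimal sketch in Python 3 syntax (the question's code is Python 2). The names `csv_lock`, `print_lock`, `scrape`, `worker` and `run` are made up for this example, and `scrape` is only a placeholder for the Selenium step. It shows the locking pattern only and is not a fix for the WebDriver temp-file error itself.

```python
import csv
import io
import threading

# Hypothetical names: one lock per shared resource.
csv_lock = threading.Lock()
print_lock = threading.Lock()

def scrape(query):
    # Placeholder for the Selenium step; returns made-up titles for illustration.
    return ["title for " + query]

def worker(writer, query):
    titles = scrape(query)
    with print_lock:   # console output is shared between threads too
        print(query, titles)
    with csv_lock:     # only one thread may call writerow at a time
        writer.writerow([query, "; ".join(titles)])

def run(queries, out_file):
    writer = csv.writer(out_file, delimiter=",")
    writer.writerow(["SearchQuery", "Title"])  # header, written before any thread starts
    threads = [threading.Thread(target=worker, args=(writer, q)) for q in queries]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # out_file must stay open until every thread has finished

if __name__ == "__main__":
    buffer = io.StringIO()  # in-memory stand-in for the CSV file
    run(["alpha", "beta"], buffer)
    print(buffer.getvalue(), end="")
```

With a real file, the call to `run` would sit inside something like `with open(path, "w", newline="") as f:` so that the file is closed only after all threads have been joined. The row order in the output depends on thread scheduling.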

The following article looks like a reasonable worked example: http://www.laurentluce.com/posts/python-threads-synchronization-locks-rlocks-semaphores-conditions-events-and-queues/

Also, `print` is not thread-safe either.