PyCurl request hangs infinitely on perform()



I have written a script that fetches scan results from Qualys, to be run weekly to gather metrics.

The first part of the script retrieves a list of references for each scan that was run in the past week, for further processing.

The problem is that while this works fine sometimes, other times the script hangs on the c.perform() line. This is manageable when the script is run manually, since it can simply be re-run until it works. However, I want to run this as a weekly scheduled task without any manual interaction.

Is there a foolproof way to detect that a hang has occurred and re-send the PyCurl request until it works?

I have tried setting the c.TIMEOUT and c.CONNECTTIMEOUT options, but these don't seem to have any effect. Also, since no exception is thrown, simply wrapping the call in a try/except block won't work either.

The function in question is below:

# Retrieve a list of all scans conducted in the past week
# Save this to refs_raw.txt
def getScanRefs(usr, pwd):
    print("getting scan references...")
    with open('refs_raw.txt', 'wb') as refsraw:
        today = DT.date.today()
        week_ago = today - DT.timedelta(days=7)
        strtoday = str(today)
        strweek_ago = str(week_ago)
        c = pycurl.Curl()
        c.setopt(c.URL, 'https://qualysapi.qualys.eu/api/2.0/fo/scan/?action=list&launched_after_datetime=' + strweek_ago + '&launched_before_datetime=' + strtoday)
        c.setopt(c.HTTPHEADER, ['X-Requested-With: pycurl', 'Content-Type: text/xml'])
        c.setopt(c.USERPWD, usr + ':' + pwd)
        c.setopt(c.POST, 1)
        c.setopt(c.PROXY, 'companyproxy.net:8080')
        c.setopt(c.CAINFO, certifi.where())
        c.setopt(c.SSL_VERIFYPEER, 0)
        c.setopt(c.SSL_VERIFYHOST, 0)
        c.setopt(c.CONNECTTIMEOUT, 3)
        c.setopt(c.TIMEOUT, 3)
        refsbuffer = BytesIO()
        c.setopt(c.WRITEDATA, refsbuffer)
        c.perform()
        body = refsbuffer.getvalue()
        refsraw.write(body)
        c.close()
    print("Got em!")

I solved this myself by using multiprocessing to launch the API call in a separate process, killing and restarting it if it runs for longer than 5 seconds. It's not pretty, but it is cross-platform. For those looking for a more elegant, but *nix-only, solution, look into the signal library, specifically SIGALRM.

Code below:

# As this request for scan references sometimes hangs it will be run in a separate thread here
# This will be terminated and relaunched if no response is received within 5 seconds
def performRequest(usr, pwd):
    today = DT.date.today()
    week_ago = today - DT.timedelta(days=7)
    strtoday = str(today)
    strweek_ago = str(week_ago)
    c = pycurl.Curl()
    c.setopt(c.URL, 'https://qualysapi.qualys.eu/api/2.0/fo/scan/?action=list&launched_after_datetime=' + strweek_ago + '&launched_before_datetime=' + strtoday)
    c.setopt(c.HTTPHEADER, ['X-Requested-With: pycurl', 'Content-Type: text/xml'])
    c.setopt(c.USERPWD, usr + ':' + pwd)
    c.setopt(c.POST, 1)
    c.setopt(c.PROXY, 'companyproxy.net:8080')
    c.setopt(c.CAINFO, certifi.where())
    c.setopt(c.SSL_VERIFYPEER, 0)
    c.setopt(c.SSL_VERIFYHOST, 0)
    refsBuffer = BytesIO()
    c.setopt(c.WRITEDATA, refsBuffer)
    c.perform()
    c.close()
    body = refsBuffer.getvalue()
    refsraw = open('refs_raw.txt', 'wb')
    refsraw.write(body)
    refsraw.close()

# Retrieve a list of all scans conducted in the past week
# Save this to refs_raw.txt
def getScanRefs(usr, pwd):
    print("Getting scan references...")
    # Occasionally the request will hang infinitely. Launch in separate method and retry if no response in 5 seconds
    success = False
    while success != True:
        sendRequest = multiprocessing.Process(target=performRequest, args=(usr, pwd))
        sendRequest.start()
        for seconds in range(5):
            print("...")
            time.sleep(1)
        if sendRequest.is_alive():
            print("Maximum allocated time reached... Resending request")
            sendRequest.terminate()
            del sendRequest
        else:
            success = True
            print("Got em!")

This question is old, but I will add this answer; it might help someone.

The only way to terminate a running curl after executing "perform()" is by using callbacks:

1 - Using CURLOPT_WRITEFUNCTION: as stated in the documentation:

Your callback should return the number of bytes actually taken care of. If that amount differs from the amount passed to your callback function, it'll signal an error condition to the library. This will cause the transfer to get aborted and the libcurl function used will return CURLE_WRITE_ERROR.

The drawback of this method is that curl calls the write function only when it receives new data from the server, so if the server stops sending data, curl will just keep waiting on the server side and will never receive your kill signal.

2 - The better option by far is using the progress callback:

The beauty of the progress callback is that curl will call it at least once per second even if there is no data coming from the server, which gives you the opportunity to return a non-zero value as a kill switch for curl.

Use the option CURLOPT_XFERINFOFUNCTION; note that it is better than using CURLOPT_PROGRESSFUNCTION, as quoted in the documentation:

We encourage users to use the newer CURLOPT_XFERINFOFUNCTION if you can.

You also need to set the option CURLOPT_NOPROGRESS:

CURLOPT_NOPROGRESS must be set to 0 to make this function actually get called.

Here is an example showing write and progress function implementations in Python:

# example of using write and progress callbacks to terminate curl
import pycurl

f = open('mynewfile', 'wb')  # used to save downloaded data
counter = 0

# define callback functions which will be used by curl
def my_write_func(data):
    """write to file"""
    global counter
    f.write(data)
    counter += len(data)
    # an example of terminating curl: tell curl to abort if the downloaded data
    # exceeds 1024 bytes, by returning -1 or any number not equal to len(data)
    if counter >= 1024:
        return -1

def progress(*data):
    """receives progress from curl and can be used as a kill switch
    Returning a non-zero value from this callback will cause curl to abort the transfer
    """
    d_size, downloaded, u_size, uploaded = data
    # an example of terminating curl: tell curl to abort if the downloaded data
    # exceeds 1024 bytes, by returning a non-zero value
    if downloaded >= 1024:
        return -1

# initialize curl object and options
c = pycurl.Curl()
# callback options
c.setopt(pycurl.WRITEFUNCTION, my_write_func)
c.setopt(pycurl.NOPROGRESS, 0)  # required to make the progress function get called
c.setopt(pycurl.XFERINFOFUNCTION, progress)
# c.setopt(pycurl.PROGRESSFUNCTION, progress)  # works too, but pycurl.XFERINFOFUNCTION is recommended
# put other curl options as required
# execute curl
c.perform()
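The example above aborts on a byte count, but the same progress-callback mechanism can solve the original hang-on-perform() problem by comparing elapsed wall-clock time instead. A small sketch, assuming a 5-second budget to mirror the accepted answer (make_deadline_callback is an illustrative name):

```python
# Build a time-based kill switch for libcurl's progress callback.
import time


def make_deadline_callback(max_seconds):
    """Return an XFERINFOFUNCTION-style callback that aborts the transfer
    once max_seconds of wall-clock time have passed, even if the server
    has gone completely silent (libcurl invokes it roughly once per second)."""
    start = time.monotonic()

    def progress(download_total, downloaded, upload_total, uploaded):
        # A non-zero return makes libcurl abort with CURLE_ABORTED_BY_CALLBACK,
        # which surfaces as a pycurl.error exception you can catch and retry on.
        if time.monotonic() - start > max_seconds:
            return 1
        return 0

    return progress


# Wiring it into the curl handle from the question (assumes the pycurl setup above):
#   c.setopt(pycurl.NOPROGRESS, 0)  # required, or the callback never fires
#   c.setopt(pycurl.XFERINFOFUNCTION, make_deadline_callback(5))
#   try:
#       c.perform()
#   except pycurl.error:
#       pass  # timed out or failed; safe to retry here
```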
