Problem using Scrapy with Google Cloud Storage as a feed export



Following the Scrapy docs, I'm using GCS as a feed export for Scrapy. The odd thing is that it does work some of the time.
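For context, a GCS feed export in Scrapy is usually configured along the lines below. This is a minimal sketch, not the actual settings from this project: the bucket name, object path and project ID are placeholders, and it assumes a Scrapy version that ships GCS feed storage with the google-cloud-storage package installed.

```python
# settings.py (sketch; every value below is a placeholder, not the real project config)

# Feed URI -> options. A "gs://" URI selects the Google Cloud Storage backend.
# %(name)s and %(time)s are filled in by Scrapy with the spider name and a timestamp.
FEEDS = {
    "gs://example-bucket/exports/%(name)s_%(time)s.csv": {
        "format": "csv",
    },
}

# Project that owns the bucket (hypothetical ID).
GCS_PROJECT_ID = "example-project-id"
```

Credentials are picked up by the Google client library in the usual way, for example through the GOOGLE_APPLICATION_CREDENTIALS environment variable.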

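But other times it fails on upload, and the only difference I can see is that it is trying to upload more data. That said, it still fails on uploads of only about 60 MB, which makes me doubt that the size of the data is really the problem. Can anyone tell me whether this is an issue with my configuration or with Scrapy itself? The error report is below: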

2020-12-01 23:07:26 [scrapy.extensions.feedexport] ERROR: Error storing csv feed (19826 items) in: gs://instoxi_amazon/com/Ngolo/Amazon_Beauty_&_Personal_Care_Ngolo.csv
Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\connectionpool.py", line 354, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "C:\ProgramData\Anaconda3\lib\http\client.py", line 1244, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "C:\ProgramData\Anaconda3\lib\http\client.py", line 1290, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "C:\ProgramData\Anaconda3\lib\http\client.py", line 1239, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "C:\ProgramData\Anaconda3\lib\http\client.py", line 1065, in _send_output
    self.send(chunk)
  File "C:\ProgramData\Anaconda3\lib\http\client.py", line 987, in send
    self.sock.sendall(data)
  File "C:\ProgramData\Anaconda3\lib\ssl.py", line 1034, in sendall
    v = self.send(byte_view[count:])
  File "C:\ProgramData\Anaconda3\lib\ssl.py", line 1003, in send
    return self._sslobj.write(data)
ssl.SSLWantWriteError: The operation did not complete (write) (_ssl.c:2361)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\ProgramData\Anaconda3\lib\site-packages\requests\adapters.py", line 449, in send
    timeout=timeout
  File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "C:\ProgramData\Anaconda3\lib\site-packages\urllib3\util\retry.py", line 399, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='storage.googleapis.com', port=443): Max retries exceeded with url: /upload/storage/v1/b/instoxi_amazon/o?uploadType=resumable&upload_id=ABg5-Uwjc9Vs5HdgyQdhTTm0ph3N_xQIoZaAE44Oiv2MdMO6q-YhD31eRkWO6W7UNAlehUKm4FTgVv0KXq32SHmCrDU (Caused by SSLError(SSLWantWriteError(3, 'The operation did not complete (write) (_ssl.c:2361)')))

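This is my first question, so please let me know if there is a better way to ask or present it. To clarify, I have no trouble interacting with GCS from Python outside of Scrapy. Cheers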

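I have seen "The operation did not complete (write) (_ssl.c:2361)" before, and in that case it was caused by a network problem. That would also fit the fact that the failure is intermittent for you. If you can, I'd suggest trying a different network connection to the internet.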

That said, I'd also recommend making sure you are running the latest version of Scrapy.
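If switching networks isn't practical, one possible workaround is to export the feed to a local file and upload it yourself with retries. The sketch below is an assumption on my part, not something Scrapy does for you: retry_with_backoff and upload_feed are hypothetical helper names, the 5 MiB chunk size is an arbitrary choice (it has to be a multiple of 256 KiB), and catching every Exception is only for illustration.

```python
import time


def retry_with_backoff(operation, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out.

    Waits base_delay, 2*base_delay, 4*base_delay, ... between tries.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except Exception:  # illustration only; narrow this to network errors in real code
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))


def upload_feed(local_path, bucket_name, blob_name, chunk_mib=5):
    """Upload a locally exported feed file to GCS, retrying on failure."""
    # Imported here so the retry helper works without the GCS client installed.
    from google.cloud import storage

    client = storage.Client()
    # Setting chunk_size makes the client send the file in smaller pieces.
    blob = client.bucket(bucket_name).blob(
        blob_name, chunk_size=chunk_mib * 1024 * 1024
    )
    retry_with_backoff(lambda: blob.upload_from_filename(local_path))
```

You could call upload_feed("output.csv", "example-bucket", "exports/output.csv") after the crawl finishes; the file and bucket names here are placeholders. Whether smaller chunks actually avoid the SSL write error depends on what is wrong with the connection, so treat this as something to try rather than a fix.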
