request.get() is getting stuck



您好,我正在尝试从网站上抓取一些数据,并且request.get()正在陷入困境。这是我的代码:

page_url = front_end+str(i)+'/'
page = requests.get(page_url)

所以我希望它是一个字符串,因为我只是输入一个 url,如果我停止代码或它运行太长,我会得到类似的东西:

File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", 
line 377, in _make_request
    httplib_response = conn.getresponse(buffering=True)
TypeError: getresponse() got an unexpected keyword argument 'buffering'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "main.py", line 24, in <module>
    page = requests.get(page_url)
  File "/usr/local/lib/python3.6/site-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 380, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/local/lib/python3.6/http/client.py", line 1331, in getresponse
    response.begin()
  File "/usr/local/lib/python3.6/http/client.py", line 297, inbegin
    version, status, reason = self._read_status()
  File "/usr/local/lib/python3.6/http/client.py", line 258, in_read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/local/lib/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
  File "/usr/local/lib/python3.6/ssl.py", line 1002, in recv_into
    return self.read(nbytes, buffer)
  File "/usr/local/lib/python3.6/ssl.py", line 865, in read
    return self._sslobj.read(len, buffer)
  File "/usr/local/lib/python3.6/ssl.py", line 625, in read
    v = self._sslobj.read(len, buffer)

我不明白TypeError: getresponse() got an unexpected keyword argument 'buffering'是什么意思或如何解决它。

答案:

有时来自requests.get()的请求会被服务器阻止,所以解决方案是让服务器认为请求来自Web浏览器。例:

 headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/68.0.3440.106 Safari/537.36',}
 page = requests.get("https://example.com", headers=headers)

最新更新