HTTP Error 508: Loop Detected with Python urllib.request

I am scraping a website with the code below. The first two runs worked, but the third run failed with this error:

HTTP Error 508: Loop Detected

from urllib.request import Request, urlopen
from bs4 import BeautifulSoup

# Fetch the index page with a browser-like User-Agent.
req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
webpage = urlopen(req).read()
soup = BeautifulSoup(webpage)
liList = soup.find('div', attrs={'class': 'columns-list'})
links = []
for a in liList.find_all('a'):
    # Fetch each linked page and print its headings.
    req = Request(a.attrs['href'], headers={'User-Agent': 'Mozilla/5.0'})
    webpage = urlopen(req).read()
    data = BeautifulSoup(webpage)
    h = data.find("div", attrs={'class': 'first-h2'})
    print(h.h2.text)
    print(data.find("h5"))

How can I prevent this? Sometimes it works, and sometimes it fails with this error.

My guess is that this is a kind of "internal server error": it indicates that the server got stuck in a loop, as described here: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/508

It indicates that the server terminated an operation because it encountered
an infinite loop while processing a request with "Depth: infinity". This
status indicates that the entire operation failed.

So this is a server-side error, not a problem in your code.
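
Since the error is intermittent and comes from the server, one thing you could try on the client side is to pause and retry when a 508 comes back. This is only a minimal sketch, assuming the server recovers after a short delay, which the question does not confirm. The helper name `fetch_with_retry` and its parameters (`retries`, `delay`, `opener`, `sleep`) are hypothetical and not part of the original code.

```python
import time
from urllib.error import HTTPError
from urllib.request import Request, urlopen


def fetch_with_retry(url, retries=3, delay=2.0, opener=urlopen, sleep=time.sleep):
    """Fetch url, retrying on HTTP 508 with a doubling pause (hypothetical helper)."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    req = Request(url, headers={'User-Agent': 'Mozilla/5.0'})
    for attempt in range(retries):
        try:
            return opener(req).read()
        except HTTPError as err:
            # Retry only on 508; re-raise other errors or the final failure.
            if err.code != 508 or attempt == retries - 1:
                raise
            # Wait delay, 2*delay, 4*delay, ... before the next attempt.
            sleep(delay * (2 ** attempt))
```

In the loop from the question, `webpage = urlopen(req).read()` could then become `webpage = fetch_with_retry(a.attrs['href'])`. If the server keeps returning 508 after every attempt, the last `HTTPError` is raised unchanged, so the failure is still visible.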
