The site exists, but requests.head/get times out



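I wrote a Python script to check whether a website exists. Everything works, except that the request times out when I check http://www.dhl.com. I have tried both the GET and HEAD methods. I also checked the DHL site with https://httpstatus.io/ and https://app.urlcheckr.com/, and both report an error. The DHL site definitely exists! Here is my code: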

import requests

a = 'http://www.dhl.com'

def check(url):
    try:
        header = {'User-Agent': 'Mozilla/5.0 (X11; CrOS x86_64 8172.45.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.64 Safari/537.36'}
        request = requests.head(url, headers=header, timeout=60)
        code = request.status_code
        if code < 400:
            return "Exist", str(code)
        else:
            return "Not exist", str(code)
    except Exception as e:
        return "Not Exist", str(type(e).__name__)

print(check(a))

How can I fix this error?

Testing with curl shows that you need to send a few additional headers to the DHL site:

import requests
url = 'http://www.dhl.com'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9,fil;q=0.8',
}
request = requests.head(url, headers=headers, timeout=60, allow_redirects=True)
print(request.status_code, request.reason)
print(request.history)

Without these headers, curl never gets a response.
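
To tie this back to the question, here is a minimal sketch of the original `check` function with the extra headers folded in. The `BROWSER_HEADERS` name and the optional `session` parameter are my own additions for illustration, not part of either snippet above, and I can't promise the DHL site will keep accepting this particular set of headers.

```python
import requests

# Browser-like headers copied from the answer above.
BROWSER_HEADERS = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.135 Safari/537.36',
    'Accept-Encoding': 'gzip, deflate, br',
    'Accept-Language': 'en-US,en;q=0.9,fil;q=0.8',
}


def check(url, session=None, timeout=60):
    """Return ("Exist" or "Not Exist", detail) for the given URL.

    `session` is a hypothetical hook: pass a requests.Session to reuse
    connections; by default the module-level requests.head is used.
    """
    http = session if session is not None else requests
    try:
        response = http.head(
            url,
            headers=BROWSER_HEADERS,
            timeout=timeout,
            allow_redirects=True,  # follow redirects to the final page
        )
    except requests.exceptions.RequestException as e:
        # Timeouts, connection errors, too many redirects, and so on.
        return "Not Exist", type(e).__name__
    code = response.status_code
    return ("Exist" if code < 400 else "Not Exist"), str(code)
```

Calling `print(check('http://www.dhl.com'))` then works the same way as in the question. Catching `requests.exceptions.RequestException` instead of a bare `Exception` keeps programming errors from being reported as "Not Exist".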
