我正试图使用urllib for Python来打开URL,但我在一个特定的URL上遇到了一个错误,与其他URL相比,这个错误看起来很正常,在浏览器中也能很好地工作。
产生错误的代码是:
import cStringIO
import imghdr
import urllib2
response = urllib2.urlopen('https://news.artnet.com/wp-content/news-upload/2015/08/Brad_Pitt_Fury_2014-e1440597554269.jpg')
然而,如果我用类似的URL做完全相同的事情,我不会得到错误:
import cStringIO
import imghdr
import urllib2
response = urllib2.urlopen('https://upload.wikimedia.org/wikipedia/commons/d/d4/Brad_Pitt_June_2014_(cropped).jpg')
我在第一个例子中得到的错误是:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 154, in urlopen
return opener.open(url, data, timeout)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 431, in open
response = self._open(req, data)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 449, in _open
'_open', req)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1240, in https_open
context=self._context)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/urllib2.py", line 1197, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error [SSL: TLSV1_ALERT_INTERNAL_ERROR] tlsv1 alert internal error (_ssl.c:590)>
这是一个使用Cloudflare SSL的站点,它需要服务器名称指示(SNI)。如果没有SNI访问此网站,将显示您在此处看到的行为,即触发tlsv1警报。SNI仅在2.7.9版本中添加到Python 2.7中,您可能使用的是旧版本的Python。
我在urllib3文档中找到了一个非常好的解释。
以下代码解决了Python 2.7.10的问题:
import urllib3
import ssl
import certifi
urllib3.contrib.pyopenssl.inject_into_urllib3()
# Open connection
http = urllib3.PoolManager(
cert_reqs='CERT_REQUIRED', # Force certificate check.
ca_certs=certifi.where(), # Path to the Certifi bundle.
)
# Make verified HTTPS requests.
try:
resp = http.request('GET', url_photo)
except urllib3.exceptions.SSLError as e:
# Handle incorrect certificate error.
print "error with https certificates"