Getting the HTML and the headers from a single request in Python

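I'm looking into whether a single HTTP request made from Python can retrieve both the HTML and the HTTP header information, instead of having to make 2 separate calls.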

Does anyone know of a good way to do this?

Also, what are the performance differences between the various ways of making these requests, e.g. urllib2, httpconnection, and so on?

Just use urllib2.urlopen(). The HTML can be retrieved by calling the read() method of the returned object, and the headers are available in its headers attribute.

>>> import urllib2
>>> f = urllib2.urlopen('http://www.google.com')
>>> print f.headers
Date: Fri, 08 Jun 2012 12:57:25 GMT
Expires: -1
Cache-Control: private, max-age=0
Content-Type: text/html; charset=ISO-8859-1
Server: gws
X-XSS-Protection: 1; mode=block
X-Frame-Options: SAMEORIGIN
Connection: close
>>> print f.read()
<!doctype html><html itemscope itemtype="http://schema.org/WebPage"><head><meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
... etc ...
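
On Python 3, urllib2 no longer exists; its functionality moved to urllib.request. A minimal sketch of the same single-request approach, assuming Python 3, might look like the following. The helper name fetch_headers_and_body and the timeout default are illustrative choices, not something from the session above.

```python
# Python 3 sketch: urllib2 was split into urllib.request and urllib.error.
from urllib.request import urlopen


def fetch_headers_and_body(url, timeout=10):
    """Return (headers, body) obtained from a single GET request.

    fetch_headers_and_body is a hypothetical helper name, not part of urllib.
    """
    # urlopen() sends exactly one request; the response object carries both
    # the parsed headers and a file-like stream for the body.
    with urlopen(url, timeout=timeout) as response:
        # Collapse headers into a dict; if a header name repeats,
        # only the last value is kept.
        headers = dict(response.headers.items())
        body = response.read()  # raw bytes, not decoded
    return headers, body
```

Calling fetch_headers_and_body('http://www.google.com') would hand back the header dict and the raw body bytes from one round trip; decoding the body with the charset named in Content-Type is left to the caller.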

If you use HTTPResponse, you can get the headers and the content with two function calls, but that does not make two trips to the server.
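
The two-call pattern can be sketched with Python 3's http.client (the module named httplib in Python 2). This is an illustration under assumptions: fetch_with_http_client is a hypothetical helper, and the port and timeout defaults are arbitrary.

```python
import http.client


def fetch_with_http_client(host, port=80, path="/", timeout=10):
    """Hypothetical helper: one GET via http.client.

    Returns (status, headers, body).
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)        # the only request sent to the server
        response = conn.getresponse()    # HTTPResponse for that request
        headers = response.getheaders()  # call 1: list of (name, value) tuples
        body = response.read()           # call 2: body bytes, same response
        return response.status, headers, body
    finally:
        conn.close()
```

Both getheaders() and read() operate on the response that has already been received for the one request, so nothing further is sent to the server.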
