How to connect to an HTTPS proxy (secure web proxy) in pycurl



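I am trying to connect to a secure web proxy using pycurl. When setting the proxy type, the following options are available; each corresponds to a curl proxy option (shown in parentheses):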

- "PROXYTYPE_HTTP" (CURLPROXY_HTTP)
- "PROXYTYPE_HTTP_1_0" (CURLPROXY_HTTP_1_0)
- "PROXYTYPE_SOCKS4" (CURLPROXY_SOCKS4)
- "PROXYTYPE_SOCKS4A" (CURLPROXY_SOCKS4A)
- "PROXYTYPE_SOCKS5" (CURLPROXY_SOCKS5)
- "PROXYTYPE_SOCKS5_HOSTNAME" (CURLPROXY_SOCKS5_HOSTNAME)

However, the curl documentation also describes an option named CURLPROXY_HTTPS, and pycurl does not appear to expose it.
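One way to check which proxy-type constants a given pycurl build actually exposes is to list its attributes by prefix. The sketch below is only an illustration: `proxy_type_names` is a hypothetical helper, and the `fake` object stands in for the real module so the snippet runs without pycurl installed.

```python
import types


def proxy_type_names(module):
    """Return the sorted PROXYTYPE_* attribute names found on a module-like object."""
    return sorted(name for name in dir(module) if name.startswith("PROXYTYPE_"))


# Illustrative stand-in for the pycurl module (assumption: values mirror curl's enum)
fake = types.SimpleNamespace(PROXYTYPE_HTTP=0, PROXYTYPE_SOCKS5=5, URL=10002)
print(proxy_type_names(fake))
```

With pycurl installed, calling `proxy_type_names(pycurl)` after `import pycurl` would show whether a `PROXYTYPE_HTTPS` name is present in that build.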

With plain curl, I connect to the proxy using the following command:

curl --proxy https://proxy-host:proxy-port --proxy-insecure -U username:password https://target.com

This works as expected, but I cannot get the same result with pycurl.

How can I achieve the same behavior in pycurl?

Following a suggestion I received in a pycurl GitHub issue, I found the option code for CURLPROXY_HTTPS: it is 2.

With that value, I was able to make a request through a secure web proxy with pycurl using the following code:

import pycurl
from io import BytesIO
import certifi


def request_with_pycurl(username, password, host, port, target_url='https://api.ipify.org/'):
    buffer = BytesIO()
    c = pycurl.Curl()
    # CA bundle used to verify the target server's certificate
    c.setopt(pycurl.CAINFO, certifi.where())
    # equivalent of --proxy-insecure: do not verify the proxy's certificate
    c.setopt(c.PROXY_SSL_VERIFYHOST, 0)
    c.setopt(c.PROXY_SSL_VERIFYPEER, 0)
    # set the User-Agent header
    c.setopt(pycurl.USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0')
    # set proxy
    c.setopt(pycurl.PROXY, f"https://{host}:{port}")
    # proxy auth
    c.setopt(pycurl.PROXYUSERPWD, f"{username}:{password}")
    # set proxy type = "HTTPS" (2 is the value of CURLPROXY_HTTPS)
    c.setopt(pycurl.PROXYTYPE, 2)
    # target url
    c.setopt(c.URL, target_url)
    # write the response body into the buffer
    c.setopt(c.WRITEDATA, buffer)
    c.perform()
    c.close()
    body = buffer.getvalue()
    return body


response = request_with_pycurl("proxy_username", "proxy_password", "proxy_host", "proxy_port").decode()
print(response)

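If the answer above does not work for you, or if you came here using a Crawlera proxy or a proxy without a password, here is an updated version of Andriy Stolyar's answer: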

import pycurl
from io import BytesIO
import certifi


def request_with_pycurl(username, password, host, port, target_url='http://api.ipify.org/'):
    # password is accepted for compatibility but not used: only a username is sent
    buffer = BytesIO()
    c = pycurl.Curl()
    c.setopt(pycurl.CAINFO, certifi.where())
    # equivalent of --proxy-insecure: do not verify the proxy's certificate
    c.setopt(c.PROXY_SSL_VERIFYHOST, 0)
    c.setopt(c.PROXY_SSL_VERIFYPEER, 0)
    # set the User-Agent header
    c.setopt(pycurl.USERAGENT, 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0')
    # set proxy (plain http:// scheme here)
    c.setopt(pycurl.PROXY, f"http://{host}:{port}")
    # proxy auth: username (API key) only, no password
    c.setopt(pycurl.PROXYUSERNAME, username)
    # proxy type is left at its default, so the HTTPS type is not set
    # c.setopt(pycurl.PROXYTYPE, 2)
    # target url
    c.setopt(c.URL, target_url)
    # write the response body into the buffer
    c.setopt(c.WRITEDATA, buffer)
    c.perform()
    c.close()
    body = buffer.getvalue()
    return body


response = request_with_pycurl("KEY:", "", "HOST", "PORT").decode()
print(response)
