How to submit a form with an invisible reCAPTCHA using Python requests



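I want to send anonymous emails using Python. I use an online anonymous email sending tool called emkei.cz, and I would like to use it programmatically.

How can I fill in the form on the website (emkei.cz) and submit it with python-requests to send an anonymous email?

I don't want to use something like selenium or mechanize, because they are slow (even when I run selenium headless) and unnecessary for a basic HTML form that I should be able to emulate with requests.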

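What I have tried

I filled in the form and, in the Network tab of the Microsoft Edge developer tools, inspected the request made when the form was submitted. I then tried to emulate that request with the requests library in Python.

The mail was sent successfully from the browser. I noted down the headers and the payload (data) and wrote a simple Python script that tries to send the mail with the same form values. However, it did not work.

To debug, I inspected r.text. It returns the same input form I had just filled in, rather than the success message "E-mail sent successfully".

This is the code I am using: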

import requests

def send_email(to, subject, body, debug):
    headers = {
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9",
        "Accept-Encoding": "gzip, deflate, br",
        "Accept-Language": "en-US,en;q=0.9",
        "Cache-Control": "max-age=0",
        "Connection": "keep-alive",
        "Content-Length": "3072",
        "Content-Type": "multipart/form-data; boundary=----WebKitFormBoundaryIGzbcUtI3oNRwVLD",
        "Cookie": "__gads=ID=a33e3b44296022c7-22066d337bd100ce:T=1648910614:RT=1648910614:S=ALNI_MZLGzNvZhCPKcpiV2aS8Nkg4um4SQ",
        "Host": "emkei.cz",
        "Origin": "null",
        "sec-ch-ua": '" Not A;Brand";v="99", "Chromium";v="99", "Microsoft Edge";v="99"',
        "sec-ch-ua-mobile": "?0",
        "sec-ch-ua-platform": '"Windows"',
        "Sec-Fetch-Dest": "document",
        "Sec-Fetch-Mode": "navigate",
        "Sec-Fetch-Site": "same-origin",
        "Sec-Fetch-User": "?1",
        "Upgrade-Insecure-Requests": "1",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/99.0.4844.74 Safari/537.36 Edg/99.0.1150.55"
    }
    payload = {
        "fromname": "LifeAsAnRPG Team",
        "from": "team@lifeasanrpg.com",
        "rcpt": to,
        "subject": subject,
        "attachment": "(binary)",
        "reply": "",
        "errors": "",
        "cc": "",
        "bcc": "",
        "importance": "normal",
        "xmailer": "0",
        "customxm": "",
        "confirmd": "",
        "confirmr": "",
        "addh": "",
        "smtp": "",
        "smtpp": "",
        "current": "on",
        "charset": "utf-8",
        "mycharset": "",
        "encrypt": "no",
        "ctype": "plain",
        "rte": "0",
        "text": body,
        "g-recaptcha-response": "03AGdBq24yOL4Cas-N8rzxpVSHZnJR0Ec7V_8tylGd_6IpLZotF1hqQZo2Ukyt9qw3CWAqDV7onb2TeJ25cTx9fWPf9icUaK8QCE3HGoxFMO9wYvXB5RNDSkQGbpuU_7mRZl_RDs3RVx6Savi0-PENoz1fvfUBmcKhPbPDXnRWyfayDjS1DrTU0hTivr2Xkp4W3KxBpPBg0lp7W_hgujMxqa5fjXz46Do9ZUq3G2DCRciuwBLYXS3v9nSEW1wqhFtdWfRbby50iougT0DGAWzN5vbs6o0X7YzTit6uyNO2zF0-ZECTH6YNpTMgdlC4t4QquS0-BhXPBOdDCICccYafyGoQgioaPcQt--NfaPFSYvLnVhCjFJ2y2Kl7sFFviGn-lgnvK65NpSKlNjYrSHB29LsLcF1zghmwjPZtWJ7q7rljAhz7rH9Iyxs",
        "ok": "Send"
    }
    request = requests.request(
        method="POST",
        url="https://emkei.cz/",
        headers=headers,
        data=payload
    )
    if debug:
        print(request.text)
        print(request.status_code)
    if request.status_code != 200:
        return -1
    return 0

send_email(
    to="test-test@mailinator.com",
    subject="Test subject.",
    body="""
Test line one.
Test line two.
""",
    debug=True
)

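My guess is that this has something to do with the captcha and the g-recaptcha-response field in the payload. However, I was not asked to solve any captcha when filling in the form.

Please try visiting the website (Emkei's anonymous mailer, also linked above) and tell me how to send an email through it programmatically.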

The page uses an invisible reCAPTCHA, so it has to be rendered in order to obtain g-recaptcha-response
https://developers.google.com/recaptcha/docs/versions#recaptcha_v2_invisible_recaptcha_badge

You can use requests-html, which automatically downloads Chromium the first time it renders a page
https://pypi.org/project/requests-html/

  1. In the send_email function, render the page and set the captcha before the POST request:

    from requests_html import HTMLSession

    session = HTMLSession()
    response = session.get("https://emkei.cz/")
    # response.html.render()
    # Re-render (up to 10 times) until the reCAPTCHA token appears in the page.
    for _ in range(10):
        if response.html.search('name="g-recaptcha-response" value="{}"') is None:
            response.html.render()
    # Copy the token from the rendered page into the form payload.
    payload['g-recaptcha-response'] = response.html.search('name="g-recaptcha-response" value="{}"')[0]
    
  2. Comment out the "Content-Type": "multipart/form-data; boundary=... header:

    # "Content-Type": "multipart/form-data; boundary=...
    

    You should not specify your own boundary, because the multipart data is built by requests. https://github.com/psf/requests/issues/1997

  3. Add the following failure check next to request.status_code != 200:

    # if "The invisible reCAPTCHA test wasn't successful. Please, try again." in request.text:
    #     return -1
    if "E-mail sent successfully" not in request.text:
        return -1
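
To illustrate step 2, the following minimal sketch shows requests generating the multipart boundary on its own. It only prepares a request and never sends it, so nothing reaches the server. The two field values are placeholders, and using the `files=` argument with `(None, value)` tuples is one way to get a multipart body; whether emkei.cz accepts a request built this way is an assumption that has not been verified here.

```python
import requests

# Placeholder field values, used only for illustration.
fields = {
    "rcpt": (None, "test-test@mailinator.com"),
    "subject": (None, "Test subject."),
}

# Prepare the request without sending it, so no network access is needed.
prepared = requests.Request("POST", "https://emkei.cz/", files=fields).prepare()

# requests sets the Content-Type header and picks the boundary itself.
content_type = prepared.headers["Content-Type"]
boundary = content_type.split("boundary=", 1)[1]

print(content_type)
# The same generated boundary delimits the parts of the encoded body.
print(("--" + boundary).encode() in prepared.body)
```

A hand-written Content-Type header with a boundary copied from the browser would not match the boundary used in the generated body, which is why the header is best left to requests.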
    

On Windows, you may run into problems when requests-html tries to download Chromium
https://github.com/psf/requests-html/issues/325

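The invisible reCAPTCHA may still block requests

Initially, a single response.html.render() worked fine.

After running it xx times within 1 hour, I needed the for _ in range(10) loop, and occasionally got a TimeoutError during render():

pyppeteer.errors.TimeoutError: Navigation Timeout Exceeded: 8000 ms exceeded.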
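
One possible way to cope with occasional timeouts like this is to retry with an increasing delay. The sketch below is not part of the original solution: `call_with_retries` is a hypothetical helper, and `flaky_render` is a stand-in that simulates two timeouts so the example runs without a browser. In real use you would presumably pass `response.html.render` as `func` and `pyppeteer.errors.TimeoutError` in `retry_on`.

```python
import time


def call_with_retries(func, attempts=3, base_delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call func() until it succeeds or attempts run out.

    The delay doubles after every failed attempt; the last failure is re-raised.
    """
    for attempt in range(attempts):
        try:
            return func()
        except retry_on:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


# Stand-in for response.html.render(): fails twice, then succeeds.
calls = []


def flaky_render():
    calls.append(1)
    if len(calls) < 3:
        raise TimeoutError("simulated navigation timeout")
    return "rendered"


# Record the delays instead of actually sleeping, to keep the demo fast.
delays = []
result = call_with_retries(flaky_render, attempts=5, base_delay=0.5,
                           retry_on=(TimeoutError,), sleep=delays.append)
print(result, len(calls), delays)
```

Backing off between attempts also spreads the requests out in time, although it cannot by itself stop the reCAPTCHA from rejecting a client that submits too often.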

After running it xxx times within 1 hour, https://emkei.cz/ mostly returned the failed-reCAPTCHA message from step 3.
