Download a PDF file from a website that requires login, using requests (Python 3)



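I have a website from which I want to download a PDF using requests. The site requires you to log in before you can access the PDF file.

I am using the script below, but it does not work. What is the problem? I took some of the code from another post, but I don't know how to fix this.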

import requests
import sys
from bs4 import BeautifulSoup

headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Safari/537.36'
}

login_data = {
    'Email': 'My-email',
    'Password': 'My-password',
    'login': 'Login'
}

url = 'https://download-website'  # The website I want to download the file from
filename = 'filename.pdf'
# creating a connection to the pdf
print("Creating the connection ...")

with requests.session() as s:
    url1 = 'https://login-website/'  # The website I want to log in to
    r = s.get(url1, headers=headers, stream=True)
    soup = BeautifulSoup(r.content, 'html5lib')
    login_data['__RequestVerificationToken'] = soup.find('input', attrs={'name': '__RequestVerificationToken'})['value']
    r = s.post(url1, data=login_data, headers=headers, stream=True)
    with requests.get(url, stream=True) as r:

        if r.status_code != 200:
            print("Could not download the file '{}'\nError Code : {}\nReason : {}\n\n".format(
                url, r.status_code, r.reason), file=sys.stderr)
        else:
            # Storing the file as a pdf
            print("Saving the pdf file  :\n\"{}\" ...".format(filename))
            with open(filename, 'wb') as f:
                try:
                    total_size = int(r.headers['Content-length'])
                    saved_size_pers = 0
                    moversBy = 8192*100/total_size
                    for chunk in r.iter_content(chunk_size=8192):
                        if chunk:
                            f.write(chunk)
                            saved_size_pers += moversBy
                            print("\r=>> %.2f%%" % (
                                saved_size_pers if saved_size_pers <= 100 else 100.0), end='')
                    print(end='\n\n')
                except Exception:
                    print("==> Couldn't save : {}\\".format(filename))
                    f.flush()
                    r.close()
            r.close()

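I can only guess, since I don't know the site's URL. Try writing the keys of the user data in lowercase. If that doesn't work, use your browser's developer tools to find out what the site's login form expects.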
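As a rough sketch of that advice, the login page itself can be parsed to list the exact field names (including casing and hidden tokens) that the form submits. The names `FormFieldParser`, `form_fields` and `login_and_download` are hypothetical helpers, not part of the original script. The sketch uses the standard library's `html.parser` in place of BeautifulSoup so it runs without extra packages, and it assumes the login page contains a single form. It also fetches the PDF through the same session object that logged in, because the script in the question calls `requests.get(url, ...)` for the download, which does not send the cookies stored in `s`.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect name -> value for every <input> element in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        name = attr.get("name")
        if name:
            # Inputs without a value attribute (empty text boxes) map to "".
            self.fields[name] = attr.get("value") or ""


def form_fields(html):
    """Return the input names (exact casing) and their default values."""
    parser = FormFieldParser()
    parser.feed(html)
    return parser.fields


def login_and_download(session, login_url, file_url, credentials, filename):
    """Log in, then fetch the file through the same session.

    `session` is assumed to behave like requests.Session, so the cookies
    set by the login response are sent with the download request.
    """
    page = session.get(login_url)
    payload = form_fields(page.text)  # picks up hidden tokens as well
    payload.update(credentials)       # keys must match the form's input names
    session.post(login_url, data=payload)

    # Use session.get here, not requests.get, to keep the login cookies.
    response = session.get(file_url, stream=True)
    response.raise_for_status()
    with open(filename, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)
    return filename
```

With requests installed, this could be called as `login_and_download(requests.Session(), 'https://login-website/', 'https://download-website', {'Email': 'My-email', 'Password': 'My-password'}, 'filename.pdf')`. Printing `form_fields(page.text)` first shows which keys the site expects, which is the same information the browser's developer tools would give.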
