What is wrong with this requests.get call when using Beautiful Soup?



I am trying to scrape a web page. When I run this code it prints running1 but never prints running2. Why does this happen?

Code:

from time import gmtime, strftime
import requests
from bs4 import BeautifulSoup
import smtplib
from email.mime.text import MIMEText
print("running1")
url = "https://www.johnlewis.com/nordictrack-commercial-14-9-elliptical-cross-trainer/p5639979"
response = requests.get(url)
print("running2")
soup = BeautifulSoup(response.text, 'lxml')
print("running3")

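The script most likely stalls inside requests.get because the server does not answer requests sent with the default python-requests User-Agent, so running2 is never reached. To get a proper response from the server, try specifying the User-Agent HTTP header: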

import requests
headers = {
"User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:101.0) Gecko/20100101 Firefox/101.0"
}
url = "https://www.johnlewis.com/nordictrack-commercial-14-9-elliptical-cross-trainer/p5639979"
response = requests.get(url, headers=headers)
print(response.text)

Prints:

<!DOCTYPE html><html lang="en"><head>
...
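
Since the original symptom was a call that never returned, it may also help to pass a timeout so that a silent server raises an exception instead of blocking the script. The following is only a sketch under a few assumptions: the helper name fetch_html, its session parameter, and the 10-second default are made up for illustration, and the User-Agent string is simply the one used above.

```python
import requests

# Browser-like User-Agent, copied from the answer above.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:101.0) Gecko/20100101 Firefox/101.0"
}


def fetch_html(url, timeout=10, session=None):
    """Return the page body, or raise instead of waiting indefinitely."""
    # `session` is a hypothetical hook: pass a requests.Session to reuse
    # connections; by default the module-level requests.get is used.
    client = session if session is not None else requests
    response = client.get(url, headers=HEADERS, timeout=timeout)
    # Turn 4xx/5xx answers into exceptions rather than parsing an error page.
    response.raise_for_status()
    return response.text


if __name__ == "__main__":
    url = "https://www.johnlewis.com/nordictrack-commercial-14-9-elliptical-cross-trainer/p5639979"
    try:
        html = fetch_html(url)
        print(html[:100])
    except requests.exceptions.Timeout:
        print("request timed out")
    except requests.exceptions.RequestException as exc:
        print("request failed:", exc)
```

With this in place, a server that ignores the request produces a "request timed out" message after roughly the timeout interval, which makes the failure visible instead of leaving the script stuck after running1.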
