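I'm using requests-HTML and BeautifulSoup to scrape a website; the code is below. The strange thing is that with print(soup.get_text()) I can sometimes get the text from the page, but with print(soup) I get some random code, shown in the attached image.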
from requests_html import HTMLSession
from bs4 import BeautifulSoup as bs

session = HTMLSession()
r = session.get(url)
soup = bs(r.content, "html.parser")
print(soup.get_text())
#print(soup)
When I try to view soup, the program returns this:
I think the site is protected by JavaScript. Try this; it might help:
import requests
from bs4 import BeautifulSoup
r = requests.get(url)
print(r.text)
# If you want the whole content, you can slice the response text stored in r, or better, just parse it with bs4
soup = BeautifulSoup(r.text, "html.parser")
print(soup.text)
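
One way to check the guess that the page depends on JavaScript is to compare how much of the returned HTML is visible text and how much is script or style content. The following is only a rough sketch: the class TextVsScript, the function looks_js_rendered, and the 200-character threshold are hypothetical names and values chosen for illustration, and it uses Python's built-in html.parser module rather than bs4 so that it runs without extra packages.

```python
from html.parser import HTMLParser


class TextVsScript(HTMLParser):
    """Count characters of visible text versus script/style content."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0    # > 0 while inside <script> or <style>
        self.visible_chars = 0  # text a reader would actually see
        self.script_chars = 0   # JavaScript / CSS source

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        n = len(data.strip())
        if self._skip_depth:
            self.script_chars += n
        else:
            self.visible_chars += n


def looks_js_rendered(html, min_visible_chars=200):
    """Heuristic: very little visible text, and more script than text.

    The threshold is an arbitrary assumption; tune it for the site.
    """
    parser = TextVsScript()
    parser.feed(html)
    parser.close()
    return (parser.visible_chars < min_visible_chars
            and parser.script_chars > parser.visible_chars)
```

With the response from above, this could be called as looks_js_rendered(r.text). If it returns True, the text is probably filled in by JavaScript after the page loads, so a plain requests.get will not see it; in that case requests-HTML's r.html.render() (which downloads Chromium on first use) may be worth trying on the HTMLSession response from the question.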