How do I scrape an <h1> tag with BeautifulSoup? [Python]



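I'm currently writing a price tracker for different websites, but I've run into a problem. I'm trying to scrape the contents of an h1 tag using BeautifulSoup4, but I don't know how. I tried using a dictionary, as suggested in https://stackoverflow.com/a/40716482/14003061, but it returned None. Can anyone help? Many thanks!

Here is the code: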

from termcolor import colored
import requests
from bs4 import BeautifulSoup
import smtplib

def choice_bwfo():
    print(colored("You have selected Buy Whole Foods Online [BWFO]", "blue"))
    url = input(colored("\n[ 2 ] Paste a product link from BWFO.\n", "magenta"))
    url_verify = requests.get(url, headers=headers)
    soup = BeautifulSoup(url_verify.content, 'html5lib')
    item_block = BeautifulSoup.find('h1', {'itemprop' : 'name'})
    print(item_block)

choice_bwfo()

Here is an example URL you can use:

https://www.buywholefoodsonline.co.uk/organic-spanish-bee-pollen-250g.html

Thanks :(

This script will print the contents of the <h1> tag:

import requests
from bs4 import BeautifulSoup

url = 'https://www.buywholefoodsonline.co.uk/organic-spanish-bee-pollen-250g.html'
# create `soup` variable from the URL:
soup = BeautifulSoup(requests.get(url).content, 'html.parser')
# print text of first `<h1>` tag:
print(soup.h1.get_text())

Prints:

Organic Spanish Bee Pollen 250g

Or you can do:

print(soup.find('h1', {'itemprop' : 'name'}).get_text())
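
One likely reason the code in the question fails is that it calls find() on the BeautifulSoup class itself instead of on the parsed `soup` object. The sketch below shows the instance call together with a guard for the case where no tag matches, since find() returns None then. `SAMPLE_HTML` and `get_product_name` are made-up names for illustration, and the inline HTML is only an assumed stand-in for the real product page so that the example runs without a network request:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the product page, so the example runs offline.
SAMPLE_HTML = """
<html><body>
  <h1 itemprop="name"> Organic Spanish Bee Pollen 250g </h1>
</body></html>
"""

def get_product_name(html):
    """Return the text of the <h1 itemprop="name"> tag, or None if it is missing."""
    soup = BeautifulSoup(html, "html.parser")
    # Call find() on the parsed `soup` object, not on the BeautifulSoup class.
    tag = soup.find("h1", {"itemprop": "name"})
    # find() returns None when nothing matches, so guard before get_text().
    return tag.get_text(strip=True) if tag is not None else None

print(get_product_name(SAMPLE_HTML))
print(get_product_name("<p>no heading here</p>"))
```

With a live page you would pass `requests.get(url).content` in place of `SAMPLE_HTML`; whether the real page still carries the `itemprop="name"` attribute is something to check in the page source.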
