如何使用Beautiful Soup通过span类提取准确的信息



这是我的价格监控代码的代码和输出:

soup = BeautifulSoup(page.content, 'html.parser')
title = soup.find(id="result_0_name").get_text()
price = soup.find("span", class_ = "normal_price")
#converted_price = price[0:3]
print(price.get_text())
print(title.strip())

输出如下

Starting at:
$0.70 USD
$0.67 USD
Operation Broken Fang Case

和HTML的页面是一样的

<span class="market_table_value normal_price">Starting at:<br/>
<span class="normal_price" data-currency="1" data-price="69">$0.69 USD</span>
<span class="sale_price">$0.66 USD</span>
</span>

如您所见,没有ID,所以我不能使用它,我只希望显示'normal_price'而不是该范围内的其他数据。什么好主意吗?

让span的选择更具体一些,例如,它是元素中的元素:

soup.select_one('span > span.normal_price').get_text()

from bs4 import BeautifulSoup
html='''
<span class="market_table_value normal_price">
Starting at:<br/>
<span class="normal_price" data-currency="1" data-price="69">$0.69 USD</span>
<span class="sale_price">$0.66 USD</span>
</span>
'''
soup = BeautifulSoup(html,'html.parser')
price = soup.select_one('span > span.normal_price').get_text()
price

$0.69 USD

你也可以这样尝试

from bs4 import BeautifulSoup
html ="""<span class="market_table_value normal_price">
Starting at:<br/>
<span class="normal_price" data-currency="1" data-price="69">$0.69 USD</span>
<span class="sale_price">$0.66 USD</span>
</span>"""
soup = BeautifulSoup(html, 'html.parser')
using attribute
price = soup.select_one('span[data-currency="1"]').get_text()
exact attribute
price = soup.select_one('span[data-currency^="1"]').get_text()
print(price) #$0.69 USD

最新更新