How do I scrape the last string of a <p> tag element?

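First of all, Python is the first language I'm learning. I'm scraping a website that lists rent prices for my city, and I'm using BeautifulSoup to get the price data, but I can't get the value of this <p> tag.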

The tag looks like this:

<p><strong class="hidden show-mobile-inline">Monthly Rent: </strong>2,450 +</p>

Here is my code:

text = soup.find_all("div", {"class", "plan-group rent"})
for item in text:
    rent = item.find_all("p")
    for price in rent:
        print(price.string)

I also tried:

text = soup.find_all("div", {"class", "plan-group rent"})
for item in text:
    rent = item.find_all("p")
    for price in rent:
        items = price.find_all("strong")
        for item in items:
            print(item.string)

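This prints "Monthly Rent:", but I don't understand why I can't get the actual price. The code above tells me that "Monthly Rent" is inside the strong tag, which means the p tag itself only contains the price I want.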

As @kyrony mentioned, your <p> has two children. Because you selected the <strong>, you only get the text of one of them.

You can use a different approach, such as stripped_strings:

list(soup.p.stripped_strings)[-1]

or contents:

soup.p.contents[-1]

or find with the recursive parameter:

soup.p.find(text=True, recursive=False)

from bs4 import BeautifulSoup
html = '''<p><strong class="hidden show-mobile-inline">Monthly Rent: </strong>2,450 +</p>'''
# Name the parser explicitly to avoid the "no parser specified" warning
soup = BeautifulSoup(html, "html.parser")
print(soup.p.contents[-1])
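
To compare the three approaches side by side, here is a minimal sketch run against the same snippet from the question. It assumes a reasonably recent bs4 where `string=` is the current spelling of the older `text=` argument. The variable names are just for illustration.

```python
from bs4 import BeautifulSoup

html = '<p><strong class="hidden show-mobile-inline">Monthly Rent: </strong>2,450 +</p>'
soup = BeautifulSoup(html, "html.parser")

# 1. stripped_strings yields every text fragment under <p>, whitespace-stripped;
#    the last fragment is the price.
via_stripped = list(soup.p.stripped_strings)[-1]

# 2. contents is the list of direct children; here the last child is the bare text.
via_contents = str(soup.p.contents[-1])

# 3. find(string=True, recursive=False) returns the first direct text child,
#    skipping text nested inside <strong>.
via_find = str(soup.p.find(string=True, recursive=False))

print(via_stripped, via_contents, via_find, sep=" | ")
```

Note that the three only agree because the price is both the last child and the only direct text child of the <p>. If the markup had text before the <strong>, find(..., recursive=False) would return that earlier text instead, and contents[-1] is not whitespace-stripped.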

Technically, your <p> has two children:

<p><strong class="hidden show-mobile-inline">Monthly Rent: </strong>2,450 +</p>

the strong tag

<strong class="hidden show-mobile-inline">Monthly Rent: </strong>

and a string

2,450 +

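The .string attribute in Beautiful Soup only returns a value when a tag has exactly one child. This <p> has two, so .string returns None. To get the second string, you need to use the stripped_strings generator.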
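
As a rough sketch of how this could fit into the loop from the question: the wrapping div below is hypothetical markup (only the <p> line comes from the question), so the real page may need a different selector.

```python
from bs4 import BeautifulSoup

# Hypothetical listing markup; only the <p> line is taken from the question.
html = '''
<div class="plan-group rent">
  <p><strong class="hidden show-mobile-inline">Monthly Rent: </strong>2,450 +</p>
</div>
'''
soup = BeautifulSoup(html, "html.parser")

prices = []
for group in soup.find_all("div", class_="plan-group rent"):
    for p in group.find_all("p"):
        # p.string is None here because <p> has two children
        # (the <strong> tag and the bare price text).
        fragments = list(p.stripped_strings)
        if fragments:
            prices.append(fragments[-1])  # last fragment is the price text

print(soup.p.string)  # None
print(prices)         # ['2,450 +']
```

Taking the last fragment assumes the price always comes after the label, as it does in the tag shown in the question.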

