How do I print a specific value from HTML content?



Below is the HTML content. I want to extract only the values it contains (the name, the contact number, and the location).

<div class="list-group-item">
<div class="row">
<div class="col" style="min-width: 0;">
<h2 class="h5 mt-0 text-truncate">
<a class="text-warning" href="www.example.com">
Ram
</a>
</h2>
<p class="mob-9 text-truncate">
<small>
<i class="fa fa-fw fa-mobile-alt">
</i>
Contact:
</small>
010101010
</p>
<p class="mb-2 text-truncate">
<small>
<i class="fa fa-fw fa-map-marker-alt">
</i>
Location:
</small>
5th lane, kamathipura, Kamathipura
</p>
</a>
</p>
</div>
</div>
</div>

My code is:

import pandas as pd
import requests
from bs4 import BeautifulSoup as soup
url = requests.get("www.example.com")
page_soup = soup(url.content, 'html.parser')
name = shop.findAll("div", {"class": "list-group-item"})
print(name.h2.text)
number = shop.findAll("p", {"class": "fa fa-fw fa-map-marker-alt"})
print(?)
location = shop.findAll("p", {"class": "fa fa-fw fa-map-marker-alt"})
print(?)

I need to do this in Python.

Expected output: 'Ram', '010101010', '5th lane, kamathipura, Kamathipura'

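Using the tag name and class identifier, you can grab everything inside the region you want. Then, by indexing into each tag's contents, you should be able to pick out exactly the piece you need, like this: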

from bs4 import BeautifulSoup

url = 'myhtml.html'
with open(url) as fp:
    soup = BeautifulSoup(fp, 'html.parser')

# The name is the first child of the <a> tag; strip() removes the surrounding newlines
contnt1 = [soup.find('a').contents[0].strip()]
# In each <p class="text-truncate">, index 2 is the text node that follows <small>
contnt2 = [x.contents[2].strip() for x in soup.find_all("p", "text-truncate")]
print(*(contnt1 + contnt2))
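
If you would rather not depend on a fixed index into `contents`, here is a rough sketch of the same idea that takes the text node right after each `<small>` label. It assumes the markup looks like the sample in the question (trimmed and inlined here so it runs on its own); `HTML` and `extract_fields` are just names made up for this illustration.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the HTML from the question, inlined so the sketch is self-contained.
HTML = """
<div class="list-group-item">
<h2 class="h5 mt-0 text-truncate">
<a class="text-warning" href="www.example.com">
Ram
</a>
</h2>
<p class="mob-9 text-truncate">
<small>
<i class="fa fa-fw fa-mobile-alt">
</i>
Contact:
</small>
010101010
</p>
<p class="mb-2 text-truncate">
<small>
<i class="fa fa-fw fa-map-marker-alt">
</i>
Location:
</small>
5th lane, kamathipura, Kamathipura
</p>
</div>
"""


def extract_fields(html):
    """Return [name, contact, location] for the first list-group-item (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    item = soup.find("div", class_="list-group-item")
    # The name is the visible text of the link inside the heading.
    values = [item.find("a").get_text(strip=True)]
    for p in item.find_all("p", class_="text-truncate"):
        # The value is the text node that directly follows the <small> label.
        values.append(p.find("small").next_sibling.strip())
    return values


print(extract_fields(HTML))
```

This only works while each value sits directly after its `<small>` label; if the page layout differs, the `next_sibling` lookup would need adjusting.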

Have you tried location.get_text()?
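
One caveat, as far as I can tell: `findAll` returns a list-like result set, so `get_text()` has to be called on a single tag (for example one returned by `find`). Also, `get_text()` on the whole `<p>` includes the "Location:" label. Below is a small sketch of both points, using a one-line `snippet` I condensed from the question's HTML for illustration.

```python
from bs4 import BeautifulSoup

# Condensed version of the location paragraph from the question (illustrative only).
snippet = (
    '<p class="mb-2 text-truncate"><small>'
    '<i class="fa fa-fw fa-map-marker-alt"></i> Location: </small>'
    ' 5th lane, kamathipura, Kamathipura </p>'
)
soup = BeautifulSoup(snippet, "html.parser")

# find() returns a single Tag, which does have get_text().
location = soup.find("p", class_="mb-2")

# All text in the paragraph, label included, pieces joined by a space.
full_text = location.get_text(" ", strip=True)

# Remove the <small> label from the tree, then read what is left.
location.find("small").decompose()
value_only = location.get_text(strip=True)

print(full_text)
print(value_only)
```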

You can read more here.