How do I get the contents of tags and print them on one line using Beautiful Soup in Python?



How can I extract the name, email, and phone number and print them all on one line? This is the content of mydivs:

<div class="card-name"><a href="contact.php?leaduuid=9dfe">Mike <b>Denis</b></a></div>
<div class="activity-value">mdniz@gmail.com</div>
<div class="activity-value">(233) 333-9814</div>
<div class="card-name"><a href="contact.php?leaduuid=78f3">Sami <b>Baney</b></a></div>
<div class="activity-value">sadt@gmail.com</div>
<div class="activity-value">(123) 763-2322</div>

I want the output to look like this:

Mike Denis, mdniz@gmail.com, (233) 333-9814
Sami Baney, sadt@gmail.com, (123) 763-2322

The closest I have been able to get is the code below:

mydivs = soup.find_all('div', ['card-name', 'activity-value'])
for div in mydivs:
    print(div)

Thanks

You can try this:

from bs4 import BeautifulSoup
import re
html_doc = '''
<div class="card-name"><a href="contact.php?leaduuid=9dfe">Mike <b>Denis</b></a></div>
<div class="activity-value">mdniz@gmail.com</div>
<div class="activity-value">(233) 333-9814</div>
<div class="card-name"><a href="contact.php?leaduuid=78f3">Sami <b>Baney</b></a></div>
<div class="activity-value">sadt@gmail.com</div>
<div class="activity-value">(123) 763-2322</div>
'''
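soup = BeautifulSoup(html_doc, 'html.parser')
# Match divs whose class is either 'card-name' or 'activity-value'
mydivs = soup.find_all('div', ['card-name', 'activity-value'])
st = ''
for div in mydivs:
    # A phone number such as "(233) 333-9814" ends the current record
    if re.search(r'^\(\d{3}\)', div.text):
        st += f'{div.text}\n'
    else:
        st += f'{div.text}, '

print(st)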

Output:

Mike Denis, mdniz@gmail.com, (233) 333-9814
Sami Baney, sadt@gmail.com, (123) 763-2322

If your divs always follow the structure shown in your question (one <div class="card-name"> followed by two <div class="activity-value">), then you can do this:

from bs4 import BeautifulSoup
txt = '''<div class="card-name"><a href="contact.php?leaduuid=9dfe">Mike <b>Denis</b></a></div>
<div class="activity-value">mdniz@gmail.com</div>
<div class="activity-value">(233) 333-9814</div>
<div class="card-name"><a href="contact.php?leaduuid=78f3">Sami <b>Baney</b></a></div>
<div class="activity-value">sadt@gmail.com</div>
<div class="activity-value">(123) 763-2322</div>'''
soup = BeautifulSoup(txt, 'html.parser')
divs = soup.select('.card-name, .activity-value')
# Every three consecutive divs form one record: name, e-mail, phone
for name, email, phone in zip(divs[::3], divs[1::3], divs[2::3]):
    print('Name: {}\tE-Mail: {}\t Phone: {}'.format(name.text, email.text, phone.text))

Prints:

Name: Mike Denis    E-Mail: mdniz@gmail.com  Phone: (233) 333-9814
Name: Sami Baney    E-Mail: sadt@gmail.com   Phone: (123) 763-2322
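
Both approaches depend on the shape of the data: one relies on the phone number starting with a parenthesized area code, the other on exactly two activity-value divs per name. If that cannot be guaranteed, a possible alternative is to start a new line whenever a card-name div appears and append every following activity-value div to it. The sketch below assumes the same markup as in the question; the helper name contact_lines is made up for illustration and is not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup

html_doc = '''
<div class="card-name"><a href="contact.php?leaduuid=9dfe">Mike <b>Denis</b></a></div>
<div class="activity-value">mdniz@gmail.com</div>
<div class="activity-value">(233) 333-9814</div>
<div class="card-name"><a href="contact.php?leaduuid=78f3">Sami <b>Baney</b></a></div>
<div class="activity-value">sadt@gmail.com</div>
<div class="activity-value">(123) 763-2322</div>
'''


def contact_lines(html):
    """Hypothetical helper: return one 'name, value, value...' string per card-name div."""
    soup = BeautifulSoup(html, 'html.parser')
    records = []
    # A list passed as class_ matches divs carrying any of the listed classes
    for div in soup.find_all('div', class_=['card-name', 'activity-value']):
        # Join nested text pieces with a space so "Mike <b>Denis</b>" becomes "Mike Denis"
        text = div.get_text(' ', strip=True)
        if 'card-name' in div.get('class', []):
            records.append([text])      # a name starts a new record
        elif records:
            records[-1].append(text)    # a value belongs to the most recent name
    return [', '.join(fields) for fields in records]


for line in contact_lines(html_doc):
    print(line)
```

With the sample markup this prints one comma-separated line per contact, in the format asked for in the question. Any activity-value div that appears before the first card-name is skipped.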
