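I want to scrape every a tag that has an href attribute from all the li tags under the first ul tag. The code below scrapes the a tags under the li tags of every ul tag (I only want the ones under the first ul tag). You can see the site here: https://www.mindtools.com/pages/main/newMN_CDV.htm My code is: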
for ultag in soup.find_all('ul', {'class': 'collection test_further_resource'}):
    for litag in ultag.find_all('li'):
        print(litag.find("a")["href"])
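Please take a look at the site: go to "Browse Tools by Category" and then "Thinking About Career Direction". I want to scrape the hrefs in that category, up to 13. Thanks in advance.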
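.find() returns only the first matching tag it finds, while .find_all() returns a list of all matching tags. To stay inside the first ul, call soup.find() once instead of looping over soup.find_all():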
import requests
from bs4 import BeautifulSoup
url = 'https://www.mindtools.com/pages/main/newMN_CDV.htm'
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')
# find() returns only the first matching <ul>, so the loop below stays inside it
ultag = soup.find('ul', {'class': 'collection test_further_resource'})
for litag in ultag.find_all('li'):
    print(litag.find("a")["href"])
Output:
/pages/article/managing-career.htm
/pages/article/newCDV_97.htm
/pages/article/career-strategy.htm
/pages/article/personal-ansoff-matrix.htm
/pages/article/career-opportunities.htm
/pages/article/managing-yourself.htm
/pages/article/newCDV_99.htm
/pages/article/newCDV_89.htm
/pages/article/rebooting-your-career.htm
/pages/article/newCDV_98.htm
/pages/article/locus-of-control.htm
/pages/article/newCDV_90.htm
/pages/article/seat-on-the-board.htm
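
If you want to see the difference between .find() and .find_all() without hitting the network, here is a minimal sketch on a made-up HTML snippet. The markup, the hrefs in it, and the helper name hrefs_in are assumptions for illustration only, not the real page. It also skips any li that has no a tag with an href, which the loop above would crash on.

```python
from bs4 import BeautifulSoup

# Hypothetical miniature page: two <ul> lists that share the same class
html = """
<ul class="collection test_further_resource">
  <li><a href="/pages/article/a.htm">A</a></li>
  <li><a href="/pages/article/b.htm">B</a></li>
  <li>No link here</li>
</ul>
<ul class="collection test_further_resource">
  <li><a href="/pages/article/c.htm">C</a></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")
selector = {"class": "collection test_further_resource"}


def hrefs_in(ul):
    """Collect the href of the first <a href=...> inside each <li> of one <ul>."""
    links = []
    for li in ul.find_all("li"):
        a = li.find("a", href=True)  # None when the <li> has no link
        if a is not None:
            links.append(a["href"])
    return links


# find(): only the first matching <ul>
first_only = hrefs_in(soup.find("ul", selector))

# find_all(): every matching <ul>, which is what the question's loop did
all_lists = [h for ul in soup.find_all("ul", selector) for h in hrefs_in(ul)]

print(first_only)
print(all_lists)
```

Note that soup.find() returns None when nothing matches, so on a live page it is worth checking the result before calling .find_all() on it.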