Trouble retrieving artist names from the Billboard Hot 100 site using BeautifulSoup



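I am trying to use the Python package BeautifulSoup to retrieve the most popular songs from this URL. When I look up the span that holds the artist name, it finds the correct span, but when I call ".text" on that span, I do not get the text between the span tags.

Here is my code: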

import requests
from bs4 import BeautifulSoup
r = requests.get('https://www.billboard.com/charts/hot-100/')
soup = BeautifulSoup(r.content, 'html.parser')
result = soup.find_all('div', class_='o-chart-results-list-row-container')
for res in result:
    songName = res.find('h3').text.strip()
    artist = res.find('span',class_='c-label a-no-trucate a-font-primary-s lrv-u-font-size-14@mobile-max u-line-height-normal@mobile-max u-letter-spacing-0021 lrv-u-display-block a-truncate-ellipsis-2line u-max-width-330 u-max-width-230@tablet-only').text
    print("song: "+songName)
    print("artist: "+ str(artist))
    print("___________________________________________________")

Currently it prints the following for each song:

song: Waiting On A Miracle
artist: <span class="c-label a-no-trucate a-font-primary-s lrv-u-font-size-14@mobile-max u-line-height-normal@mobile-max u-letter-spacing-0021 lrv-u-display-block a-truncate-ellipsis-2line u-max-width-330 u-max-width-230@tablet-only">
Stephanie Beatriz
</span>
___________________________________________________

How can I extract only the artist's name?

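If even a single character in that class string is off, the match fails. I would simplify it: grab the song title, and the artist follows in the next <span> tag. So get the <h3> tag as you already do for the song, then use .find_next() to get the artist: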

import requests
from bs4 import BeautifulSoup
r = requests.get('https://www.billboard.com/charts/hot-100/')
soup = BeautifulSoup(r.content, 'html.parser')
result = soup.find_all('div', class_='o-chart-results-list-row-container')
for res in result:
    songName = res.find('h3').text.strip()
    # The artist sits in the first <span> that follows the song's <h3>
    artist = res.find('h3').find_next('span').text.strip()
    print("song: "+songName)
    print("artist: "+ str(artist))
    print("___________________________________________________")

Output:

song: Heat Waves
artist: Glass Animals
___________________________________________________
song: Stay
artist: The Kid LAROI & Justin Bieber
___________________________________________________
song: Super Gremlin
artist: Kodak Black
___________________________________________________
song: abcdefu
artist: GAYLE
___________________________________________________
song: Ghost
artist: Justin Bieber
___________________________________________________
song: We Don't Talk About Bruno
artist: Carolina Gaitan, Mauro Castillo, Adassa, Rhenzy Feliz, Diane Guerrero, Stephanie Beatriz & Encanto Cast
___________________________________________________
song: Enemy
artist: Imagine Dragons X JID
___________________________________________________
....
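
To try the same idea without hitting the live site, here is a minimal offline sketch. The SAMPLE_HTML markup, the shortened class list, and the extract_rows helper are assumptions made up for illustration; the real page's markup may differ. It shows the <h3> plus .find_next('span') lookup, and also why a long class string is fragile: a multi-class string passed to class_ has to match the attribute value exactly, while a single class name matches regardless of the other classes on the tag.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating two chart rows; not copied from the real page.
SAMPLE_HTML = """
<div class="o-chart-results-list-row-container">
  <h3 class="c-title">
    Heat Waves
  </h3>
  <span class="c-label a-no-trucate a-font-primary-s">
    Glass Animals
  </span>
</div>
<div class="o-chart-results-list-row-container">
  <h3 class="c-title">
    Super Gremlin
  </h3>
  <span class="c-label a-no-trucate a-font-primary-s">
    Kodak Black
  </span>
</div>
"""


def extract_rows(html):
    """Return (song, artist) pairs, skipping rows where a tag is missing."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for row in soup.find_all("div", class_="o-chart-results-list-row-container"):
        title = row.find("h3")
        if title is None:
            continue
        # find_next() walks forward through the document from the <h3>
        artist = title.find_next("span")
        if artist is None:
            continue
        pairs.append((title.get_text().strip(), artist.get_text().strip()))
    return pairs


rows = extract_rows(SAMPLE_HTML)
for song, artist in rows:
    print("song: " + song)
    print("artist: " + artist)

# Class matching: one class name is enough, but a multi-class string
# must equal the attribute value exactly (same classes, same order).
first_row = BeautifulSoup(SAMPLE_HTML, "html.parser").find("div")
by_single_class = first_row.find("span", class_="c-label")
by_reordered = first_row.find("span", class_="a-no-trucate c-label a-font-primary-s")
print(by_single_class is not None, by_reordered is None)
```

One thing to keep in mind: find_next() is not limited to the current row, so if a row had no <span> after its <h3>, it would pick up a span from a later row. Whether a single class such as c-label identifies only the artist span on the real page is something you would need to check against the actual HTML.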
