Getting the title attribute of bs4 elements with find_all in Python



I am using the find_all method to get some elements, like this:

elements = soup.find_all('a', {'class': "watchlink"})

which returns elements such as the following:

<a class="watchlink" href="https://www1.swatchseries.to/freecale.html?r=iexyJrjdCI6InRuT0tnVpFR1VjBVZUmNFZHJiN09tLWENDT25oOGNrS3c0SkkzTDRLSXBUK1VCUXlOd0NJNW1uWWJkWVUrRkluejciLCJpdiI6IjEyNjViZTg2NTU3NWVkN2ZlNDZiNWVjZTA5NjkxNDE2IiwicyI6IjBhZGYxOGNmY2ExMzA5YjEifQ==" onclick="$(this).css('color','#AE3939'); $(this).css('text-decoration','line-through');" rel="nofollow" style="cursor:pointer;" target="_blank" title="mixdrop.co">Watch<span> This</span> Link!</a>

I can't get the title this way:

for x in elements:
    print(x['title'])

I get this error:

Traceback (most recent call last):
  File "C:/Users/CobraCommander/PycharmProjects/999/get_vidtodo_links.py", line 25, in <module>
    print(x['title'])
  File "C:\Python37\lib\site-packages\bs4\element.py", line 1016, in __getitem__
    return self.attrs[key]
KeyError: 'title'

However, it does work for other attributes, such as "href".

How can I get the title "mixdrop.co" from my elements?

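Just make sure you read title only from the anchor tags that actually have it. I believe the page contains other tags with the same class name but no title attribute, and indexing one of those raises the KeyError.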

from bs4 import BeautifulSoup
html = '''<a class="watchlink" href="https://www1.swatchseries.to/freecale.html?r=iexyJrjdCI6InRuT0tnVpFR1VjBVZUmNFZHJiN09tLWENDT25oOGNrS3c0SkkzTDRLSXBUK1VCUXlOd0NJNW1uWWJkWVUrRkluejciLCJpdiI6IjEyNjViZTg2NTU3NWVkN2ZlNDZiNWVjZTA5NjkxNDE2IiwicyI6IjBhZGYxOGNmY2ExMzA5YjEifQ==" onclick="$(this).css('color','#AE3939'); $(this).css('text-decoration','line-through');" rel="nofollow" style="cursor:pointer;" target="_blank" title="mixdrop.co">Watch<span> This</span> Link!</a>'''

soup = BeautifulSoup(html, 'html.parser')
# 'title': True matches only tags that have a title attribute
goal = [item['title']
        for item in soup.find_all("a", {'class': 'watchlink', 'title': True})]
print(goal)

Output:

['mixdrop.co']
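
If you would rather keep the original loop, two other approaches should also avoid the KeyError. The sketch below is illustrative only: the two-link HTML sample is made up to reproduce the situation (same class, one tag without a title), and it assumes a bs4 version recent enough for select() to support attribute selectors. Tag.get returns None for a missing attribute instead of raising, and the CSS selector a.watchlink[title] matches only anchors that carry a title.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: both links share the class, only the first has a title.
html = '''
<a class="watchlink" href="/a" title="mixdrop.co">Watch</a>
<a class="watchlink" href="/b">Watch</a>
'''

soup = BeautifulSoup(html, 'html.parser')

# Tag.get returns None (rather than raising KeyError) when the attribute is absent.
titles_or_none = [a.get('title') for a in soup.find_all('a', class_='watchlink')]
print(titles_or_none)

# A CSS attribute selector keeps only the anchors that have a title attribute.
titles = [a['title'] for a in soup.select('a.watchlink[title]')]
print(titles)
```

The first list shows which of your matched tags lack a title (they appear as None), which is a quick way to confirm that this is what triggers the error on the real page.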
