Finding a tag in HTML when web scraping

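I am scraping several HTML pages in a for loop, and on each page I need to find a specific tag (I am using BeautifulSoup and the find_all method). The tag is not present on every page, so I need a simple way to check whether it exists. I wrote the code below to check for a missing tag, but it does not work.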

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Input In [92], in <cell line: 5>()
36 sal_play = salary.find_all('tr')[1:]
37 print(sal_play)
---> 38 if sal_play.find_all('tr', class_='thead') is None :
39     print('1')
40 else:
AttributeError: 'list' object has no attribute 'find'

As the error message says, you cannot call find directly on a list; you have to call it on each item in the list.

If you only want to print "1" when none of the items contains a header row, use:

if not [s for s in sal_play if s.find('tr', class_='thead')]:
    print('1')

Or, if you want to print "1" as long as at least one of the items has no header row, use:

if [s for s in sal_play if s.find('tr', class_='thead') is None]:
    print('1')
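
For a version that can be run on its own, here is a minimal sketch. The sample markup is hypothetical, since the real page is not shown in the question, and is_header_row is an illustrative helper, not part of BeautifulSoup. Note that it checks each row's own class attribute instead of searching inside the row with find, which assumes the header rows are themselves tr elements with class "thead".

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real page structure is an assumption.
html = """
<table>
  <tr class="thead"><th>Player</th><th>Salary</th></tr>
  <tr><td>A</td><td>100</td></tr>
  <tr class="thead"><th>Player</th><th>Salary</th></tr>
  <tr><td>B</td><td>200</td></tr>
</table>
"""

salary = BeautifulSoup(html, "html.parser")
sal_play = salary.find_all("tr")[1:]  # skip the first row, as in the question


def is_header_row(row):
    # Illustrative helper: a row counts as a header row if "thead"
    # is among its own classes (class is a multi-valued attribute,
    # so BeautifulSoup stores it as a list).
    return "thead" in row.get("class", [])


# True if at least one remaining row is a header row.
has_header = any(is_header_row(row) for row in sal_play)
# True only if every remaining row is a header row.
all_headers = all(is_header_row(row) for row in sal_play)

if not has_header:
    print("1")
```

Using any and all avoids building a throwaway list just to test whether it is empty.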

By the way, if the tag does not exist, find_all returns an empty list ([]) and find returns None, so a check such as "if ...find_all(....) is None: do x" all but guarantees that x never happens.
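
The difference between the two return values can be seen in a short sketch. The one-row table is made-up sample input, and the variable names are only for illustration.

```python
from bs4 import BeautifulSoup

# Made-up sample input with no row of class "thead".
soup = BeautifulSoup("<table><tr><td>1</td></tr></table>", "html.parser")

missing_all = soup.find_all("tr", class_="thead")  # empty result, not None
missing_one = soup.find("tr", class_="thead")      # None

# An empty result is falsy, so test truthiness instead of comparing to None.
if not missing_all:
    print("no header rows (find_all)")

# With find, comparing to None is the appropriate check.
if missing_one is None:
    print("no header row (find)")
```

So with find_all the existence test is "if not result", while with find it is "if result is None".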
