How can I iterate over multiple tags in soup.findAll('tag1', 'tag2', 'tag3')?



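I want to write a Python script that automatically modifies certain tags in multiple HTML files, run with a single command from the terminal.

I have already set up the code base.

In my code base I did the following. Is there a more convenient way to do this with less code?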
# modifying the 'src' of <img> tag in the soup obj
for img in soup.findAll('img'):
    img['src'] = '{% static ' + "'" + img['src'] + "'" + ' %}'
# modifying the 'href' of <link> tag in the soup obj
for link in soup.findAll('link'):
    link['href'] = '{% static ' + "'" + link['href'] + "'" + ' %}'
# modifying the 'src' of <script> tag in the soup obj
for script in soup.findAll('script'):
    script['src'] = '{% static ' + "'" + script['src'] + "'" + ' %}'

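For example, could I do this in one for loop instead of three? It does not have to look like what I wrote below; I am looking for any good-practice suggestions.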

for img, link, script in soup.findAll('img', 'link', 'script'):
    # rest of the code goes here....

Perhaps use a dictionary to look up the appropriate attribute for each tag? Also consider CSS selectors, which can be faster.

import requests
from bs4 import BeautifulSoup as bs
r = requests.get('https://stackoverflow.com/questions/66541098/how-can-i-iterate-over-multiple-tags-in-soup-findalltag1-tag2-tag3')
soup = bs(r.content, 'lxml')
lookup = {
    'img': 'src',
    'link': 'href',
    'script': 'src'
}
for i in soup.select('img, link, script'):
    var = lookup[i.name]
    if i.has_attr(var):
        i[var] = '{% static ' + "'" + i[var] + "'" + ' %}'
        print(i[var])
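
The has_attr check above can also be folded into the selector itself. The following is a minimal sketch, assuming a Beautiful Soup version whose select() supports attribute selectors such as img[src]; the ATTR_FOR_TAG name and the sample HTML are made up for illustration, and html.parser is used only so that no extra parser is required.

```python
from bs4 import BeautifulSoup

# Hypothetical mapping from tag name to the attribute that holds its URL.
ATTR_FOR_TAG = {"img": "src", "link": "href", "script": "src"}

# Illustrative input, not taken from the original question.
html = (
    '<link href="a.css" rel="stylesheet">'
    '<img src="a.png">'
    '<script src="a.js"></script>'
    '<script>var inline = 1;</script>'
)
soup = BeautifulSoup(html, "html.parser")

# [attr] matches only elements that actually carry the attribute,
# so the inline <script> without src is never selected.
for tag in soup.select("img[src], link[href], script[src]"):
    attr = ATTR_FOR_TAG[tag.name]
    tag[attr] = "{% static '" + tag[attr] + "' %}"

print(soup)
```

Because the selector filters out elements without the attribute, the loop body needs no extra guard.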

Yes. You can pass a list of tag names to the findAll method:

for element in soup.findAll(['img', 'link', 'script']):  # use find_all for bs4

    # .get() returns None when the attribute is missing (e.g. an inline <script>)
    if element.name == 'img':
        value = element.get('src')
    elif element.name == 'link':
        value = element.get('href')
    elif element.name == 'script':
        value = element.get('src')
    else:
        continue

    print(value)
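
Combining the list form of find_all with the attribute rewrite from the question, a single loop could be sketched as follows. The helper name wrap_static, the ATTR_FOR_TAG mapping, and the sample HTML are assumptions made for illustration; html.parser is used only to avoid an extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical mapping: which attribute holds the URL for each tag.
ATTR_FOR_TAG = {"img": "src", "link": "href", "script": "src"}


def wrap_static(html):
    """Wrap the URL attribute of <img>, <link> and <script> in {% static '...' %}."""
    soup = BeautifulSoup(html, "html.parser")
    # Passing a list of names matches any of those tags in one pass.
    for element in soup.find_all(list(ATTR_FOR_TAG)):
        attr = ATTR_FOR_TAG[element.name]
        # Inline <script> blocks have no src, so skip missing attributes.
        if element.has_attr(attr):
            element[attr] = "{% static '" + element[attr] + "' %}"
    return soup


# Illustrative input, not taken from the original question.
sample = (
    '<link href="css/site.css" rel="stylesheet">'
    '<img src="img/logo.png">'
    '<script src="js/app.js"></script>'
    '<script>console.log("inline")</script>'
)
result = wrap_static(sample)
print(result)
```

Note that this sketch is not idempotent: running it twice on the same markup would wrap the values twice, so it should be applied only to files that have not been converted yet.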
