I want to get the index of the <p> tag for the first object in the percussion list.
How would one do that?
from bs4 import BeautifulSoup
import re
data = '''
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Instruments</title>
</head>
<body>
<p> Guitars are string instruments </p>
<p> Saxophones are woodwind instruments </p>
<p> Drums are percussion instruments </p>
<p> Pianos are percussion instruments</p>
</body>
'''
soup = BeautifulSoup(data, "html.parser")
pattern = '(?=.*percussion).*'
percussion = soup.findAll(string=re.compile(pattern))
print(percussion[0].parent.name)
Use the list's .index
method. For example:
from bs4 import BeautifulSoup
data = """
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Instruments</title>
</head>
<body>
<p> Guitars are string instruments </p>
<p> Saxophones are woodwind instruments </p>
<p> Drums are percussion instruments </p>
<p> Pianos are percussion instruments</p>
</body>
"""
soup = BeautifulSoup(data, "html.parser")
# `string=` is the current name for the older `text=` argument; guard against None
percussion_p = soup.find("p", string=lambda t: t and "percussion" in t)
all_p = soup.find_all("p")
print('Index of <p> with text "percussion" is:', all_p.index(percussion_p))
Prints (0-indexed):
Index of <p> with text "percussion" is: 2
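
One thing to be aware of: list.index compares with ==, and Beautiful Soup treats two tags as equal when they represent the same markup, so on a page with duplicate <p> elements it returns the position of the first equal one. If that matters, or if you want the positions of every match, a sketch along these lines with enumerate may be simpler. The names html, matches and first_index are only illustrative, and the HTML is a trimmed copy of the sample from the question.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the sample document from the question
html = """
<body>
<p> Guitars are string instruments </p>
<p> Saxophones are woodwind instruments </p>
<p> Drums are percussion instruments </p>
<p> Pianos are percussion instruments</p>
</body>
"""

soup = BeautifulSoup(html, "html.parser")
all_p = soup.find_all("p")

# Collect the 0-based position of every <p> whose text mentions "percussion"
matches = [i for i, p in enumerate(all_p) if "percussion" in p.get_text()]

# First match, or None when nothing matched
first_index = matches[0] if matches else None

print(matches)      # [2, 3]
print(first_index)  # 2
```

Because this works on positions directly, it never has to look a tag up again, and the empty-list check avoids the ValueError that .index raises when nothing matches.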