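I am trying to write a loop that displays all the values inside the li tags so that I can build a DataFrame. So far the only way I have found to isolate the relevant markup is: new = soup.find("div", class_="PlayerList"). If I use a standard for loop, it shows only one value instead of all of them.
The output I want to display is:
Messi
Shooting 9
Passing 9
Tackle 4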
<pre>
import requests
import pandas as pd
from bs4 import BeautifulSoup

main_url = 'https://examplelistpython.000webhostapp.com/messi.html'
result = requests.get(main_url)

# Parse the downloaded HTML
soup = BeautifulSoup(result.text, 'html.parser')
print(soup.prettify())

# find() returns only the first matching element
new = soup.find("div", class_="PlayerList")
print(new)
</pre>
<ul class="List">
<li>
<div class="PlayerList">
<div class="HeaderList">
<span class="player">Messi</span>
</div>
<div class="PlayerStat">
<span class="stat">Shooting <span class="allStatContainer statShooting" data-stat="Shooting">
9
</span>
</span>
</div>
<div class="PlayerStat">
<span class="stat">Passing <span class="allStatContainer statPassing" data-stat="Passing">
9
</span>
</span>
</div>
<div class="PlayerStat">
<span class="stat">Tackle <span class="allStatContainer statTackle" data-stat="Tackle">
4
</span>
</span>
</div>
</div>
</li>
</ul>
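A likely reason the loop shows only one value is that find() returns just the first matching tag, whereas find_all() returns every match. Below is a minimal sketch of printing the four lines asked for. It assumes the page looks like the sample markup above; the html string is an inline copy of that sample (with the PlayerList div closed) so nothing has to be downloaded, and the variable names block, stat and lines are only illustrative.

```python
from bs4 import BeautifulSoup

# Inline copy of the sample markup from the question (assumption: the real
# page has the same structure), so the sketch runs without a network request.
html = """
<ul class="List">
<li>
<div class="PlayerList">
<div class="HeaderList">
<span class="player">Messi</span>
</div>
<div class="PlayerStat">
<span class="stat">Shooting <span class="allStatContainer statShooting" data-stat="Shooting">
9
</span>
</span>
</div>
<div class="PlayerStat">
<span class="stat">Passing <span class="allStatContainer statPassing" data-stat="Passing">
9
</span>
</span>
</div>
<div class="PlayerStat">
<span class="stat">Tackle <span class="allStatContainer statTackle" data-stat="Tackle">
4
</span>
</span>
</div>
</div>
</li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

lines = []
# find_all() yields every PlayerList block, not just the first one
for block in soup.find_all("div", class_="PlayerList"):
    lines.append(block.find("span", class_="player").get_text(strip=True))
    # class_="stat" matches only the outer spans; the inner spans carry
    # other class names (allStatContainer, statShooting, ...)
    for stat in block.find_all("span", class_="stat"):
        # split() + join collapses the newlines and extra spaces in the text
        lines.append(" ".join(stat.get_text().split()))

print("\n".join(lines))
```

With the real page, replacing html with result.text from the question should behave the same way, provided the structure matches the sample.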
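<pre>
# find_all() returns every matching tag; strip() removes the surrounding whitespace
player = [i.text.strip() for i in soup.find_all("span", class_="player")]
# A multi-class string matches the class attribute exactly as it is written in the HTML
shooting = [i.text.strip() for i in soup.find_all("span", class_="allStatContainer statShooting")]
passing = [i.text.strip() for i in soup.find_all("span", class_="allStatContainer statPassing")]
tackle = [i.text.strip() for i in soup.find_all("span", class_="allStatContainer statTackle")]
# One column per list; all lists must have the same length
df = pd.DataFrame({'Player': player, 'Shooting': shooting, 'Passing': passing, 'Tackle': tackle})
</pre>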
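Result: a one-row DataFrame with the columns Player, Shooting, Passing and Tackle (Messi, 9, 9, 4 for the HTML above).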
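If the page lists several players, collecting one record per PlayerList block may be more robust than building four separate lists, because each row stays aligned even when a player is missing a stat. The following is only a sketch under assumptions: the first li mirrors the sample markup, while the second li ("Player B" and its numbers) is invented purely to show the multi-row case, and the column names are taken from the data-stat attribute.

```python
import pandas as pd
from bs4 import BeautifulSoup

# First <li> follows the sample in the question; the second <li> is a
# hypothetical player added only to illustrate more than one row.
html = """
<ul class="List">
<li><div class="PlayerList">
<div class="HeaderList"><span class="player">Messi</span></div>
<div class="PlayerStat"><span class="stat">Shooting <span class="allStatContainer statShooting" data-stat="Shooting"> 9 </span></span></div>
<div class="PlayerStat"><span class="stat">Passing <span class="allStatContainer statPassing" data-stat="Passing"> 9 </span></span></div>
<div class="PlayerStat"><span class="stat">Tackle <span class="allStatContainer statTackle" data-stat="Tackle"> 4 </span></span></div>
</div></li>
<li><div class="PlayerList">
<div class="HeaderList"><span class="player">Player B</span></div>
<div class="PlayerStat"><span class="stat">Shooting <span class="allStatContainer statShooting" data-stat="Shooting"> 5 </span></span></div>
<div class="PlayerStat"><span class="stat">Passing <span class="allStatContainer statPassing" data-stat="Passing"> 6 </span></span></div>
<div class="PlayerStat"><span class="stat">Tackle <span class="allStatContainer statTackle" data-stat="Tackle"> 7 </span></span></div>
</div></li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")

records = []
for block in soup.find_all("div", class_="PlayerList"):
    # Start each row with the player's name
    record = {"Player": block.find("span", class_="player").get_text(strip=True)}
    # A single class name matches any tag that has it among its classes
    for stat in block.find_all("span", class_="allStatContainer"):
        # data-stat holds the stat name; the tag text holds its value
        record[stat["data-stat"]] = int(stat.get_text(strip=True))
    records.append(record)

# One dict per player becomes one row per player
df = pd.DataFrame(records)
print(df)
```

Reading the column name from data-stat means a new stat on the page would show up as a new column without any change to the code.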