Error when using .find with BeautifulSoup in Python



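I know this is probably a simple question, but I really need help.

I am trying to extract the total rebounds per game from this soup object.

I tried the following code, but it raises an error: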

import urllib.request
from bs4 import BeautifulSoup
import csv
url = "https://www.basketball-reference.com/players/a/abdulza01.html"
request = urllib.request.Request(url) # create request object
response = urllib.request.urlopen(request)
html = response.read().decode('unicode_escape') # convert to unicode format
soup = BeautifulSoup(html, "html.parser")
table = soup.find('table', attrs={'id': 'per_game'})
results = table.find_all('tr')
for result in results[1:len(results)]:
    data = result.find_all('td')
    data.find(attrs={'data-stat': 'trb_per_g'}).getText()  # this line raises the error quoted below

For one row, data looks like this:

data = [<td class="center iz" data-stat="age"></td>,
<td class="left " data-stat="team_id"><a href="/teams/BOS/">BOS</a></td>,
<td class="left " data-stat="lg_id">NBA</td>,
<td class="center iz" data-stat="pos"></td>,
<td class="right " data-stat="g">2</td>,
<td class="right incomplete iz" data-stat="gs"></td>,
<td class="right " data-stat="mp_per_g">12.0</td>,
<td class="right " data-stat="fg_per_g">1.5</td>,
<td class="right " data-stat="fga_per_g">6.5</td>,
<td class="right " data-stat="fg_pct">.231</td>,
<td class="right " data-stat="ft_per_g">1.0</td>,
<td class="right " data-stat="fta_per_g">1.5</td>,
<td class="right " data-stat="ft_pct">.667</td>,
<td class="right " data-stat="orb_per_g">3.0</td>,
<td class="right " data-stat="drb_per_g">4.5</td>,
<td class="right " data-stat="trb_per_g">**7.5**</td>,
<td class="right " data-stat="ast_per_g">1.5</td>,
<td class="right " data-stat="stl_per_g">0.5</td>,
<td class="right " data-stat="blk_per_g">0.5</td>,
<td class="right " data-stat="tov_per_g">1.5</td>,
<td class="right " data-stat="pf_per_g">2.0</td>,
<td class="right " data-stat="pts_per_g">4.0</td>]

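Error message: AttributeError: ResultSet object has no attribute 'find'. You're probably treating a list of items like a single item. Did you call find_all() when you meant to call find()?

Is there something conceptually wrong with the code?

I think this answers your question: Beautiful Soup: 'ResultSet' object has no attribute 'find_all'?

A ResultSet object has no attribute 'find'. find_all returns a list-like ResultSet of tags, so what you can do is access each element and call find on that element to locate what you need.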
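To make the distinction concrete, here is a minimal sketch that reproduces the error on a tiny, hypothetical two-cell row (modelled on the cells in the question, not fetched from the site) and then reads the value by going through the elements one at a time. The variable names are illustrative only.

```python
from bs4 import BeautifulSoup

# Hypothetical two-cell row, modelled on the cells shown in the question.
html = """
<table><tr>
<td data-stat="drb_per_g">4.5</td>
<td data-stat="trb_per_g">7.5</td>
</tr></table>
"""
soup = BeautifulSoup(html, "html.parser")

cells = soup.find_all("td")  # a ResultSet (list-like), not a single Tag

try:
    cells.find(attrs={"data-stat": "trb_per_g"})
    raised = False
except AttributeError:
    # Calling .find on the ResultSet itself is what fails in the question.
    raised = True

# Each element of the ResultSet is a Tag, so inspect them one by one.
values = [cell.get_text() for cell in cells
          if cell.get("data-stat") == "trb_per_g"]
print(raised, values)
```

The same reasoning applies to any ResultSet: index into it or loop over it first, then call Tag methods on the individual elements.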

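When searching by attribute value, it is best to give the tag name as well. Try the code below. If you are looking for a single element, use find. If you are looking for several elements, use find_all and then loop over the results. Hope this helps.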

from bs4 import BeautifulSoup
html="""<html><td class="center iz" data-stat="age"></td>,
<td class="left " data-stat="team_id"><a href="/teams/BOS/">BOS</a></td>,
<td class="left " data-stat="lg_id">NBA</td>,
<td class="center iz" data-stat="pos"></td>,
<td class="right " data-stat="g">2</td>,
<td class="right incomplete iz" data-stat="gs"></td>,
<td class="right " data-stat="mp_per_g">12.0</td>,
<td class="right " data-stat="fg_per_g">1.5</td>,
<td class="right " data-stat="fga_per_g">6.5</td>,
<td class="right " data-stat="fg_pct">.231</td>,
<td class="right " data-stat="ft_per_g">1.0</td>,
<td class="right " data-stat="fta_per_g">1.5</td>,
<td class="right " data-stat="ft_pct">.667</td>,
<td class="right " data-stat="orb_per_g">3.0</td>,
<td class="right " data-stat="drb_per_g">4.5</td>,
<td class="right " data-stat="trb_per_g">7.5</td>,
<td class="right " data-stat="ast_per_g">1.5</td>,
<td class="right " data-stat="stl_per_g">0.5</td>,
<td class="right " data-stat="blk_per_g">0.5</td>,
<td class="right " data-stat="tov_per_g">1.5</td>,
<td class="right " data-stat="pf_per_g">2.0</td>,
<td class="right " data-stat="pts_per_g">4.0</td></html>"""
soup = BeautifulSoup(html,'html.parser')
findtag=soup.find('td',attrs={"data-stat" : "trb_per_g" })
print(findtag.text)

To search for multiple items, try this.

findtags=soup.find_all('td',attrs={"data-stat" : "trb_per_g" })
for tag in findtags:
    print(tag.text)
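Applied to the loop in the question, the same idea might look like the sketch below: call find on each row (a Tag) instead of on the result of find_all. The table here is a hypothetical, trimmed-down stand-in for the page's per_game table with made-up rows, so the sketch runs without a network request. The None check is an assumption about the real page, where some rows may not contain the cell.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the per_game table; season labels and the
# second value are made up for illustration.
html = """
<table id="per_game">
<tr><th data-stat="season">Season</th><th data-stat="trb_per_g">TRB</th></tr>
<tr><th data-stat="season">Season A</th><td data-stat="trb_per_g">7.5</td></tr>
<tr><th data-stat="season">Season B</th><td data-stat="trb_per_g">4.3</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", attrs={"id": "per_game"})

rebounds = []
for row in table.find_all("tr")[1:]:  # skip the header row
    # row is a single Tag, so .find is available here.
    cell = row.find("td", attrs={"data-stat": "trb_per_g"})
    if cell is not None:  # guard against rows without this cell
        rebounds.append(cell.get_text())

print(rebounds)
```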

I think it would be faster to use a CSS selector that combines the table id with an attribute = value test to target the td cells of interest:

import requests
from bs4 import BeautifulSoup as bs
import pandas as pd
url = "https://www.basketball-reference.com/players/a/abdulza01.html" 
soup = bs(requests.get(url).content, 'lxml')
data = [item.text for item in soup.select('#per_game [data-stat=trb_per_g]')]
df = pd.DataFrame(data)
df.rename(columns=df.iloc[0], inplace = True)
df.drop(df.index[0], inplace = True)
print(df)
df.to_csv(r'C:\Users\Users\Desktop\Data.csv', sep=',', encoding='utf-8',index = False )
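The selector can be tried offline with a small sketch like the one below. The HTML is a hypothetical stand-in with made-up values, and html.parser is used instead of lxml only to avoid an extra dependency. It also shows why the answer's code treats the first item as a column name: without a td qualifier, the selector matches the header cell too.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one per_game table plus an unrelated table that
# reuses the same data-stat value.
html = """
<table id="per_game">
<tr><th data-stat="trb_per_g">TRB</th></tr>
<tr><td data-stat="trb_per_g">7.5</td></tr>
<tr><td data-stat="trb_per_g">4.3</td></tr>
</table>
<table id="other">
<tr><td data-stat="trb_per_g">99</td></tr>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# '#per_game' scopes the search to that table; the attribute selector then
# keeps only td cells whose data-stat equals trb_per_g.
values = [cell.get_text()
          for cell in soup.select('#per_game td[data-stat="trb_per_g"]')]

# Without the td qualifier the header th is matched as well.
with_header = [el.get_text()
               for el in soup.select('#per_game [data-stat="trb_per_g"]')]

print(values)
print(with_header)
```

Whether to keep the header depends on what happens next: the pandas code above uses it as the column name, while a plain list of numbers is easier to build with the td-qualified selector.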
