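I am trying to build a website that shows games and their schedules. At first I was able to import all the relevant data into my program, but that changed once the games started. The site removes the "Time" column from its display, so my program now imports one column fewer than before. This breaks things: when I try to build a DataFrame from the collected information, it fails because the rows no longer have the same number of entries. I only want to import the games that have not been played yet.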

    import requests
    from bs4 import BeautifulSoup

    link = "https://www.espn.com/nfl/schedule/_/week/1/year/2022/seasontype/3"
    page = requests.get(link)
    soup = BeautifulSoup(page.content, "html.parser")
    nfl_resp = soup.find_all('div', class_='ResponsiveTable')

    nfl_list = []
    nfl_time_list = []
    nfl_location_list = []
    visit_list = []

    for i in nfl_resp:
        location = i.find_all(class_='location__col Table__TD')
        for team in location:
            nfl_location_list.append(team.text)
    # I get all the correct stadiums (and, in the same way, all the dates)

    for i in nfl_resp:
        times = i.find_all(class_='date__col Table__TD')
        for hour in times:
            nfl_time_list.append(hour.text)
    # I get all the correct times

    for i in nfl_resp:
        visit = i.find_all(class_="events__col Table__TD")
        for team in visit:
            visit_list.append(team.text)
    # Here's the problem: I get all the games regardless of whether they have started or not.
    # It only works if the games are yet to start; I need to run it while games are in progress or over, too.

You can use the following example to parse the various pieces of information from the ESPN site:

    import pandas as pd
    import requests
    from bs4 import BeautifulSoup

    url = "https://www.espn.com/nfl/schedule/_/week/1/year/2022/seasontype/3"
    soup = BeautifulSoup(requests.get(url).content, "html.parser")

    all_data = []
    # One entry per table row that contains at least one link
    for row in soup.select(".Table__TR:has(.AnchorLink)"):
        # Link texts, skipping links that contain an image
        data = [t.text for t in row.select(".AnchorLink:not(:has(img))")]
        # A network is either an image (use its alt text) or plain text
        networks = [
            n["alt"] if n.name == "img" else n.text
            for n in row.select(".network-container img, .network-container .network-name")
        ]
        # The date is the nearest table title that precedes this row
        date = row.find_previous(class_="Table__Title").text.strip()
        all_data.append([*data, networks, date])

    df = pd.DataFrame(
        all_data,
        columns=["Team 1", "Team 2", "Time", "Tickets", "Stadium", "Networks", "Date"],
    )
    print(df)

Prints:

       Team 1       Team 2         Time     Tickets                 Stadium                             Networks            Date
    0  Seattle      San Francisco  4:30 PM  Tickets as low as $138  Levi's Stadium, Santa Clara, CA     [FOX]               Saturday, January 14, 2023
    1  Los Angeles  Jacksonville   8:15 PM  Tickets as low as $138  TIAA Bank Field, Jacksonville, FL   [NBC]               Saturday, January 14, 2023
    2  Miami        Buffalo        1:00 PM  Tickets as low as $114  Highmark Stadium, Orchard Park, NY  [CBS]               Sunday, January 15, 2023
    3  New York     Minnesota      4:30 PM  Tickets as low as $116  U.S. Bank Stadium, Minneapolis, MN  [FOX]               Sunday, January 15, 2023
    4  Baltimore    Cincinnati     8:15 PM  Tickets as low as $171  Paycor Stadium, Cincinnati, OH      [NBC]               Sunday, January 15, 2023
    5  Dallas       Tampa Bay      8:15 PM  Tickets as low as $163  Raymond James Stadium, Tampa, FL    [ESPN, ABC, ESPN+]  Monday, January 16, 2023
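
If you only want the games that have not started, one option is to read the page row by row and skip any row that has no time cell. The sketch below is a minimal illustration of that idea, not a tested scraper for the live page: `SAMPLE_HTML` is hypothetical, simplified markup, `upcoming_games` is a made-up helper name, and it assumes that a started or finished game simply has no `date__col` cell in its row. The class names `events__col`, `date__col`, `location__col` and `Table__TR` are the ones used in the question and in the code above.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup (an assumption, not the real ESPN page):
# the first row still has a time cell, the second row does not.
SAMPLE_HTML = """
<table>
  <tr class="Table__TR">
    <td class="events__col Table__TD">Seattle</td>
    <td class="date__col Table__TD">4:30 PM</td>
    <td class="location__col Table__TD">Levi's Stadium, Santa Clara, CA</td>
  </tr>
  <tr class="Table__TR">
    <td class="events__col Table__TD">Miami</td>
    <td class="Table__TD">(result placeholder)</td>
    <td class="location__col Table__TD">Highmark Stadium, Orchard Park, NY</td>
  </tr>
</table>
"""


def upcoming_games(html):
    """Return one dict per table row that still has a time cell."""
    soup = BeautifulSoup(html, "html.parser")
    games = []
    for row in soup.select(".Table__TR"):
        time_cell = row.select_one(".date__col.Table__TD")
        if time_cell is None:
            # Assumption: no time cell means the game has started or is over.
            continue
        team_cell = row.select_one(".events__col.Table__TD")
        location_cell = row.select_one(".location__col.Table__TD")
        games.append(
            {
                "Team": team_cell.get_text(strip=True) if team_cell else None,
                "Time": time_cell.get_text(strip=True),
                "Stadium": location_cell.get_text(strip=True) if location_cell else None,
            }
        )
    return games


games = upcoming_games(SAMPLE_HTML)
print(games)
```

Because every dict has the same keys, the result can be passed straight to `pd.DataFrame(games)` without running into rows of different lengths. Collecting the values per row, rather than in separate lists per column, is what keeps a team, its time and its stadium aligned when some rows are skipped.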