完全是新手,试图破解一个项目。
目标:为大学足球比赛创建一个由各种体育博彩给出的赔率数据表
问题:在调用API并转换为数据帧后,我发现概率列包含一个嵌套的字典列表。样本值:
[{'key': 'fanduel', 'title': 'FanDuel', 'last_update': '2022-12-09T06:41:42Z', 'markets':
[{'key': 'spreads', 'outcomes':
[{'name': 'Army Black Knights', 'price': -108, 'point': 2.5}, {'name': 'Navy Midshipmen', 'price': -112, 'point': -2.5}]}]},
{'key': 'williamhill_us', 'title': 'William Hill (US)', 'last_update': '2022-12-09T06:41:22Z', 'markets':
[{'key': 'spreads', 'outcomes':
[{'name': 'Army Black Knights', 'price': -110, 'point': 2.5}, {'name': 'Navy Midshipmen', 'price': -110, 'point': -2.5}]}]},
当我尝试使用json_normalize时,我得到了错误:&;字符串索引必须是整数&;
下面是我的代码:import requests
import pandas as pd
odds = requests.get(" https://api.the-odds-api.com/v4/sports/americanfootball_ncaaf/odds/?apiKey=___(user key)__®ions=us&markets=spreads,spreads&oddsFormat=american")
odds_df = pd.read_json(odds.text)
odds_df.head()
#excluding non-needed columns in this example:
| id | home_team | away_team | bookmakers
| 123 | Army | Navy | [{'key': 'fanduel', 'title': 'FanDuel', 'last_...
| 456 | UAB | Miami (OH)| [{'key': 'barstool', 'title': 'Barstool Sports...
df = pd.json_normalize(odds_df,'bookmakers',['id', 'home_team', 'away_team'])
TypeError: string index必须是整数
Expected outcome:
| id | home_team | away_team | sportsbook | last_update | odds_type | home_point
| 123 | Army | Navy | fanduel | 2022-12-09T06:41:42Z| spread | 2.5
| 123 | Army | Navy | williamhill_us| 2022-12-09T06:41:22Z| spread | 2.5
你可以试试这个吗:
odds_df=odds_df.explode('bookmakers').reset_index(drop=True) #explode bookmarks col
#normalize bookmakers column and join odds_df. Because we need id, home_team eg. title is your sportsbook column here. if it si wrong change to key.
df = odds_df.join(pd.json_normalize(odds_df['bookmakers'],record_path=['markets'],meta=['title','last_update'],meta_prefix='_')).drop('bookmakers',axis=1).explode('outcomes').reset_index(drop=True)
df =df.join(pd.json_normalize(df.pop('outcomes'))).drop_duplicates()
'''
| | id | home_team | away_team | key | _title | _last_update | name | price | point |
|---:|-----:|:------------|:------------|:--------|:------------------|:---------------------|:-------------------|--------:|--------:|
| 0 | 123 | Army | Navy | spreads | FanDuel | 2022-12-09T06:41:42Z | Army Black Knights | -108 | 2.5 |
| 1 | 123 | Army | Navy | spreads | FanDuel | 2022-12-09T06:41:42Z | Navy Midshipmen | -112 | -2.5 |
| 2 | 123 | Army | Navy | spreads | William Hill (US) | 2022-12-09T06:41:22Z | Army Black Knights | -110 | 2.5 |
| 3 | 123 | Army | Navy | spreads | William Hill (US) | 2022-12-09T06:41:22Z | Navy Midshipmen | -110 | -2.5 |
'''
注意:你的输出不清楚。这就是为什么我添加了完整的输出