ValueError: Length of values does not match length of index


import pandas as pd
dict1 = {'id_game': [112, 113, 114], 'game_name' : ['x','z','y'],'id_category':[1,2,3], 'id_players':[[588,589,590],[589],[588,589]]}
dict2 = {'id_player': [588, 589, 590],'player_name' : ['fff','aaa','ccc'] ,'indication':['mmm x ggg sdg y', 'uuu x fdb y kfnkjq z', 'fffre x']}
game_df = pd.DataFrame(dict1)
player_df = pd.DataFrame(dict2)

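This is a sample of my data. I am looking for a way to get a column containing categories_id in the second DataFrame (player_df), based on the relationship between game_df['id_players'] and player_df['id_player'], and between game_df['game_name'] and player_df['indication'].

In the following script I used the game_name and indication values: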

new_list = []
for i in range(len(game_df)):
    for j in range(len(player_df)):
        if game_df['game_name'][i] in player_df['indication'][j]:
            new_list.append(game_df['id_category'][i])
            print(new_list)

player_df['categories_id'] = new_list 

Error:

--> 747         raise ValueError(
748             "Length of values "
749             f"({len(data)}) "
ValueError: Length of values (6) does not match length of index (3)

Your code can be fixed by adding a break after print(new_list), at the same indentation level.

...
        if game_df['game_name'][i] in player_df['indication'][j]:
            new_list.append(game_df['id_category'][i])
            print(new_list)
            break
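As a self-contained check, the fixed loop can be run against the sample data from the question. This is only a sketch of the fix described above: with the break, the inner loop stops at the first player whose indication contains the game name, so new_list gets at most one entry per row of game_df. The assignment then succeeds here because game_df and player_df both happen to have three rows.

```python
import pandas as pd

game_df = pd.DataFrame({
    'id_game': [112, 113, 114],
    'game_name': ['x', 'z', 'y'],
    'id_category': [1, 2, 3],
    'id_players': [[588, 589, 590], [589], [588, 589]],
})
player_df = pd.DataFrame({
    'id_player': [588, 589, 590],
    'player_name': ['fff', 'aaa', 'ccc'],
    'indication': ['mmm x ggg sdg y', 'uuu x fdb y kfnkjq z', 'fffre x'],
})

new_list = []
for i in range(len(game_df)):
    for j in range(len(player_df)):
        if game_df['game_name'][i] in player_df['indication'][j]:
            new_list.append(game_df['id_category'][i])
            break  # stop at the first matching player for this game

print(len(new_list))  # one entry per game instead of one per match

# Lengths now agree, so the assignment no longer raises ValueError.
player_df['categories_id'] = new_list
```

Note that the values in new_list follow the row order of game_df, not player_df, so whether this column means what you intend depends on your data.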

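That said, iterating over DataFrames is strongly discouraged, because it is slow and quickly becomes unwieldy. The canonical way to solve this kind of problem is to merge the DataFrames on id_player(s). First, explode the ids in id_players into separate rows: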

>>> game_df = game_df.explode("id_players").rename(columns={"id_players": "id_player"})
>>> game_df
   id_game game_name  id_category id_player
0      112         x            1       588
0      112         x            1       589
0      112         x            1       590
1      113         z            2       589
2      114         y            3       588
2      114         y            3       589

so that you can then .merge game_df with player_df:

>>> df = game_df.merge(player_df, on="id_player")
>>> df
   id_game game_name  id_category id_player player_name            indication
0      112         x            1       588         fff       mmm x ggg sdg y
1      114         y            3       588         fff       mmm x ggg sdg y
2      112         x            1       589         aaa  uuu x fdb y kfnkjq z
3      113         z            2       589         aaa  uuu x fdb y kfnkjq z
4      114         y            3       589         aaa  uuu x fdb y kfnkjq z
5      112         x            1       590         ccc               fffre x

This makes the analysis fairly simple. For example, checking whether game_name appears in indication becomes

df.apply(lambda row: row.game_name in row.indication, axis=1)

although it returns True for all of these rows, so I'm not sure whether this is really what you want.
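If the goal is a categories_id column on player_df, the steps above could be combined roughly as follows. This is a sketch under an assumption the question does not confirm: that each player should receive the list of id_category values for games that list the player in id_players and whose game_name occurs in the player's indication. The names pairs, mask and cats are only illustrative.

```python
import pandas as pd

game_df = pd.DataFrame({
    'id_game': [112, 113, 114],
    'game_name': ['x', 'z', 'y'],
    'id_category': [1, 2, 3],
    'id_players': [[588, 589, 590], [589], [588, 589]],
})
player_df = pd.DataFrame({
    'id_player': [588, 589, 590],
    'player_name': ['fff', 'aaa', 'ccc'],
    'indication': ['mmm x ggg sdg y', 'uuu x fdb y kfnkjq z', 'fffre x'],
})

# One row per (game, player) pair. explode leaves an object column,
# so cast it back to int64 to match player_df['id_player'] before merging.
pairs = (
    game_df.explode('id_players')
    .rename(columns={'id_players': 'id_player'})
    .astype({'id_player': 'int64'})
    .merge(player_df, on='id_player')
)

# Keep only the pairs whose game_name occurs in the player's indication.
mask = pairs.apply(lambda row: row.game_name in row.indication, axis=1)

# Collect the matching category ids per player (assumed target layout).
cats = pairs[mask].groupby('id_player')['id_category'].agg(lambda s: sorted(s))

# Look up each player's list by id; players without a match would get NaN.
player_df['categories_id'] = player_df['id_player'].map(cats)
print(player_df[['id_player', 'categories_id']])
```

Because the result is keyed by id_player rather than by position, it does not depend on the two DataFrames having the same number of rows.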
