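Is it possible to group and sort within each groupby result in Python/Pandas?
I'm trying to find the top-ranked genre for each year in the data given below. I tried a lot with groupby and sort, but without success.
Input: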
release_year Genere Count
1997 Action 46
1997 Adventure 7
1997 Animation 2
1997 Children's 12
1997 Comedy 73
1997 Crime 22
1997 Documentary 6
1997 Drama 81
1997 Horror 6
1997 Mystery 5
1997 Romance 15
1997 Sci-Fi 1
1997 Thriller 9
1997 War 1
1998 Action 12
1998 Adventure 2
1998 Comedy 24
1998 Crime 6
1998 Documentary 2
1998 Drama 21
1998 Film-Noir 2
1998 Horror 3
1998 Romance 4
1998 Thriller 1
Expected Output:
release_year Genere Count
1997 Drama 81
1998 Comedy 24
Here is my solution:
# Total Count per (release_year, Genere) pair
df_res = df.groupby(['release_year', 'Genere'])['Count'].sum().reset_index()
indexs = []
for year in df_res['release_year'].unique():
    df_temp = df_res[df_res.release_year == year]
    # Keep every row whose Count equals that year's maximum (ties included)
    indexs += list(df_temp[df_temp['Count'] == df_temp['Count'].max()].index)
df_res = df_res.loc[indexs]
I don't use sort(); I use max() instead.
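If only one row per year is needed, the explicit loop can probably be replaced by `idxmax`. The following is a minimal sketch, not the poster's code: the small `df` built here is an assumed subset of the input data, and `top_idx` and `top` are names introduced for illustration.

```python
import pandas as pd

# Assumed sample: a few rows taken from the input table in the question
df = pd.DataFrame({
    "release_year": [1997, 1997, 1997, 1998, 1998, 1998],
    "Genere": ["Action", "Comedy", "Drama", "Action", "Comedy", "Drama"],
    "Count": [46, 73, 81, 12, 24, 21],
})

# For each year, idxmax returns the index label of the row with the largest Count
top_idx = df.groupby("release_year")["Count"].idxmax()

# Select those rows and renumber the index from 0
top = df.loc[top_idx].reset_index(drop=True)
print(top)
```

Note that `idxmax` keeps only the first row when several genres tie for the maximum, whereas the loop above keeps all tied rows.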
Try:
df = df.sort_values("Count", ascending=False)
df.groupby("release_year").first()
Output:
Genere Count
release_year
1997 Drama 81
1998 Comedy 24
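The output above has `release_year` as the index rather than as a column, unlike the expected output in the question. A possible variation, sketched here under the assumption that a regular column is wanted, passes `as_index=False` to `groupby`. The sample `df` is an assumed subset of the input and `result` is an illustrative name.

```python
import pandas as pd

# Assumed sample: a few rows taken from the input table in the question
df = pd.DataFrame({
    "release_year": [1997, 1997, 1997, 1998, 1998, 1998],
    "Genere": ["Action", "Comedy", "Drama", "Action", "Comedy", "Drama"],
    "Count": [46, 73, 81, 12, 24, 21],
})

# Sort so the largest Count comes first, then take the first row of each year;
# as_index=False keeps release_year as an ordinary column
result = (
    df.sort_values("Count", ascending=False)
      .groupby("release_year", as_index=False)
      .first()
)
print(result)
```

One caveat: `first()` returns the first non-null value of each column separately, so if the data contains missing values, `head(1)` on the grouped object may be the safer choice.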