I have the following code, which computes some aggregates after grouping the DataFrame df:
import pandas as pd

# Row count per (id, target) group
df_count = df.groupby(['id', 'target']).size().reset_index(name='counts')
# Per-group aggregates; the outer parentheses let the method chain span several lines
df_gp = (df.groupby(['id', 'target'])
           .apply(lambda x: pd.Series({'min_duration': min(x['duration']),
                                       'max_duration': max(x['duration']),
                                       'total_duration': sum(x['duration']),
                                       'all_status': list(x['status']),
                                       'last_status': list(x['status'])[-1],
                                       'all_src': list(x['src'])
                                       })).reset_index())
df_update = pd.merge(df_count, df_gp, on=['id', 'target'], how='left')
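The code works fine, but I would like to know whether I can compute the count directly inside df_gp, instead of creating a separate DataFrame and then merging. Thanks!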
Yes, you can. Add the group length as one more entry in the Series that the lambda returns:
# The outer parentheses let the method chain span several lines
df_gp = (df.groupby(['id', 'target'])
           .apply(lambda x: pd.Series({'min_duration': min(x['duration']),
                                       'max_duration': max(x['duration']),
                                       'total_duration': sum(x['duration']),
                                       'all_status': list(x['status']),
                                       'last_status': list(x['status'])[-1],
                                       'all_src': list(x['src']),
                                       'count': len(x['src'])  # number of rows in the group
                                       })).reset_index())
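
As an alternative to the apply approach above, the same result can probably be obtained with named aggregation in groupby(...).agg, which avoids both the merge and the per-group lambda. This is only a sketch: the sample DataFrame is made up for illustration, and the column name counts is borrowed from the question. Note that 'last' returns the last non-null value in each group, so it can differ from list(x['status'])[-1] when status contains missing values.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'id': [1, 1, 1, 2],
    'target': ['a', 'a', 'b', 'a'],
    'duration': [5, 3, 7, 2],
    'status': ['ok', 'fail', 'ok', 'ok'],
    'src': ['x', 'y', 'z', 'x'],
})

# Named aggregation: each keyword is an output column,
# each value is a (source column, aggregation) pair.
df_update = (
    df.groupby(['id', 'target'])
      .agg(
          counts=('src', 'size'),            # rows per group, NaN included
          min_duration=('duration', 'min'),
          max_duration=('duration', 'max'),
          total_duration=('duration', 'sum'),
          all_status=('status', list),       # collect the values into a list
          last_status=('status', 'last'),    # last non-null value in the group
          all_src=('src', list),
      )
      .reset_index()
)

print(df_update)
```

If you prefer to keep the apply version, adding 'count': len(x) to the returned Series is enough; the agg form is mainly worth considering when the number of groups is large.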