pandas: include the aggregate function "count" in a single .apply(lambda x: ...)



I have the following code, which computes some aggregates after grouping the DataFrame df:

# Row count per (id, target) group, computed separately
df_count = df.groupby(['id', 'target']).size().reset_index(name='counts')
# Parentheses let the method chain span several lines
df_gp = (df.groupby(['id', 'target'])
           .apply(lambda x: pd.Series({'min_duration': min(x['duration']),
                                       'max_duration': max(x['duration']),
                                       'total_duration': sum(x['duration']),
                                       'all_status': list(x['status']),
                                       'last_status': list(x['status'])[-1],
                                       'all_src': list(x['src'])
                                      })).reset_index())
df_update = pd.merge(df_count, df_gp, on=['id', 'target'], how='left')

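The code works fine, but I would like to know whether I can put the count directly into df_gp instead of building a separate DataFrame and then merging. Thanks!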

Yes, you can. Add a len(...) entry to the dict built inside the lambda:

# Parentheses let the method chain span several lines
df_gp = (df.groupby(['id', 'target'])
           .apply(lambda x: pd.Series({'min_duration': min(x['duration']),
                                       'max_duration': max(x['duration']),
                                       'total_duration': sum(x['duration']),
                                       'all_status': list(x['status']),
                                       'last_status': list(x['status'])[-1],
                                       'all_src': list(x['src']),
                                       'count': len(x['src'])  # adding len here: rows in the group
                                      })).reset_index())
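
As a possible alternative, here is a minimal sketch that builds the same columns with named aggregation in .agg instead of .apply. It assumes pandas 0.25 or later, and the sample DataFrame is made up purely for illustration; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'id': [1, 1, 2],
    'target': ['a', 'a', 'b'],
    'duration': [5, 3, 7],
    'status': ['ok', 'fail', 'ok'],
    'src': ['x', 'y', 'z'],
})

df_gp = (
    df.groupby(['id', 'target'])
      .agg(
          min_duration=('duration', 'min'),
          max_duration=('duration', 'max'),
          total_duration=('duration', 'sum'),
          all_status=('status', list),
          # Note: 'last' skips missing values, unlike list(...)[-1]
          last_status=('status', 'last'),
          all_src=('src', list),
          # 'size' counts every row in the group, including rows with NaN
          count=('src', 'size'),
      )
      .reset_index()
)
print(df_gp)
```

Here 'size' plays the role of len(x['src']) in the answer above. Using 'count' instead would count only the non-missing values of src, which may or may not be what you want.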
