How to group by one column and sum a third column's values with pandas, only where a condition on another column is true



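I can't figure out how to do this. As the title says, I want to group a DataFrame by the column acquired_month, but only for rows where another column contains Closed Won. (In the example I created a helper column that is set to True when that condition is met, although I'm not sure that step is necessary.) For the rows that meet the condition, I want to sum the values of a third column, but I don't know how. Here is my code so far: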

us_lead_scoring.loc[us_lead_scoring['Stage'].str.contains('Closed Won'), 'closed_won_binary'] = True
acquired_date = us_lead_scoring.groupby('acquired_month')['closed_won_binary'].sum()

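But this only sums the True/False column itself. What I want is the sum of the value column, per acquired_month group, for the rows where the True/False column is True. Any pointers are welcome.

Thanks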

If you need to aggregate the column col, use Series.where to replace the values that do not match the condition with 0, then aggregate with sum:

import pandas as pd

# Sample data
us_lead_scoring = pd.DataFrame({'Stage': ['Closed Won1', 'Closed Won2', 'Closed', 'Won'],
                                'col': [1, 3, 5, 6],
                                'acquired_month': [1, 1, 1, 2]})

# Set col to 0 where Stage does not contain 'Closed Won', then sum per month
out = (us_lead_scoring['col'].where(us_lead_scoring['Stage']
                                    .str.contains('Closed Won'), 0)
                             .groupby(us_lead_scoring['acquired_month'])
                             .sum()
                             .reset_index(name='SUM'))

print(out)
   acquired_month  SUM
0               1    4
1               2    0
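
For comparison, here is a minimal sketch of another common approach: filter the rows with a boolean mask first, then group and sum. It reuses the same sample data as above. The names mask and filtered are only illustrative, and na=False is an assumption that missing Stage values should count as non-matching.

```python
import pandas as pd

# Same sample data as in the answer above
us_lead_scoring = pd.DataFrame({
    'Stage': ['Closed Won1', 'Closed Won2', 'Closed', 'Won'],
    'col': [1, 3, 5, 6],
    'acquired_month': [1, 1, 1, 2],
})

# Boolean mask: True where Stage contains 'Closed Won'.
# na=False treats missing Stage values as non-matching (assumption).
mask = us_lead_scoring['Stage'].str.contains('Closed Won', na=False)

# Keep only the matching rows, then group by month and sum col
filtered = (us_lead_scoring[mask]
            .groupby('acquired_month')['col']
            .sum()
            .reset_index(name='SUM'))

print(filtered)
```

The difference is that months with no matching rows (month 2 in the sample) are dropped from the result here, while the Series.where version keeps them with a sum of 0. Which one is preferable depends on whether empty months should appear in the output.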
