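I'm working with a dataset of referral information and the total amount paid for each promotion being run. I can get the total number of purchases in each promotion and the total amount paid per promotion, but I can't divide the two aggregated columns to get the average amount paid per new purchase. Here is a subset of the data: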
bucket receiver_quote total_paid
0 168hr 0 NaN
1 168hr 0 NaN
2 168hr 1 100.0
3 168hr 1 50.0
4 168hr 1 100.0
5 48hr 1 75.0
6 48hr 0 NaN
7 48hr 0 NaN
8 48hr 0 NaN
9 48hr 0 NaN
Right now, I can get this table:
df.groupby('bucket').agg({'receiver_policy': 'sum', 'total_paid': 'sum'})
Result:
bucket | receiver_policy | total_paid
---|---|---
0hr | 45 | 11375.0
168hr | 27 | 6725.0
48hr | 31 | 7200.0
 | 31 | 4200.0
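- Your sample data and sample code don't use the same column names (receiver_quote vs. receiver_policy). I have adapted the code to the sample data.
- Simply use groupby().apply() and return a Series holding all the calculations you want for each group.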
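import io

import pandas as pd

# The header has three names but each row has four fields,
# so pandas uses the first field as the index.
df = pd.read_csv(
    io.StringIO(
        """bucket receiver_quote total_paid
0 168hr 0 NaN
1 168hr 0 NaN
2 168hr 1 100.0
3 168hr 1 50.0
4 168hr 1 100.0
5 48hr 1 75.0
6 48hr 0 NaN
7 48hr 0 NaN
8 48hr 0 NaN
9 48hr 0 NaN"""
    ),
    sep=r"\s+",  # split on any run of whitespace
)

# Plain per-bucket sums, as in the question
df.groupby("bucket").agg({"receiver_quote": "sum", "total_paid": "sum"})

# One Series per bucket holding both sums and their ratio
df.groupby("bucket").apply(
    lambda d: pd.Series(
        {
            "receiver_quote": d["receiver_quote"].sum(),
            "total_paid": d["total_paid"].sum(),
            "avg_paid_per_policy": d["total_paid"].sum() / d["receiver_quote"].sum(),
        }
    )
)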
bucket | receiver_quote | total_paid | avg_paid_per_policy
---|---|---|---
168hr | 3 | 250 | 83.3333
48hr | 1 | 75 | 75
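As a possible alternative to groupby().apply(), the two sums can be aggregated first and then divided column by column. This is only a sketch under the same sample data; the names raw and summary are mine, not from the question, and the avg_paid_per_policy column name is reused from the answer above.

```python
import io

import pandas as pd

# Same sample rows as above (assumed layout: unnamed first field becomes the index).
raw = """bucket receiver_quote total_paid
0 168hr 0 NaN
1 168hr 0 NaN
2 168hr 1 100.0
3 168hr 1 50.0
4 168hr 1 100.0
5 48hr 1 75.0
6 48hr 0 NaN
7 48hr 0 NaN
8 48hr 0 NaN
9 48hr 0 NaN"""
df = pd.read_csv(io.StringIO(raw), sep=r"\s+")

# Aggregate once per bucket, then divide the two summed columns element-wise.
summary = (
    df.groupby("bucket")
    .agg({"receiver_quote": "sum", "total_paid": "sum"})
    .assign(avg_paid_per_policy=lambda t: t["total_paid"] / t["receiver_quote"])
)
print(summary)
```

Because the division runs on whole columns after aggregation, this avoids calling a Python lambda once per group. Note that a bucket with zero quotes would divide by zero and show up as NaN or inf rather than raising an error, so such rows may need separate handling.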