How to force a groupby/sum in a Pandas DataFrame when the values in one column don't match

Suppose I have a DataFrame "df":

data = {"first": ["jack", "jack", "zeke", "zeke", "zeke"], "last": ["sparr", "sparr", "smith", "smith", "smith"], "method": ["cell", "cell", "skype", "cell", "cell"], "duration": [5, 5, 3, 1, 1]}

first     last     method    duration
jack    sparr       cell           5
jack    sparr       cell           5
zeke    smith       skype          3
zeke    smith       cell           1
zeke    smith       cell           1

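I want to sum the call duration by first, last, and method. When "method" doesn't match within a first/last pair, I want the rows to be "force summed" anyway, with the method value left blank.

So far, running something like:

df = df.groupby(['first', 'last', 'method'], as_index=False).sum()

returns: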

first     last     method    duration
jack    sparr       cell          10
zeke    smith       skype          3
zeke    smith       cell           2

But I would prefer:

first     last     method    duration
jack    sparr       cell          10
zeke    smith                      5

How can I modify my sum statement to achieve this, or is this even possible with pandas? Thanks

Try something like this:

(
    df.groupby(['first', 'last'], as_index=False)[['method', 'duration']]
      .agg({'method': lambda x: x.iloc[0] if x.nunique() == 1 else '',
            'duration': 'sum'})
)

Output:

first   last method  duration
0  jack  sparr   cell        10
1  zeke  smith                5

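OK, let's group only by first and last name, since that is the level your desired sums are at. For method, we aggregate down to a single value: if all values in the method column are the same within a group (nunique == 1), use that value; otherwise use a blank string.

We use a dictionary to define the aggregation for each column.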
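For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The data is rebuilt from the table in the question, and the helper name `single_value_or_blank` is just a hypothetical name used here in place of the lambda; it is not part of pandas.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "first": ["jack", "jack", "zeke", "zeke", "zeke"],
    "last": ["sparr", "sparr", "smith", "smith", "smith"],
    "method": ["cell", "cell", "skype", "cell", "cell"],
    "duration": [5, 5, 3, 1, 1],
})


def single_value_or_blank(values):
    """Hypothetical helper: return the group's value if it is unique, else ''."""
    return values.iloc[0] if values.nunique() == 1 else ""


# Group by name only; the dictionary maps each remaining column to its aggregation.
result = (
    df.groupby(["first", "last"], as_index=False)
      .agg({"method": single_value_or_blank, "duration": "sum"})
)

print(result)
```

One thing to keep in mind: `nunique()` ignores missing values by default, so a group whose methods are "cell" and NaN would still count as having a single method. If that matters for your data, `values.nunique(dropna=False)` is one way to treat NaN as its own value.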
