如果两个单元格相同但顺序不同,则对两行求和



类似于下面

Buyer Seller Amount
John  Mary   3
Mary  John   2
David Bosco  2

我想将约翰和玛丽行相加为一的地方

预计会出来

Trade1 Trade2 Amount
John   Mary   5
David  Bosco  2

我的数据帧大约有 6000 行。谢谢你的帮助

首先按numpy.sort对值进行排序,按DataFrame.duplicated创建布尔掩码,然后聚合sum

df[['Buyer','Seller']] = pd.DataFrame(np.sort(df[['Buyer','Seller']], axis=1))
df2 = df.groupby(['Buyer','Seller'], as_index=False)['Amount'].sum()
df2.columns = ['Trade1','Trade2','Amount']
print (df2)
Trade1 Trade2  Amount
0  Bosco  David       2
1   John   Mary       5

如果不想修改原始列,请使用语法糖 -groupbySeries

df1 = pd.DataFrame(np.sort(df[['Buyer','Seller']], axis=1))
df1.columns = ['Trade1','Trade2']
df2 = df['Amount'].groupby([df1['Trade1'],df1['Trade2']]).sum().reset_index()
print (df2)
Trade1 Trade2  Amount
0  Bosco  David       2
1   John   Mary       5

最新更新