蟒蛇熊猫.基于另一列的逗号分隔值中字符串列的匹配的列的总和



我有两个数据帧样本,名称已更改:

df1=

Comp_code 部门列表A 部门清单B
Code_1 ";Dept1"> ";Dept3">
Code_2 ";Dept2"> ";Dept4">
Code_3 ";Dept4、Dept5"> ";Dept1">
Code_4 ";Dept1、Dept5、Dept6"> ";Dept3、Dept4">

此代码有效。它很长,但大部分都是重复的。

new_df = df1[['Comp_code']].copy()
new_df['GrossRev'] = df1['DepartmentListB'].str.split(',').explode().map(df2.set_index('DepartmentList')['Revenue']).groupby(level=0).sum() + df1['DepartmentListA'].str.split(',').explode().map(df2.set_index('DepartmentList')['Revenue']).groupby(level=0).sum()
new_df['Tot Margin'] = df1['DepartmentListB'].str.split(',').explode().map(df2.set_index('DepartmentList')['GrossMargin']).groupby(level=0).sum() + df1['DepartmentListA'].str.split(',').explode().map(df2.set_index('DepartmentList')['GrossMargin']).groupby(level=0).sum()

输出:

>>> new_df
Comp_code  GrossRev  Tot Margin
0    Code_1      6000        1400
1    Code_2      7000         200
2    Code_3     13000         600
3    Code_4     26000        4000

最新更新