我有两个数据帧样本,名称已更改:
df1
=
Comp_code | 部门列表A | 部门清单B |
---|---|---|
Code_1 | ";Dept1"> | ";Dept3"> |
Code_2 | ";Dept2"> | ";Dept4"> |
Code_3 | ";Dept4、Dept5"> | ";Dept1"> |
Code_4 | ";Dept1、Dept5、Dept6"> | ";Dept3、Dept4"> |
此代码有效。它很长,但大部分都是重复的。
new_df = df1[['Comp_code']].copy()
new_df['GrossRev'] = df1['DepartmentListB'].str.split(',').explode().map(df2.set_index('DepartmentList')['Revenue']).groupby(level=0).sum() + df1['DepartmentListA'].str.split(',').explode().map(df2.set_index('DepartmentList')['Revenue']).groupby(level=0).sum()
new_df['Tot Margin'] = df1['DepartmentListB'].str.split(',').explode().map(df2.set_index('DepartmentList')['GrossMargin']).groupby(level=0).sum() + df1['DepartmentListA'].str.split(',').explode().map(df2.set_index('DepartmentList')['GrossMargin']).groupby(level=0).sum()
输出:
>>> new_df
Comp_code GrossRev Tot Margin
0 Code_1 6000 1400
1 Code_2 7000 200
2 Code_3 13000 600
3 Code_4 26000 4000