添加两个具有父列和多个子列的数据框架



我有两个数据框架。第一个数据框用于按地点和月份计算托运单的总成本。第二个是按地点和月份计算总成本(维护+燃料)。

我的代码如下:
def test():
start_date = '2022-01-01'
end_date = '2022-05-31'
###### First dataframe
# Get monthly delivery log cost
total_delivery_log_cost = monthly_delivery_log_cost_by_branch(start_date, end_date)
# Get monthly pickup cost
total_pickup_log_cost = monthly_pickup_log_cost_by_branch(start_date, end_date)
# Union total monthly cost (CN)
df = pd.concat([total_delivery_log_cost, total_pickup_log_cost])
# Pivot
df = df.pivot_table(index=['location'],columns =['report_month'], aggfunc = np.sum, fill_value=0)
df.columns = df.columns.to_flat_index().to_series().apply(lambda x: x[1])
df = df.reset_index()

df.to_csv('total_CN.csv')
###### Second dataframe
# Get monthly vehicle cost:
total_maintainence_cost = monthly_maintainence_cost_by_branch(start_date, end_date)
print(total_maintainence_cost)
# Get monthly refuel cost:
total_refuel_cost = monthly_refuel_cost_by_branch(start_date, end_date)
print(total_refuel_cost)
# Union total monthly maintainence cost
df = pd.concat([total_maintainence_cost, total_refuel_cost])
# Pivot
df = df.pivot_table(index=['location'],columns =['report_month'], aggfunc = np.sum, fill_value=0)
df.columns = df.columns.to_flat_index().to_series().apply(lambda x: x[1])
df = df.reset_index()
df.to_csv('total_cost.csv')
print(df)

print("Success")

我的第一个数据帧如下所示(总CN):

location  2022-01       2022-02   2022-03      2022-04    2022-05
ABC        22.00       24.00      60.20        55.30     66.43
XYZ        50.00       40.33      14.50        50.60     90.40
XXX        10.00       21.20      22.40        23.40     22.11 
...         ...        ...        ....          ....       .....

第二个数据框如下所示(Total Cost):

location    2022-01       2022-02   2022-03      2022-04    2022-05
ABC        30.00       33.00      5.20         65.30      12.43
XYZ        67.00       21.33      5.50         21.60      42.40
QWE        10.00       34.20      53.40        34.40      22.11 
...         ...        ...        ....          ....       .....

当我有这两个数据帧时,我想把它们组合成多索引,这是我想要实现的,所以最终的DF看起来像:

2022-01                  2022-02              .......
location          Total CN  Total Cost     Total CN    Total Cost      .......
ABC           22.00       30.00         24.00          33.00      
XYZ           50.00       67.00         40.33          21.33
XXX           10.00       0.00          21.20          0.00   
QWE           0.00        10.00          0.00          34.20               
....          ....           ....          ....           .....

我如何做到这一点?

您可以将数据框与键连接起来,以创建具有多索引的新数据框,并交换索引级别:

df_total_cn = pd.read_csv("total_cn.csv", index_col='location')
df_total_cost = pd.read_csv("total_cost.csv", index_col='location')
df = pd.concat([df_total_cn, df_total_cost], axis=1, keys=['total cn', 'total cost']).swaplevel(axis=1).sort_index('columns').fillna(0)

2022-01             2022-02             2022-03             2022-04             2022-05
total cn total cost total cn total cost total cn total cost total cn total cost total cn total cost
location
ABC          22.0       30.0    24.00      33.00     60.2        5.2     55.3       65.3    66.43      12.43
XYZ          50.0       67.0    40.33      21.33     14.5        5.5     50.6       21.6    90.40      42.40
XXX          10.0        0.0    21.20       0.00     22.4        0.0     23.4        0.0    22.11       0.00
QWE           0.0       10.0     0.00      34.20      0.0       53.4      0.0       34.4     0.00      22.11

当总列缺少时

df = pd.concat([df_total_cn, df_total_cost], axis=1, keys=['total cn', 'total cost']).swaplevel(axis=1)
for i in ['2022-01', '2022-02', '2022-03', '2022-04', '2022-05', ]:
for j in ['total cn', 'total cost']:
if j not in df[i].columns:
df.loc[:, (i, j)] = 0
df = df.sort_index('columns').fillna(0)

输出:

2022-01             2022-02             2022-03             2022-04             2022-05
total cn total cost total cn total cost total cn total cost total cn total cost total cn total cost
location
ABC          22.0          0    24.00      33.00     60.2        5.2     55.3       65.3    66.43      12.43
XYZ          50.0          0    40.33      21.33     14.5        5.5     50.6       21.6    90.40      42.40
XXX          10.0          0    21.20       0.00     22.4        0.0     23.4        0.0    22.11       0.00
QWE           0.0          0     0.00      34.20      0.0       53.4      0.0       34.4     0.00      22.11

相关内容

  • 没有找到相关文章

最新更新