使用两层排序合并DataFrame



我有两个DataFrame,一个拥有值(可以在每个轴上升序或降序排序(,另一个包含排序后的DataFrame的每个索引的Subtotals和Totals。

我想在其对应的每个索引的末尾(或开头(插入df_total,而不影响值的初始排序(图中的[True, False](。

import pandas, numpy
df_values = pandas.DataFrame({"Index1" : [1, 1, 2, 2], "Index2" : ["a", "b", "a", "c"], "Values" : [4, 5, 6, 8]}).set_index(["Index1", "Index2"])
df_totals = pandas.DataFrame({"Index1" : [1, 2, "Grand Total"], "Index2" : ["Total 1", "Total 2", "Grand Total"], "Values" : [9, 14, 23]}).set_index(["Index1", "Index2"])
df_sorted = df_values.sort_values(by=["Index1", "Index2"], ascending = [True, False])
df_to_print = pandas.concat([df_sorted_values,df_totals])
#output
=> df_values = 
Values
Index1 Index2        
1      b            5
a            4
2      c            8
a            6

=> df_totals = 
Values
Index1      Index2             
1           Total 1           9
2           Total2           14
Grand Total Grand Total      23

=> df_to_print = 
Values
Index1      Index2             
1           b                 5
a                 4
2           c                 8
a                 6
1           Total 1           9
2           Total2           14
Grand Total Grand Total      23

所需输出为:用于上行完成

=> df_to_print = 
Values
Index1      Index2             
1           b                 5
a                 4
Total 1           9
2           c                 8
a                 6
Total2           14
Grand Total Grand Total      23

下行完成

=> df_to_print = 
Values
Index1      Index2             
Grand Total Grand Total      23
1           Total 1           9
b                 5
a                 4
2           Total2           14
c                 8
a                 6

您可以通过添加新索引来实现这两个目标,如下所示:

# create new indices values
l3 = df_to_print.reset_index(level=0)['Index1'].isin((1, 2))
l4 = df_to_print['Values']
# set the indices
df_to_print = df_to_print.set_index([l3, l4], append=True)
res = df_to_print.sort_index(level=[2, 0], ascending=[False, True]).reset_index(level=[2, 3], drop=True)
print(res)

输出(升序完成(

Values
Index1      Index2             
1           b                 5
a                 4
Total 1           9
2           c                 8
a                 6
Total 2          14
Grand Total Grand Total      23

对于下降的,执行:

res = df_to_print.sort_index(level=[2, 0, 3], ascending=[True, True, False]).reset_index(level=[2, 3], drop=True)
print(res)

输出(下降完成(

Values
Index1      Index2             
Grand Total Grand Total      23
1           Total 1           9
b                 5
a                 4
2           Total 2          14
c                 8
a                 6

最新更新