Overwrite/move values in specific columns (pandas) from the bottom rows to the top rows



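I have a DataFrame like the one below and want to move the values of "phone", "spotify" and "rent" from the bottom half so that they overwrite the top half (basically split the DataFrame in two and put the "expense" values into the "income" half).

Currently, every month from January to December appears twice. I want it to have only 12 rows, with a value in every cell (i.e. no cell should have the value 0.0).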

                     loan      csn  salary  phone  spotify    rent
january   income   1200.0  13000.0  2000.0    0.0      0.0     0.0
february  income   1200.0  13000.0  2000.0    0.0      0.0     0.0
march     income   1200.0  13000.0  2000.0    0.0      0.0     0.0
april     income   1200.0  13000.0  2000.0    0.0      0.0     0.0
may       income   1200.0  13000.0  2000.0    0.0      0.0     0.0
june      income   1200.0  13000.0  2000.0    0.0      0.0     0.0
july      income   1200.0  13000.0  2000.0    0.0      0.0     0.0
august    income   1200.0  13000.0  2000.0    0.0      0.0     0.0
september income   1200.0  13000.0  2000.0    0.0      0.0     0.0
october   income   1200.0  13000.0  2000.0    0.0      0.0     0.0
november  income   1200.0  13000.0  2000.0    0.0      0.0     0.0
december  income   1200.0  13000.0  2000.0    0.0      0.0     0.0
january   expense     0.0      0.0     0.0  300.0     49.0  3500.0
february  expense     0.0      0.0     0.0  300.0    149.0  3500.0
march     expense     0.0      0.0     0.0  300.0     49.0  3500.0
april     expense     0.0      0.0     0.0  300.0     49.0  3500.0
may       expense     0.0      0.0     0.0  300.0     49.0  3500.0
june      expense     0.0      0.0     0.0  300.0     49.0  3500.0
july      expense     0.0      0.0     0.0  300.0     49.0  3500.0
august    expense     0.0      0.0     0.0  300.0     49.0  3500.0
september expense     0.0      0.0     0.0  300.0     49.0  3500.0
october   expense     0.0      0.0     0.0  300.0     49.0  3500.0
november  expense     0.0      0.0     0.0  300.0     49.0  3500.0
december  expense     0.0      0.0     0.0  300.0     49.0  3500.0

Loading the data from the .json file:

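import pandas as pd

# `data` is the dict parsed from the .json file.
# Keys become (month, category) tuples, giving a two-level MultiIndex.
df_all = pd.DataFrame.from_dict(
    {(i, j): data[i][j] for i in data for j in data[i]},
    orient='index',
)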

Structure of the .json file:

{
  "january": {
    "income": {
      "loan": 1200,
      "csn": 13000,
      "salary": 2000
    },
    "expense": {
      "phone": 300,
      "spotify": 49,
      "rent": 3500
    }
...
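
To make the shape of the problem easier to reproduce, here is a minimal sketch that builds the same kind of frame from a small in-memory dict. The two-month `data` dict is a made-up stand-in for the parsed .json file, not the real data. Note that cells missing from a row come out as NaN here, whereas the table above shows them as 0.0.

```python
import pandas as pd

# Hypothetical stand-in for the parsed .json file (two months only).
data = {
    "january": {
        "income": {"loan": 1200, "csn": 13000, "salary": 2000},
        "expense": {"phone": 300, "spotify": 49, "rent": 3500},
    },
    "february": {
        "income": {"loan": 1200, "csn": 13000, "salary": 2000},
        "expense": {"phone": 300, "spotify": 49, "rent": 3500},
    },
}

# Keys are (month, category) tuples, so the index becomes a two-level MultiIndex.
df_all = pd.DataFrame.from_dict(
    {
        (month, kind): values
        for month, by_kind in data.items()
        for kind, values in by_kind.items()
    },
    orient="index",
)

# Each month appears twice: one "income" row and one "expense" row.
print(df_all)
```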

Expected output:

                     loan      csn  salary  phone  spotify    rent
january   income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
february  income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
march     income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
april     income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
may       income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
june      income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
july      income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
august    income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
september income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
october   income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
november  income   1200.0  13000.0  2000.0  300.0     49.0  3500.0
december  income   1200.0  13000.0  2000.0  300.0     49.0  3500.0

Here is one way to do it:

df = df.rename(index={'expense':'income'}, level=1).fillna(0).groupby(level=[0,1]).sum()
df

Output:

            loan    csn  Salary  phone  spotify  rent
Apr income  1200  13000  2000.0    300       49  3500
Aug income  1200  13000  2000.0    300       49  3500
Dec income  1200  13000  2000.0    300       49  3500
Feb income  1200  13000  2000.0    300       49  3500
Jan income  1200  13000  2000.0    300       49  3500
Jul income  1200  13000  2000.0    300       49  3500
Jun income  1200  13000  2000.0    300       49  3500
Mar income  1200  13000  2000.0    300       49  3500
May income  1200  13000  2000.0    300       49  3500
Nov income  1200  13000  2000.0    300       49  3500
Oct income  1200  13000  2000.0    300       49  3500
Sep income  1200  13000  2000.0    300       49  3500
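
The months in this output come back in alphabetical order because `groupby` sorts the group keys by default. If the original order matters, passing `sort=False` is one option. The sketch below assumes the rows already arrive in calendar order and uses made-up sample data with only two columns; `sort=False` is an addition for illustration, not part of the one-liner above.

```python
import pandas as pd

nan = float("nan")
months = ["january", "february", "march"]

# Made-up sample in the same shape as the question: income rows, then expense rows.
rows = [(m, "income") for m in months] + [(m, "expense") for m in months]
df = pd.DataFrame(
    {
        "salary": [2000.0] * 3 + [nan] * 3,
        "rent": [nan] * 3 + [3500.0] * 3,
    },
    index=pd.MultiIndex.from_tuples(rows),
)

merged = (
    df.rename(index={"expense": "income"}, level=1)  # relabel level 1
      .fillna(0)                                     # NaN -> 0 so sum is safe
      .groupby(level=[0, 1], sort=False)             # keep first-seen order
      .sum()
)
print(merged)
```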

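Details:

Rename index level 1 so that "expense" becomes "income", then groupby both levels of the index. We could use first (which takes the first non-null value in each group), but I don't think that is future-proof and safe, so I chose to fillna with zero and sum instead.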
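
To compare the two variants mentioned above, here is a small sketch on a made-up one-month frame. It assumes the missing cells are NaN rather than 0.0, since `first` only skips null values. On data like this both variants give the same row; they would differ if a group ever held two real values in one column, where `sum` adds them and `first` keeps only the first.

```python
import pandas as pd

nan = float("nan")

# Made-up sample: one month, one income row and one expense row.
index = pd.MultiIndex.from_tuples(
    [("january", "income"), ("january", "expense")]
)
df = pd.DataFrame({"salary": [2000.0, nan], "rent": [nan, 3500.0]}, index=index)

# Relabel "expense" as "income" so both rows fall into the same group.
renamed = df.rename(index={"expense": "income"}, level=1)

# Variant used in the answer: replace NaN with 0, then add up each group.
by_sum = renamed.fillna(0).groupby(level=[0, 1]).sum()

# Alternative: keep NaN and take the first non-null value per column.
by_first = renamed.groupby(level=[0, 1]).first()

print(by_sum)
print(by_first)
```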
