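I have a dataframe like the one below and want to move the values of 'phone', 'spotify' and 'rent' from the bottom half up so that they overwrite the top half (basically splitting the dataframe in two and putting the 'expense' values into the 'income' half).
Currently every month from january to december appears twice. I want it to have only 12 rows, with a value in every cell (i.e. no cell holding 0.0).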
loan csn salary phone spotify rent
january income 1200.0 13000.0 2000.0 0.0 0.0 0.0
february income 1200.0 13000.0 2000.0 0.0 0.0 0.0
march income 1200.0 13000.0 2000.0 0.0 0.0 0.0
april income 1200.0 13000.0 2000.0 0.0 0.0 0.0
may income 1200.0 13000.0 2000.0 0.0 0.0 0.0
june income 1200.0 13000.0 2000.0 0.0 0.0 0.0
july income 1200.0 13000.0 2000.0 0.0 0.0 0.0
august income 1200.0 13000.0 2000.0 0.0 0.0 0.0
september income 1200.0 13000.0 2000.0 0.0 0.0 0.0
october income 1200.0 13000.0 2000.0 0.0 0.0 0.0
november income 1200.0 13000.0 2000.0 0.0 0.0 0.0
december income 1200.0 13000.0 2000.0 0.0 0.0 0.0
january expense 0.0 0.0 0.0 300.0 49.0 3500.0
february expense 0.0 0.0 0.0 300.0 149.0 3500.0
march expense 0.0 0.0 0.0 300.0 49.0 3500.0
april expense 0.0 0.0 0.0 300.0 49.0 3500.0
may expense 0.0 0.0 0.0 300.0 49.0 3500.0
june expense 0.0 0.0 0.0 300.0 49.0 3500.0
july expense 0.0 0.0 0.0 300.0 49.0 3500.0
august expense 0.0 0.0 0.0 300.0 49.0 3500.0
september expense 0.0 0.0 0.0 300.0 49.0 3500.0
october expense 0.0 0.0 0.0 300.0 49.0 3500.0
november expense 0.0 0.0 0.0 300.0 49.0 3500.0
december expense 0.0 0.0 0.0 300.0 49.0 3500.0
Getting the data from the .json file:
df_all = pd.DataFrame.from_dict({(i,j): data[i][j]
for i in data.keys()
for j in data[i].keys()},
orient='index')
Structure of the .json file:
{
"january": {
"income": {
"loan": 1200,
"csn": 13000,
"salary": 2000
},
"expense": {
"phone": 300,
"spotify": 49,
"rent": 3500
}
...
Expected output:
loan csn salary phone spotify rent
january income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
february income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
march income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
april income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
may income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
june income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
july income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
august income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
september income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
october income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
november income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
december income 1200.0 13000.0 2000.0 300.0 49.0 3500.0
Here is one way:
df = df.rename(index={'expense':'income'}, level=1).fillna(0).groupby(level=[0,1]).sum()
df
Output:
loan csn Salary phone spotify rent
Apr income 1200 13000 2000.0 300 49 3500
Aug income 1200 13000 2000.0 300 49 3500
Dec income 1200 13000 2000.0 300 49 3500
Feb income 1200 13000 2000.0 300 49 3500
Jan income 1200 13000 2000.0 300 49 3500
Jul income 1200 13000 2000.0 300 49 3500
Jun income 1200 13000 2000.0 300 49 3500
Mar income 1200 13000 2000.0 300 49 3500
May income 1200 13000 2000.0 300 49 3500
Nov income 1200 13000 2000.0 300 49 3500
Oct income 1200 13000 2000.0 300 49 3500
Sep income 1200 13000 2000.0 300 49 3500
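Details:
Rename index level 1 so that 'expense' becomes 'income', then groupby both levels of the index. We could use first, but I don't think that is future-proof or safe: first returns the first non-null value in each group, so it only works while the gaps are NaN rather than 0.0. Therefore I chose fillna with zero and sum.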
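For anyone who wants to try this end to end, here is a minimal sketch of the same rename, fillna, groupby and sum chain. It assumes a small hypothetical `data` dict with only two months, shaped like the .json structure in the question; the name `result` and the `sort=False` argument are additions of mine, not part of the one-liner above.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's .json structure (two months only).
data = {
    "january": {
        "income": {"loan": 1200, "csn": 13000, "salary": 2000},
        "expense": {"phone": 300, "spotify": 49, "rent": 3500},
    },
    "february": {
        "income": {"loan": 1200, "csn": 13000, "salary": 2000},
        "expense": {"phone": 300, "spotify": 49, "rent": 3500},
    },
}

# Build the frame as in the question: one row per (month, kind) pair.
df = pd.DataFrame.from_dict(
    {(month, kind): values
     for month, kinds in data.items()
     for kind, values in kinds.items()},
    orient="index",
)

# Relabel 'expense' as 'income' on index level 1, fill gaps with 0,
# then sum the rows that now share the same (month, 'income') label.
# sort=False asks groupby not to sort the group keys.
result = (
    df.rename(index={"expense": "income"}, level=1)
      .fillna(0)
      .groupby(level=[0, 1], sort=False)
      .sum()
)
print(result)
```

Passing `sort=False` is optional. Without it, groupby sorts the month labels alphabetically, as in the output above, whereas the expected output in the question lists them in calendar order.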