How do I fill a row of a pandas DataFrame based on previous rows and column conditions?



I have a DataFrame:

ID  2000-01 2000-02 2000-03 2001-01 2001-02 val
1   2847    2861    2875    2890    2904    94717
2   1338    1343    1348    1353    1358    70105
3   3301    3311    3321    3331    3341    60307
4   1425    1422    1419    1416    1413    79888

I want to add a new row to the table that holds the difference between the current year and the previous year, for example: "2001-01" - "2000-01".

Output:

ID  2000-01 2000-02 2000-03 2001-01 2001-02 val
1   2847    2861    2875    2890    2904    94717
2   1338    1343    1348    1353    1358    70105
3   3301    3311    3321    3331    3341    60307
4   1425    1422    1419    1416    1413    79888
5   NaN     NaN     NaN     -9      -9      NaN

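How can I select the previous year's column name without hard-coding the column headers?

The code below does what you ask. The "if" condition can be modified so that it detects columns containing a year more reliably. At the moment it only splits the column name on "-" and checks whether the result has length 2.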

import pandas as pd
import math

df=pd.DataFrame({"ID" :[ 1,2,3,4],
"2000-01":[2847,1338,3301,1425  ],
"2000-02":[2861,1343,3311,1422  ],
"2000-03":[2875,1348,3321,1419  ],
"2001-01":[2890,1353,3331,1416  ],
"2001-02":[2904,1358,3341,1413  ],
"val" :[94717,70105,60307,79888 ]})
#setting index
df=df.set_index("ID")
#creating a dictionary that maps each column to its previous-year column
ly_dict={}
#making a list of the columns 
mylist=df.columns.copy()
#two lists for internal storage
myempty_list=[]
usable_cols=[]
for item in mylist:
    #getting the year
    ha=item.split("-")

    if (len(ha) == 2 ):
        ly=str(int(ha[0])-1)+"-"+ha[1]
        myempty_list.append(ly)
        usable_cols.append(item)
        #filling the last year dictionary
        ly_dict[item]=ly

combined_list=list(set(list(mylist)+myempty_list))
df=df.reindex(columns=combined_list)

last_row_id=df.shape[0]+1
df.loc[last_row_id] = [math.nan for item in range(df.shape[1])]

for item in usable_cols:
    try:
        df.loc[last_row_id,item]=df.loc[last_row_id-1,item]-df.loc[last_row_id-1,ly_dict[item]]
    except KeyError:
        #defensive: skip the column if a lookup fails
        pass
df=df.reindex(columns=mylist)
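
As a variation on the answer above, here is a minimal sketch of a stricter column check. It assumes every year column is named "YYYY-PP" (four-digit year, hyphen, two-digit period). The names YEAR_COL, previous_year_map and ly_map are made up for this example and are not part of pandas. Instead of adding temporary columns with reindex, it only keeps pairs where the previous-year column actually exists.

```python
import re

import pandas as pd

df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "2000-01": [2847, 1338, 3301, 1425],
    "2000-02": [2861, 1343, 3311, 1422],
    "2000-03": [2875, 1348, 3321, 1419],
    "2001-01": [2890, 1353, 3331, 1416],
    "2001-02": [2904, 1358, 3341, 1413],
    "val": [94717, 70105, 60307, 79888],
}).set_index("ID")

# Assumed column format: four-digit year, hyphen, two-digit period.
YEAR_COL = re.compile(r"(\d{4})-(\d{2})")


def previous_year_map(columns):
    """Map each 'YYYY-PP' column to its previous-year column, if that column exists."""
    existing = set(columns)
    mapping = {}
    for col in columns:
        match = YEAR_COL.fullmatch(str(col))
        if match is None:
            continue  # not a year column, e.g. "val"
        prev = f"{int(match.group(1)) - 1}-{match.group(2)}"
        if prev in existing:
            mapping[col] = prev
    return mapping


ly_map = previous_year_map(df.columns)

# Differences are taken from the last existing row, as in the expected output.
last_id = df.index[-1]
new_id = last_id + 1
diff_row = pd.Series(
    {col: df.loc[last_id, col] - df.loc[last_id, prev] for col, prev in ly_map.items()},
    name=new_id,
)

# Columns without a previous-year match become NaN in the new row.
df.loc[new_id] = diff_row.reindex(df.columns)
print(df)
```

With the sample data, ly_map contains only "2001-01" and "2001-02", so row 5 holds -9 in those two columns and NaN everywhere else. The columns become float because of the NaN values.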
