Adding new rows with new values in specific columns in pandas



Suppose we have a table like the following:

id  week_num  people  date     level  a  b
1   1         20      1990101  1      2  3
1   2         30      1990108  1      2  3
1   3         40      1990115  1      2  3
1   5         100     1990129  1      2  3
1   7         100     1990212  1      2  3
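To make the example reproducible, the table can be rebuilt as a DataFrame. This is only a sketch: it assumes the compact date values stand for 1990-01-01, 1990-01-08 and so on (which matches the dates in the output further down), and it uses the name `df` for the frame.

```python
import pandas as pd

# Sample data for id 1 only; weeks 4 and 6 (and 8 to 10) are absent.
df = pd.DataFrame({
    "id": [1, 1, 1, 1, 1],
    "week_num": [1, 2, 3, 5, 7],
    "people": [20, 30, 40, 100, 100],
    # Assumption: the compact values in the table are these calendar dates.
    "date": pd.to_datetime(
        ["1990-01-01", "1990-01-08", "1990-01-15", "1990-01-29", "1990-02-12"]
    ),
    "level": [1, 1, 1, 1, 1],
    "a": [2, 2, 2, 2, 2],
    "b": [3, 3, 3, 3, 3],
})
```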

This assumes that the dates restart from 1990-01-01 for each id:

import itertools

import pandas as pd

# reindex to get all combinations of ids and week numbers
df_full = (df.set_index(["id", "week_num"])
             .reindex(list(itertools.product([1, 2], range(1, 11))))
             .reset_index())
# fill people with zero
df_full = df_full.fillna({"people": 0})
# forward fill some other columns (a plain ffill also carries values across ids)
cols_ffill = ["level", "a", "b"]
df_full[cols_ffill] = df_full[cols_ffill].ffill()
# reconstruct date from week starting from 1990-01-01 for each id
df_full["date"] = pd.to_datetime("1990-01-01") + (df_full.week_num - 1) * pd.Timedelta(weeks=1)
df_full
# out:
    id  week_num  people       date  level    a    b
0    1         1    20.0 1990-01-01    1.0  2.0  3.0
1    1         2    30.0 1990-01-08    1.0  2.0  3.0
2    1         3    40.0 1990-01-15    1.0  2.0  3.0
3    1         4     0.0 1990-01-22    1.0  2.0  3.0
4    1         5   100.0 1990-01-29    1.0  2.0  3.0
5    1         6     0.0 1990-02-05    1.0  2.0  3.0
6    1         7   100.0 1990-02-12    1.0  2.0  3.0
7    1         8     0.0 1990-02-19    1.0  2.0  3.0
8    1         9     0.0 1990-02-26    1.0  2.0  3.0
9    1        10     0.0 1990-03-05    1.0  2.0  3.0
10   2         1     0.0 1990-01-01    1.0  2.0  3.0
11   2         2     0.0 1990-01-08    1.0  2.0  3.0
12   2         3     0.0 1990-01-15    1.0  2.0  3.0
13   2         4     0.0 1990-01-22    1.0  2.0  3.0
14   2         5     0.0 1990-01-29    1.0  2.0  3.0
15   2         6     0.0 1990-02-05    1.0  2.0  3.0
16   2         7     0.0 1990-02-12    1.0  2.0  3.0
17   2         8     0.0 1990-02-19    1.0  2.0  3.0
18   2         9     0.0 1990-02-26    1.0  2.0  3.0
19   2        10     0.0 1990-03-05    1.0  2.0  3.0
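
In the output above, id 2 picks up the `level`, `a` and `b` values of id 1, because the forward fill runs over the whole frame. If that is not wanted, a possible variant is to forward fill within each id. The sketch below is not part of the answer above: `fill_weeks` is a hypothetical helper name, and the small two-id frame is made-up sample data used only to show the difference.

```python
import pandas as pd

def fill_weeks(df, n_weeks=10, start="1990-01-01"):
    """Hypothetical helper: one row per (id, week), forward filled per id."""
    ids = df["id"].unique()
    full_index = pd.MultiIndex.from_product(
        [ids, range(1, n_weeks + 1)], names=["id", "week_num"]
    )
    out = df.set_index(["id", "week_num"]).reindex(full_index).reset_index()
    # weeks that were missing get zero people
    out["people"] = out["people"].fillna(0)
    # forward fill inside each id only, so values never leak between ids
    cols = ["level", "a", "b"]
    out[cols] = out.groupby("id")[cols].ffill()
    # rebuild the date from the week number, restarting at `start` for each id
    out["date"] = pd.Timestamp(start) + pd.to_timedelta(
        (out["week_num"] - 1) * 7, unit="D"
    )
    return out

# Made-up sample with two ids (illustration only)
sample = pd.DataFrame({
    "id": [1, 1, 2],
    "week_num": [1, 3, 2],
    "people": [20, 40, 5],
    "level": [1, 1, 9],
    "a": [2, 2, 8],
    "b": [3, 3, 7],
})
result = fill_weeks(sample, n_weeks=3)
```

With this variant, rows that come before the first known week of an id keep NaN in `level`, `a` and `b` instead of inheriting values from the previous id.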
