我有一个这样的数据帧,
param per per_date per_num
0 XYZ 1.0 2018-10-01 11.0
1 XYZ 2.0 2017-08-01 15.25
2 XYZ 3.0 2019-10-01 11.25
3 MMG 1.0 2019-08-01 15.71
4 MMG 2.0 2020-10-01 11.50
5 MMG 3.0 2021-10-01 11.75
6 MMG 4.0 2014-01-01 14.00
我想要这样的输出,
param per_1 per_2 per_3 per_4 per_date_1 per_date_2 per_date_3 per_date_4 per_num_1 per_num_2 per_num_3 per_num_4
0 XYZ 1 2 3 NaN 2018-10-01 2017-08-01 2019-10-01 NaN 11.0 15.25 11.25 NaN
1 MMG 1 2 3 4 2019-08-01 2020-10-01 2021-10-01 2014-01-01 15.71 11.50 11.75 14.00
我尝试了以下方法,
df.vstack().reset_index().drop('level_1',axis=0)
这没有给我所需的输出。
如果您看到,我有per
列,它具有增量值,当我转置它们时,这些值可以进入列名。
任何建议都会很棒。
使用GroupBy.cumcount
表示计数器并按DataFrame.unstack
重新塑造形状,最后将列名平展f-string
s:
df = df.set_index(['param', df.groupby('param').cumcount().add(1)]).unstack()
df.columns = [f'{a}_{b}' for a, b in df.columns]
df = df.reset_index()
print (df)
param per_1 per_2 per_3 per_4 per_date_1 per_date_2 per_date_3
0 MMG 1.0 2.0 3.0 4.0 2019-08-01 2020-10-01 2021-10-01
1 XYZ 1.0 2.0 3.0 NaN 2018-10-01 2017-08-01 2019-10-01
per_date_4 per_num_1 per_num_2 per_num_3 per_num_4
0 2014-01-01 15.71 11.50 11.75 14.0
1 NaN 11.00 15.25 11.25 NaN