I have this DataFrame:
           Type            cluster           level          value
0  Accomodation  0-1 € pr.increase      from_price       0.926047
1  Accomodation  0-1 € pr.increase        from_vol      -0.367787
2  Accomodation  0-1 € pr.increase    RIC_from_Vol  561655.141824
3  Accomodation  0-1 € pr.increase  RIC_from_Price   96439.028176
4  Accomodation  1-2 € pr.increase      from_price       1.687742
5  Accomodation  1-2 € pr.increase        from_vol      -0.264432
6  Accomodation  1-2 € pr.increase    RIC_from_Vol  248475.517577
7  Accomodation  1-2 € pr.increase  RIC_from_Price   68894.222423
...
I would like to obtain the following:
           Type            cluster       level      value            RIC
0  Accomodation  0-1 € pr.increase  from_price   0.926047   96439.028176
1  Accomodation  0-1 € pr.increase    from_vol  -0.367787  561655.141824
4  Accomodation  1-2 € pr.increase  from_price   1.687742   68894.222423
5  Accomodation  1-2 € pr.increase    from_vol  -0.264432  248475.517577
...
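That is: take the value where level = RIC_from_Vol and put it in a new column named RIC, on the same row as level = from_vol (and likewise RIC_from_Price on the from_price row). How can I do this?
I have already tried unstack, but I only get errors…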
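You need to create temporary columns before pivoting: one that says which output column each row's value belongs to (value or RIC), and a normalized level so that the RIC_ rows line up with their from_ counterparts.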
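import numpy as np  # Series.str.removeprefix below needs pandas >= 1.4

grp = ['Type', 'cluster', 'level']
# True for the plain 'from_*' rows, False for the 'RIC_from_*' rows
condition = df['level'].str.startswith('from')

(df
 .assign(
     # temporary column: the output column each row's value belongs to
     header=np.where(condition, 'value', 'RIC'),
     # strip the 'RIC_' prefix so both kinds of row share the same level
     level=np.where(condition, df['level'], df['level'].str.removeprefix('RIC_')),
 )
 .assign(
     # 'from_Vol' -> 'from_vol', so that it matches the plain rows
     level=lambda d: d['level'].str.lower(),
     # temporary counter that keeps the pivot index unique if a key repeats
     uniques=lambda d: d.groupby(grp + ['header']).cumcount(),
 )
 .pivot(index=['uniques'] + grp, columns='header', values='value')
 .rename_axis(columns=None)
 .droplevel('uniques')
)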
                                                    RIC     value
Type         cluster           level
Accomodation 0-1 € pr.increase from_price  96439.028176  0.926047
                               from_vol   561655.141824 -0.367787
             1-2 € pr.increase from_price  68894.222423  1.687742
                               from_vol   248475.517577 -0.264432
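
If you would rather keep the flat layout from the question (value first, then RIC), a join is another option. The following is only a sketch, not part of the pivot approach above: it rebuilds the eight sample rows shown in the question, and the names keys, is_ric, ric and result are made up for illustration. It also assumes each (Type, cluster, level) combination has at most one RIC row; with duplicates the join would repeat rows.

```python
import pandas as pd

# Sample data copied from the question (the eight rows shown there).
df = pd.DataFrame({
    "Type": ["Accomodation"] * 8,
    "cluster": ["0-1 € pr.increase"] * 4 + ["1-2 € pr.increase"] * 4,
    "level": ["from_price", "from_vol", "RIC_from_Vol", "RIC_from_Price"] * 2,
    "value": [0.926047, -0.367787, 561655.141824, 96439.028176,
              1.687742, -0.264432, 248475.517577, 68894.222423],
})

keys = ["Type", "cluster", "level"]
is_ric = df["level"].str.startswith("RIC_")

# RIC rows: drop the 'RIC_' prefix and lowercase the rest, so that
# 'RIC_from_Vol' becomes 'from_vol' and can be matched to the plain row.
ric = (
    df[is_ric]
    .assign(level=lambda d: d["level"]
            .str.replace("^RIC_", "", regex=True)
            .str.lower())
    .rename(columns={"value": "RIC"})
    .set_index(keys)["RIC"]
)

# Attach the RIC figure to each plain row by looking up its key columns
# in the index of `ric`.
result = df[~is_ric].join(ric, on=keys)
print(result)
```

Compared with the pivot, this leaves Type, cluster and level as ordinary columns, so no reset_index is needed afterwards.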