我一直被这种多级股价数据框架所困扰,我正试图将其从宽数据融化为长数据。
我想从这里开始:
Attributes Close Close High
Symbols AMZN ^DJI AMZN
Date
2020-12-01 32 29 35
2020-11-30 31 28 36
2020-11-27 30 27 37
到此:
Attri Sym Date price
0 Close AMZN 2020-12-01 32
1 Close AMZN 2020-11-30 31
2 Close AMZN 2020-11-27 30
3 Close ^DJI 2020-12-01 29
4 Close ^DJI 2020-11-30 28
5 High AMZN 2020-12-01 35
6 ....
我试过了:
df = df.reset_index()
df = df.set_index([('Date', '')]).rename_axis(index=None, columns=('Date', ''))
df = df.fillna('').set_index('Date').T
.set_index('',append=True).stack().reset_index()
但我不明白。你知道我还应该试试什么吗?
对于我使用Series.reset_index
:通过两个级别工作DataFrame.stack
df = df.stack([0,1]).reset_index(name='price')
print (df)
Date Attributes Symbols price
0 2020-12-01 Close AMZN 32.0
1 2020-12-01 Close ^DJI 29.0
2 2020-12-01 High AMZN 35.0
3 2020-11-30 Close AMZN 31.0
4 2020-11-30 Close ^DJI 28.0
5 2020-11-30 High AMZN 36.0
6 2020-11-27 Close AMZN 30.0
7 2020-11-27 Close ^DJI 27.0
8 2020-11-27 High AMZN 37.0
另一个想法是@sammywemmy:评论中的解决方案
df = df.melt(ignore_index=False, value_name="price").reset_index()
print (df)
Date Attributes Symbols price
0 2020-12-01 Close AMZN 32
1 2020-11-30 Close AMZN 31
2 2020-11-27 Close AMZN 30
3 2020-12-01 Close ^DJI 29
4 2020-11-30 Close ^DJI 28
5 2020-11-27 Close ^DJI 27
6 2020-12-01 High AMZN 35
7 2020-11-30 High AMZN 36
8 2020-11-27 High AMZN 37