在panda中融化多索引数据帧



我一直被这种多级股价数据框架所困扰,我正试图将其从宽数据融化为长数据。

我想从这里开始:

Attributes    Close   Close     High          
Symbols        AMZN    ^DJI     AMZN 
Date                                                                 
2020-12-01       32     29         35 
2020-11-30       31     28         36 
2020-11-27       30     27         37  

到此:

Attri   Sym        Date     price
0     Close  AMZN  2020-12-01        32
1     Close  AMZN  2020-11-30        31
2     Close  AMZN  2020-11-27        30
3     Close  ^DJI  2020-12-01        29
4     Close  ^DJI  2020-11-30        28
5     High   AMZN  2020-12-01        35
6     ....

我试过了:

df = df.reset_index()
df = df.set_index([('Date', '')]).rename_axis(index=None, columns=('Date', ''))
df = df.fillna('').set_index('Date').T
.set_index('',append=True).stack().reset_index()

但我不明白。你知道我还应该试试什么吗?

对于我使用Series.reset_index:通过两个级别工作DataFrame.stack

df = df.stack([0,1]).reset_index(name='price')
print (df)
Date Attributes Symbols  price
0  2020-12-01      Close    AMZN   32.0
1  2020-12-01      Close    ^DJI   29.0
2  2020-12-01       High    AMZN   35.0
3  2020-11-30      Close    AMZN   31.0
4  2020-11-30      Close    ^DJI   28.0
5  2020-11-30       High    AMZN   36.0
6  2020-11-27      Close    AMZN   30.0
7  2020-11-27      Close    ^DJI   27.0
8  2020-11-27       High    AMZN   37.0

另一个想法是@sammywemmy:评论中的解决方案

df = df.melt(ignore_index=False, value_name="price").reset_index()
print (df)
Date Attributes Symbols  price
0  2020-12-01      Close    AMZN     32
1  2020-11-30      Close    AMZN     31
2  2020-11-27      Close    AMZN     30
3  2020-12-01      Close    ^DJI     29
4  2020-11-30      Close    ^DJI     28
5  2020-11-27      Close    ^DJI     27
6  2020-12-01       High    AMZN     35
7  2020-11-30       High    AMZN     36
8  2020-11-27       High    AMZN     37

最新更新