import numpy as np
import pandas as pd

iterables = [[0, 1], ["l0", "l1", "l2"]]
df = pd.DataFrame(np.random.randn(4, 6))
df.columns = pd.MultiIndex.from_product(iterables, names=["first", "second"])
df looks like
first          0                             1
second        l0        l1        l2        l0        l1        l2
0       0.906053  0.398538 -1.094092 -0.287806  0.134784 -2.274308
1      -0.594186  0.496815  2.162056 -0.963402 -0.520975  1.485731
2       0.936087 -0.868435  0.660429  0.204297 -0.791717  1.186809
3       2.672840 -0.994447  0.544952 -0.109592  2.698411 -2.294639
I want to convert it into something that looks like
first second l0 l1 l2
0 0.906053 0.398538 -1.094092
-0.594186 0.496815 2.162056
0.936087 -0.868435 0.660429
1 2.672840 -0.994447 0.544952
-0.287806 0.134784 -2.274308
-0.963402 -0.520975 1.485731
2 -0.963402 -0.520975 1.485731
0.204297 -0.791717 1.186809
-0.109592 2.698411 -2.294639
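Basically, I want to split the DataFrame into groups by the first level of the column MultiIndex and then concatenate those groups vertically. If I do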
df.unstack(level=0)
then it becomes
first  second
0      l0      0    0.906053
               1   -0.594186
               2    0.936087
               3    2.672840
       l1      0    0.398538
               1    0.496815
               2   -0.868435
               3   -0.994447
       l2      0   -1.094092
               1    2.162056
               2    0.660429
               3    0.544952
1      l0      0   -0.287806
               1   -0.963402
               2    0.204297
               3   -0.109592
       l1      0    0.134784
               1   -0.520975
               2   -0.791717
               3    2.698411
       l2      0   -2.274308
               1    1.485731
               2    1.186809
               3   -2.294639
and
df.stack(level=0)
returns
second         l0        l1        l2
  first
0 0      0.906053  0.398538 -1.094092
  1     -0.287806  0.134784 -2.274308
1 0     -0.594186  0.496815  2.162056
  1     -0.963402 -0.520975  1.485731
2 0      0.936087 -0.868435  0.660429
  1      0.204297 -0.791717  1.186809
3 0      2.672840 -0.994447  0.544952
  1     -0.109592  2.698411 -2.294639
where the rows are not grouped by "first" the way I want. I hope that is clear!
IIUC, you need stack, followed by swaplevel and sort_index:
df.stack(level=0).swaplevel().sort_index()
second         l0        l1        l2
first
0     0 -0.558788  0.594744  1.065517
      1 -0.802450  0.699085 -1.900555
      2  1.230134  0.846437 -0.423816
      3 -1.502994  1.177891 -0.066557
1     0 -2.302171 -0.229552 -0.797856
      1  0.461626 -0.832619  1.960905
      2  0.122817  1.010266 -1.609697
      3 -0.605909 -0.069873  0.777179
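To make the three steps easier to follow, here is a minimal self-contained sketch of the same chain, split into named intermediate results. The seeded generator and the names stacked, swapped and result are only illustrative assumptions, not part of the original post, so the numbers will differ from the output above. On recent pandas versions, stack may emit a FutureWarning about its new implementation.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (an assumption for illustration).
rng = np.random.default_rng(0)
iterables = [[0, 1], ["l0", "l1", "l2"]]
df = pd.DataFrame(rng.standard_normal((4, 6)))
df.columns = pd.MultiIndex.from_product(iterables, names=["first", "second"])

# Step 1: move the "first" column level into the row index.
# The row index is now (original row label, first).
stacked = df.stack(level=0)

# Step 2: swap the two row-index levels so that "first" is the outer level.
swapped = stacked.swaplevel()

# Step 3: sort by the row index so all rows with first == 0 come before first == 1.
result = swapped.sort_index()

print(result)
```

Each block result.loc[k] then holds the same values as the original column group df[k].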
I managed to achieve it with
df.stack(level=0).swaplevel(i=0, j=1, axis=0).reset_index().sort_values(["first"], ascending=True)
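For comparison, the goal as worded in the question (split by the first column level, then concatenate vertically) can also be written quite literally with pd.concat. This is only a sketch under the assumption that every "first" group has the same "second" columns. The names pieces and stacked_groups are made up for illustration.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (an assumption for illustration).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((4, 6)))
df.columns = pd.MultiIndex.from_product(
    [[0, 1], ["l0", "l1", "l2"]], names=["first", "second"]
)

# One sub-frame per value of the "first" column level; df[key] keeps only
# the "second" level as columns.
first_values = df.columns.get_level_values("first").unique()
pieces = {key: df[key] for key in first_values}

# Concatenate the sub-frames vertically; the dict keys become the outer
# row-index level, which is named "first" here.
stacked_groups = pd.concat(pieces, names=["first"])

print(stacked_groups)
```

Unlike the reset_index variant, this keeps "first" as an index level rather than turning it into a regular column.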