我有一个这样的数据:
year energy_products consumption_ktoe value_ktoe
0 2009 Coal and Peat 3.0 3.300000
1 2009 Crude Oil 0.0 49079.900000
2 2009 Electricity 3338.1 3594.203691
3 2009 Natural Gas 867.8 6656.700000
4 2009 Others 0.0 0.000000
.. ... ....... ....... .........
我想将energy_product枢轴到列中,并将consumption_ktoe和value_ktoe作为每个枢轴列下的子列。
所以我想要的输出是这样的:
energy_products Coal and Peat Crude Oil
year consumption_ktoe value_ktoe consumption_ktoe value_ktoe
0 2009 3.0 3.300000 3 4
energy_products Electricity Natural Gas
year consumption_ktoe value_ktoe consumption_ktoe value_ktoe
0 2009 3.0 3.300000 3 4
energy_products Others
year consumption_ktoe value_ktoe
0 2009 3.0 3.300000
旋转之后,我得到了这个:
finalConImportMerge = finalConImportMerge.pivot(index='year', columns=['energy_products'])
consumption_ktoe
energy_products Coal and Peat Crude Oil Electricity Natural Gas Others
year
2009 3.0 0.0 3338.1 867.8 0.0
2010 5.9 0.0 3633.1 1128.3 0.0
2011 5.7 0.0 3697.9 1184.0 0.0
2012 23.0 0.0 3800.4 1126.2 0.0
2013 129.0 0.0 3864.8 1311.6 0.0
2014 162.2 0.0 3989.9 1355.1 0.0
2015 162.9 0.0 4085.5 1340.5 0.0
2016 166.8 0.0 4180.9 1360.3 0.0
2017 600.5 0.0 4268.1 1438.4 0.0
2018 182.5 0.0 4338.8 1479.4 0.0
2019 180.9 0.0 4448.0 1482.4 0.0
2020 182.7 0.0 4366.2 1638.2 0.0
value_ktoe
energy_products Petroleum Products Coal and Peat Crude Oil Electricity
year
2009 6474.8 3.3 49079.9 3594.203691
2010 7790.2 6.0 46074.3 3900.815819
2011 7614.0 6.1 47087.8 3955.235409
2012 7946.7 25.0 52191.4 4035.785689
2013 8475.0 266.4 48759.6 4124.117566
2014 8968.3 396.2 49155.7 4239.865600
2015 9993.3 406.1 50008.5 4322.578390
2016 9351.2 426.6 53410.0 4435.648008
2017 9149.6 898.6 58554.5 4490.609300
2018 9073.5 477.9 57180.3 4548.992776
2019 10125.2 458.0 54399.7 4655.398594
2020 8713.4 433.2 47047.3 4563.333718
我试着交换他们的水平
finalConImportMerge = finalConImportMerge.swaplevel(axis=1)
,但最终是这样的:
energy_products Coal and Peat Crude Oil ...... Coal and Peat Crude Oil .......
year consumption_ktoe consumption_ktoe ...... value_ktoe value_ktoe .....
2009 3 3 3 3
然后我想到了合并顶层列,但是如果我使用groupby,我必须对它求和,并且消费col和值col将被求和,这不是我想要的。
finalConImportMerge = finalConImportMerge.groupby(level=[0], axis=1)
我以这个结尾:
energy_products Coal and Peat Crude Oil Electricity Natural Gas Others
year
2009 6.3 49079.9 6932.303691 7524.5 0.0
.... .... .... .... ... ..
energy_products Petroleum Products
year
2009 96774.9
.... ....
有什么方法可以达到我想要的结果吗?或者把列分组而不求和?
你差一点就成功了。这似乎有点俗气,但我不知道除此之外还有别的方法。(我使用pandas pivot文档中的示例,因为您没有提供代码来重新创建示例DataFrame。)
import pandas as pd
df = pd.DataFrame({'foo': ['one', 'one', 'one', 'two', 'two',
'two'],
'bar': ['A', 'B', 'C', 'A', 'B', 'C'],
'baz': [1, 2, 3, 4, 5, 6],
'zoo': ['x', 'y', 'z', 'q', 'w', 't']})
pivoted_df = df.pivot(index='foo', columns='bar', values=['baz', 'zoo'])
goal_df = pivoted_df.swaplevel(axis=1)[sorted(pivoted_df.swaplevel(axis=1))]
goal_df
看起来像这样
bar A B C
baz zoo baz zoo baz zoo
foo
one 1 x 2 y 3 z
two 4 q 5 w 6 t
如何pivot,reset, unstack, swap,reset, drop,set_index, Transpose和reset
finalConImportMerge.pivot(index=['year'], columns=['energy_products']).reset_index().unstack().swaplevel(1,0).reset_index().drop('level_2',axis=1).set_index('energy_products').T.reset_index(drop=True)
Out[47]:
energy_products Coal_and_Peat Crude_Oil Electricity Natural_Gas Others Coal_and_Peat Crude_Oil Electricity Natural_Gas Others
0 year consumption_ktoe consumption_ktoe consumption_ktoe consumption_ktoe consumption_ktoe value_ktoe value_ktoe value_ktoe value_ktoe value_ktoe
1 2009.0 3.0 0.0 3338.1 867.8 0.0 3.3 49079.9 3594.203691 6656.7 0.0