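I have data in a CSV like this:
Month YEAR AZ-Phoenix CA-Los Angeles CA-San Diego CA-San Francisco CO-Denver DC-Washington
January 1987 59.33 54.67 46.61 50.20
February 1987 59.65 54.89 46.87 49.96 64.77
I want to convert it to a 4-column CSV instead of x columns, for example:
Month YEAR State Value
January 1987 AZ-PHOENIX
January 1987 CA-LOS ANGELES 59.33
January 1987 CA-San Diego 54.67
January 1987 CA-San Francisco 46.61
January 1987 CO-Denver 50.20
.....
My code so far works with only one identifier column, and I can't extend it to two. How do I keep Month and YEAR fixed while pivoting the states and values into rows?
Code so far: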
df = df.set_index('YEAR').stack(dropna=False).reset_index()
df.columns = ['YEAR','A','B']
Can't I just add Month somewhere and get this?
You just need to add the columns you want to keep to the index, stack, and then reset the index.
df.set_index(['Month','YEAR']).stack(dropna=False).reset_index()
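Demo (the sample frame here was parsed on whitespace, so multi-word names such as CA-Los Angeles ended up split across columns):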
>>> df
      Month  YEAR  AZ-Phoenix  CA-Los  Angeles  CA-San  Diego  CA-San.1
0   January  1987       59.33   54.67    46.61   50.20    NaN       NaN
1  February  1987       59.65   54.89    46.87   49.96  64.77       NaN
   Francisco  CO-Denver  DC-Washington
0        NaN        NaN            NaN
1        NaN        NaN            NaN
>>> df.set_index(['Month','YEAR']).stack(dropna=False).reset_index()
       Month  YEAR        level_2      0
0    January  1987     AZ-Phoenix  59.33
1    January  1987         CA-Los  54.67
2    January  1987        Angeles  46.61
3    January  1987         CA-San  50.20
4    January  1987          Diego    NaN
5    January  1987       CA-San.1    NaN
6    January  1987      Francisco    NaN
7    January  1987      CO-Denver    NaN
8    January  1987  DC-Washington    NaN
9   February  1987     AZ-Phoenix  59.65
10  February  1987         CA-Los  54.89
11  February  1987        Angeles  46.87
12  February  1987         CA-San  49.96
13  February  1987          Diego  64.77
14  February  1987       CA-San.1    NaN
15  February  1987      Francisco    NaN
16  February  1987      CO-Denver    NaN
17  February  1987  DC-Washington    NaN
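For reference, here is a minimal self-contained sketch of the same set_index / stack / reset_index idea with the multi-word column names kept intact. The sample frame is an assumption: it uses only the first four city columns and assigns the question's values to them in order. The output names State and Value are also just illustrative. The sample has no missing values, so stack() is called without dropna, whose handling differs between pandas versions.

```python
import pandas as pd

# Sample frame (assumed layout): the question's first four values per row,
# assigned to the first four city columns in order.
df = pd.DataFrame({
    "Month": ["January", "February"],
    "YEAR": [1987, 1987],
    "AZ-Phoenix": [59.33, 59.65],
    "CA-Los Angeles": [54.67, 54.89],
    "CA-San Diego": [46.61, 46.87],
    "CA-San Francisco": [50.20, 49.96],
})

# Move the columns to keep fixed into the index, stack the remaining
# columns into rows, then turn the index levels back into columns.
long_df = df.set_index(["Month", "YEAR"]).stack().reset_index()

# Rename the four resulting columns (names chosen for illustration).
long_df.columns = ["Month", "YEAR", "State", "Value"]

print(long_df)
```

Calling long_df.to_csv("long.csv", index=False) (the file name is hypothetical) would then write the four-column CSV.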
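You can use pd.melt()
It is basically a reverse pivot, but the rows don't come out in quite the same order, so you will need to sort them if the order matters: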
>>> pd.melt(df, id_vars=['Month', 'YEAR'], var_name='State')
      Month  YEAR           State  value
0   January  1987      AZ-Phoenix  59.33
1  February  1987      AZ-Phoenix  59.65
2   January  1987  CA-Los Angeles  54.67
3  February  1987  CA-Los Angeles  54.89
4   January  1987    CA-San Diego  46.61
...
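If the row-by-row order does matter, one possible way to restore it is sketched below. This is an illustration rather than the only option. It carries the original row position through the melt as a helper column and then applies a stable sort on it, so the states stay in their original column order within each row. The sample frame (four city columns with the question's values assigned in order) and the names State and Value are assumptions.

```python
import pandas as pd

# Sample frame (assumed layout): the question's first four values per row,
# assigned to the first four city columns in order.
df = pd.DataFrame({
    "Month": ["January", "February"],
    "YEAR": [1987, 1987],
    "AZ-Phoenix": [59.33, 59.65],
    "CA-Los Angeles": [54.67, 54.89],
    "CA-San Diego": [46.61, 46.87],
    "CA-San Francisco": [50.20, 49.96],
})

# reset_index() exposes the original row position as a column named "index",
# which is kept as an identifier through the melt.
melted = pd.melt(
    df.reset_index(),
    id_vars=["index", "Month", "YEAR"],
    var_name="State",
    value_name="Value",
)

# mergesort is stable, so rows with the same original position keep the
# column order that melt produced. The helper column is dropped afterwards.
ordered = (
    melted.sort_values("index", kind="mergesort")
    .drop(columns="index")
    .reset_index(drop=True)
)

print(ordered)
```

Sorting on Month directly would order the names alphabetically (February before January), which is why the sketch sorts on the original row position instead.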