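When I export a DataFrame from pandas to an Excel spreadsheet, my columns come out in the order shown below: "10 Largest Event" is placed right after "1 Largest Event" instead of after "2 Largest Event". I would like the columns to appear in numeric order, i.e. "1 Largest Event", "2 Largest Event", "10 Largest Event".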
ID_1   Permit No.       ID_2          1 Largest Event  10 Largest Event  2 Largest Event
10220  To Be Permitted  0010001-24.1  4.0548           0.822             3.9611
Why does this happen? It is a minor formatting issue, but it can be quite an eyesore.
Use natsorted from natsort, together with reindex:
from natsort import natsorted
l=['1 Largest Event','10 Largest Event','2 Largest Event']
natsorted(l)
Out[789]: ['1 Largest Event', '2 Largest Event', '10 Largest Event']
df=df.reindex(columns=natsorted(list(df)))
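If installing natsort is not an option, a similar ordering can be approximated with the standard library alone. The following is only a rough sketch, not how natsort itself is implemented; the helper name natural_key is made up for illustration. It splits each name into digit and non-digit runs and compares the digit runs as integers.

```python
import re

def natural_key(text):
    # re.split with a capturing group alternates non-digit and digit runs,
    # always starting with a (possibly empty) non-digit run, so the odd
    # positions hold the digit runs.
    parts = re.split(r"(\d+)", text)
    return [int(part) if i % 2 else part.lower()
            for i, part in enumerate(parts)]

cols = ["1 Largest Event", "10 Largest Event", "2 Largest Event"]
ordered = sorted(cols, key=natural_key)
print(ordered)
```

With a DataFrame, the same key could be used as df = df.reindex(columns=sorted(df.columns, key=natural_key)).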
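The problem is that your columns are sorted lexicographically, as strings.
So you need to sort with a custom key function that converts the first whitespace-separated token of each name to an int: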
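To see the difference concretely, here is a small illustration in plain Python (the variable names are only for this example). Strings are compared character by character, so "10" sorts before "2" because "1" comes before "2", whereas the integers 10 and 2 compare by value.

```python
cols = ["1 Largest Event", "2 Largest Event", "10 Largest Event"]

# Default sort: character-by-character string comparison.
as_strings = sorted(cols)

# Sort by the leading number, interpreted as an integer.
as_numbers = sorted(cols, key=lambda name: int(name.split()[0]))

print("10" < "2")   # string comparison
print(10 < 2)       # integer comparison
print(as_strings)
print(as_numbers)
```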
df = df[sorted(df.columns, key=lambda x: int(x.split()[0]))]
Sample:
import pandas as pd
cols = ['1 Largest Event',
'10 Largest Event',
'2 Largest Event',
'3 Largest Event',
'4 Largest Event',
'5 Largest Event',
'6 Largest Event',
'7 Largest Event',
'8 Largest Event',
'9 Largest Event']
df = pd.DataFrame(0, columns=cols, index=[0])
print (df)
   1 Largest Event  10 Largest Event  2 Largest Event  3 Largest Event
0                0                 0                0                0
   4 Largest Event  5 Largest Event  6 Largest Event  7 Largest Event
0                0                0                0                0
   8 Largest Event  9 Largest Event
0                0                0
df = df[sorted(df.columns, key=lambda x: int(x.split()[0]))]
print (df)
   1 Largest Event  2 Largest Event  3 Largest Event  4 Largest Event
0                0                0                0                0
   5 Largest Event  6 Largest Event  7 Largest Event  8 Largest Event
0                0                0                0                0
   9 Largest Event  10 Largest Event
0                0                 0
EDIT:
You can also leave the first 3 columns in place and sort only the remaining ones:
df = df[df.columns[:3].tolist() + sorted(df.columns[3:], key=lambda x: int(x.split()[0]))]
print (df)
    ID_1       Permit No.          ID_2  1 Largest Event  2 Largest Event
0  10220  To Be Permitted  0010001-24.1           4.0548           3.9611
   10 Largest Event
0             0.822
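If the number of leading ID columns is not known in advance, the slicing above could be replaced by a check on each name. This is only a sketch under that assumption; the helper order_columns is hypothetical and not part of pandas. Note that it moves every column without a leading number to the front (keeping their original relative order), which differs from the slice if such columns are interleaved with the numbered ones.

```python
def order_columns(columns):
    """Return names without a leading number first, then numbered names
    sorted by that leading number."""
    fixed, numbered = [], []
    for name in columns:
        tokens = str(name).split()
        if tokens and tokens[0].isdecimal():
            numbered.append(name)
        else:
            fixed.append(name)
    # Sort only the numbered names, comparing the first token as an int.
    numbered.sort(key=lambda name: int(str(name).split()[0]))
    return fixed + numbered

columns = ["ID_1", "Permit No.", "ID_2",
           "1 Largest Event", "10 Largest Event", "2 Largest Event"]
result = order_columns(columns)
print(result)
```

Applied to a DataFrame, this would be df = df[order_columns(df.columns)].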