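I am trying to append data to a CSV file with df.to_csv(). I would like to do this with clean code, but I ran into a problem: sometimes the dictionary I get has its keys in a different order.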
import pandas as pd
# Simplified version of my function
def save_to_csv(dictionary, index):
    df = pd.DataFrame(dictionary, index=[index])
    header = index == 0
    df.to_csv('test.csv', mode='a', header=header)
# I run some function, I get dict 'dict' => I want to save it into csv file
id = 0
dict = {'col_name_1': 1, 'col_name_2': 2, 'col_name_3': 3}
save_to_csv(dict, id)
# I run some function a second time, I get dict 'dict' => I want to append it into csv file
id = 1
dict = {'col_name_2': 2, 'col_name_3': 3, 'col_name_1': 1}
save_to_csv(dict, id)
# etc ...
I get:
,col_name_1,col_name_2,col_name_3
0,1,2,3
1,2,3,1
instead of:
,col_name_1,col_name_2,col_name_3
0,1,2,3
1,1,2,3
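I plan to use this function for a long time, so I would like to avoid hacks if possible and have a cleaner, more robust solution.
Any ideas would be much appreciated, thanks!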
You can take the order of the keys to write to the CSV file from the first dict:
dict = {'col_name_1': 1, 'col_name_2': 2, 'col_name_3': 3}
key_list = list(dict.keys())
save_to_csv(dict, id)
Now you can reorder the keys of the other dictionaries according to key_list and save them to the CSV file:
dict2 = {'col_name_2': 2, 'col_name_3': 3, 'col_name_1': 1}
d = {}
In [1735]: for k in key_list:
      ...:     if k in dict2:
      ...:         d[k] = dict2[k]
      ...:
In [1736]: d
Out[1736]: {'col_name_1': 1, 'col_name_2': 2, 'col_name_3': 3}
save_to_csv(d, id)
The same can be done in a loop for all of your dicts. This ensures that the column order of the dicts you write to the CSV stays the same.
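The loop mentioned above could look roughly like the sketch below. It is only an illustration: the `path` and `key_list` parameters, the `rows` sample data and the temporary file are my additions so the example runs on its own, and it assumes every dict has exactly the same keys as the first one.

```python
import os
import tempfile

import pandas as pd


def save_to_csv(dictionary, index, path, key_list):
    """Append one row to `path`, writing the columns in `key_list` order."""
    # Rebuild the dict in the reference order (keys not in key_list are dropped).
    ordered = {k: dictionary[k] for k in key_list if k in dictionary}
    df = pd.DataFrame(ordered, index=[index])
    # Write the header only for the first row.
    df.to_csv(path, mode='a', header=(index == 0))


# Hypothetical sample data: same keys, different insertion order.
rows = [
    {'col_name_1': 1, 'col_name_2': 2, 'col_name_3': 3},
    {'col_name_2': 2, 'col_name_3': 3, 'col_name_1': 1},
    {'col_name_3': 3, 'col_name_1': 1, 'col_name_2': 2},
]

# The first dict defines the column order for the whole file.
key_list = list(rows[0].keys())

# Temporary location so the sketch does not touch an existing test.csv.
csv_path = os.path.join(tempfile.mkdtemp(), 'test.csv')
for i, row in enumerate(rows):
    save_to_csv(row, i, csv_path, key_list)

result = pd.read_csv(csv_path, index_col=0)
print(result)
```

If a later dict is missing one of the keys, this version would write a shorter row, so you may want to check for that before appending.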
The solution from @Serge Ballesta, which I will use for this project:
def save_to_csv(dictionary, index):
    df = pd.DataFrame(dictionary, index=[index])
    header = index == 0
    df.to_csv('test.csv', mode='a', header=header, columns=sorted(dictionary.keys()))
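For anyone who wants to check this quickly, here is a small self-contained run with the two dicts from the question. The `path` parameter and the temporary file are my additions for the demo (the function above writes to 'test.csv' directly); the rest is the same call.

```python
import os
import tempfile

import pandas as pd


def save_to_csv(dictionary, index, path):
    df = pd.DataFrame(dictionary, index=[index])
    header = index == 0
    # Sorting the keys makes the column order independent of insertion order.
    df.to_csv(path, mode='a', header=header, columns=sorted(dictionary.keys()))


# Temporary location so the demo does not append to an existing test.csv.
path = os.path.join(tempfile.mkdtemp(), 'test.csv')

save_to_csv({'col_name_1': 1, 'col_name_2': 2, 'col_name_3': 3}, 0, path)
save_to_csv({'col_name_2': 2, 'col_name_3': 3, 'col_name_1': 1}, 1, path)

with open(path) as f:
    content = f.read()
print(content)
# ,col_name_1,col_name_2,col_name_3
# 0,1,2,3
# 1,1,2,3
```

Note that sorted() orders the names as strings, so a key like 'col_name_10' would come before 'col_name_2'. That is still consistent from row to row, as long as every dict has the same set of keys.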
Thanks!