Name,abc
Title,teacher
Email,abc.edu
Phone,000-000-0000
Office,21building
About,"abc is teacher"
Name,def
Title,plumber
Email,plumber@plumber.com
Phone,111-111-1111
Office,22building
About,"The best plumber in the town"
Name,ghi
Title,producer
Phone,333-333-3333
Office,33building
About,"The best producer"
I would use the pandas library to read the .csv data (foo.csv in this example) and then convert it to JSON with to_json.
In this case, you get a dictionary:
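import pandas as pd
# squeeze=True was removed from read_csv in pandas 2.0; DataFrame.squeeze('columns') replaces it
(pd.read_csv('foo.csv', header=None, index_col=0)
   .squeeze('columns')
   .to_json(orient='columns'))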
If you want to export a .json file:
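import pandas as pd
with open('exported_file.json', 'w') as f:
    # squeeze=True was removed from read_csv in pandas 2.0; DataFrame.squeeze('columns') replaces it
    (pd.read_csv('foo.csv', header=None, index_col=0)
       .squeeze('columns')
       .to_json(f, orient='columns'))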
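I assume the CSV file contains one continuous run of records about individuals in the format "Label,Value", and that you want to reorganize it into a separate record per person, with the labels as column names along the second dimension. The output will be stored as a JSON file.
If that is the case, we can use pandas.DataFrame.pivot to change the data structure. Before that, though, we have to group the labels by person. For this I assume that the Name label is mandatory for every person and that each unique label appears at most once between two names: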
import pandas as pd
from io import StringIO

data = '''Name,abc
Title,teacher
Email,abc.edu
Phone,000-000-0000
Office,21building
About,"abc is teacher"
Name,def
Title,plumber
Email,plumber@plumber.com
Phone,111-111-1111
Office,22building
About,"The best plumber in the town"
Name,ghi
Title,producer
Phone,333-333-3333
Office,33building
About,"The best producer"'''
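# Read the two-column "Label,Value" data
df = pd.read_csv(StringIO(data), names=['label','value'])
# Every 'Name' row starts a new person, so the running count of 'Name' rows is a group key
df['grouper'] = (df['label'] == 'Name').cumsum()
# One row per person, one column per label; labels a person lacks become NaN
df = df.pivot(index='grouper', columns='label', values='value')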
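To see why the cumulative sum works as a group key, here is a minimal sketch on a shortened label column. The sample labels and the variable names is_name and grouper are only illustrative, not part of the original data.

```python
import pandas as pd

# Shortened label column for two people (illustrative data only)
labels = pd.Series(['Name', 'Title', 'Email', 'Name', 'Title'])

# True on every row that starts a new person
is_name = labels == 'Name'

# Summing the booleans cumulatively gives each row the number of its person
grouper = is_name.cumsum()
print(grouper.tolist())  # [1, 1, 1, 2, 2]
```

The counter increases only on Name rows, which is why the approach depends on Name being the first label of every person.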
With the data in this shape, we can save it as:
df.to_json('test.json', orient='records', lines=True)
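To check what ends up in the file, the following sketch writes a small stand-in frame the same way and reads it back line by line. The frame wide, the temporary path, and the sample values are assumptions made for illustration. A label that a person lacks (such as Email for ghi in the data above) should come out as null.

```python
import json
import os
import tempfile

import pandas as pd

# Stand-in for the pivoted frame: the second person has no Email (illustrative data)
wide = pd.DataFrame({'Name': ['abc', 'ghi'], 'Email': ['abc.edu', None]})

# Write to a temporary location instead of the working directory
path = os.path.join(tempfile.mkdtemp(), 'test.json')
wide.to_json(path, orient='records', lines=True)

# With lines=True, each line of the file is one standalone JSON object
with open(path) as f:
    records = [json.loads(line) for line in f if line.strip()]
print(records)
```

Because every line is a complete JSON object, the file is JSON Lines rather than a single JSON array. If one array is preferred, dropping lines=True and keeping orient='records' should give that.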