Convert a pipe-delimited CSV file to JSON



I am trying to convert a pipe-delimited CSV file to a JSON file, where SetrecordID should be a list and Pairs should be a list of dictionaries.

CSV input file

SetrecordID|SetMatchScore|Pairs
100,101,102|90|"100-101,40","100-102,80","101-102,90"
103,104,105|80|"103-104,60","103-105,90","104-105,90"
106,107,108|65|"106-107,55","106-108,60","107-108,80"
109,110,111|95|"109-110,85","109-111,100","110-111,100"

Expected JSON output

[
  {
    Date: system date
    SetrecordID:[100,101,102]
    SetMatchScore:90
    Pairs:[
      {Pair1:100-101,matchscore:40},
      {Pair2:100-102,matchscore:80},
      {Pair3:101-102,matchscore:90}
    ]
  },
  {
    Date: system date
    SetrecordID:[103,104,105]
    SetMatchScore:80
    Pairs:[
      {Pair1:103-104,matchscore:60},
      {Pair2:103-105,matchscore:90},
      {Pair3:104-105,matchscore:90}
    ]
  }
]

Actual JSON output

{
  "SetrecordID":[
    "109",
    "110",
    "111"
  ],
  "SetMatchScore":95,
  "Pairs":"109-110,85,"109-111,100","110-111,100""
}

Code attempted

import pandas as pd

df = pd.read_csv('filename.csv', sep="|")
dict_val = {}
for index, row in df.iterrows():
    row["SetrecordID"] = row["SetrecordID"].split(",")
    dict_val.update(row)

print(dict_val)

Pandas alone won't help you here. (Fortunately, you don't need Pandas here, either.)
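
To see why default CSV quoting mangles the Pairs column, here is a minimal sketch that runs the standard-library csv module over one line of the sample input. With the default quoting, the reader treats the first double quote as the start of a quoted field; because the closing quote is followed by a comma rather than the | delimiter, the rest of the column is folded in as-is, which matches the shape of the Pairs value in the question's output. With csv.QUOTE_NONE the quotes are left untouched, so ast.literal_eval can read the column as a tuple of strings. The variable names are illustrative only.

```python
import ast
import csv

line = '100,101,102|90|"100-101,40","100-102,80","101-102,90"'

# Default quoting: the leading double quote is interpreted as CSV quoting.
default_fields = next(csv.reader([line], delimiter="|"))

# QUOTE_NONE: double quotes are ordinary characters and stay in the field.
raw_fields = next(csv.reader([line], delimiter="|", quoting=csv.QUOTE_NONE))

print(default_fields[2])  # first pair has lost its surrounding quotes
print(raw_fields[2])      # identical to the text in the file

# The untouched field is a valid Python tuple literal of strings.
pairs = ast.literal_eval(raw_fields[2])
print(pairs)
```

The code that follows relies on the second behavior: read the column verbatim, then let ast.literal_eval split it.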

import csv, io, ast, json

# Using a `StringIO` here instead of reading from a file
# (but since `StringIO`s are file-like, you can substitute
#  an `open()` call here.)
data = io.StringIO(
    """
SetrecordID|SetMatchScore|Pairs
100,101,102|90|"100-101,40","100-102,80","101-102,90"
103,104,105|80|"103-104,60","103-105,90","104-105,90"
106,107,108|65|"106-107,55","106-108,60","107-108,80"
109,110,111|95|"109-110,85","109-111,100","110-111,100"
""".strip()
)

rows = []
# QUOTE_NONE keeps the double quotes in the Pairs column intact.
for row in csv.DictReader(data, delimiter="|", quoting=csv.QUOTE_NONE):
    # The Pairs column reads as a Python tuple literal of strings;
    # split each "key,score" string once on the comma.
    pairs = [pair.split(",", 1) for pair in ast.literal_eval(row["Pairs"])]
    row["Pairs"] = [
        {f"Pair{x}": key, "matchscore": int(val)}
        for x, (key, val) in enumerate(pairs, 1)
    ]
    row["SetrecordID"] = row["SetrecordID"].split(",")
    rows.append(row)

with open("data.json", "w") as outf:
    json.dump(rows, outf, indent=2)

This ingests the data into dictionaries you can work with (or simply writes it out to a JSON file). Each row looks like this:

{'SetrecordID': ['100', '101', '102'], 'SetMatchScore': '90', 'Pairs': [{'Pair1': '100-101', 'matchscore': 40}, {'Pair2': '100-102', 'matchscore': 80}, {'Pair3': '101-102', 'matchscore': 90}]}
{'SetrecordID': ['103', '104', '105'], 'SetMatchScore': '80', 'Pairs': [{'Pair1': '103-104', 'matchscore': 60}, {'Pair2': '103-105', 'matchscore': 90}, {'Pair3': '104-105', 'matchscore': 90}]}
{'SetrecordID': ['106', '107', '108'], 'SetMatchScore': '65', 'Pairs': [{'Pair1': '106-107', 'matchscore': 55}, {'Pair2': '106-108', 'matchscore': 60}, {'Pair3': '107-108', 'matchscore': 80}]}
{'SetrecordID': ['109', '110', '111'], 'SetMatchScore': '95', 'Pairs': [{'Pair1': '109-110', 'matchscore': 85}, {'Pair2': '109-111', 'matchscore': 100}, {'Pair3': '110-111', 'matchscore': 100}]}
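
The rows above still hold SetrecordID and SetMatchScore as strings and have no Date key, whereas the expected output in the question shows numbers and a date. A minimal sketch of one way to close that gap follows. It assumes that "system date" means today's date as an ISO 8601 string, and the `convert` helper and its `today` parameter are hypothetical names introduced here for illustration.

```python
import ast
import csv
import datetime
import io
import json


def convert(fileobj, today=None):
    """Build records shaped like the question's expected output (hypothetical helper)."""
    # Assumption: "system date" is today's date, formatted as YYYY-MM-DD.
    stamp = (today or datetime.date.today()).isoformat()
    records = []
    for row in csv.DictReader(fileobj, delimiter="|", quoting=csv.QUOTE_NONE):
        pairs = ast.literal_eval(row["Pairs"])
        # A row with a single quoted pair evaluates to a str, not a tuple.
        if isinstance(pairs, str):
            pairs = (pairs,)
        split_pairs = (p.split(",", 1) for p in pairs)
        records.append({
            "Date": stamp,
            "SetrecordID": [int(x) for x in row["SetrecordID"].split(",")],
            "SetMatchScore": int(row["SetMatchScore"]),
            "Pairs": [
                {f"Pair{n}": key, "matchscore": int(score)}
                for n, (key, score) in enumerate(split_pairs, 1)
            ],
        })
    return records


sample = io.StringIO(
    'SetrecordID|SetMatchScore|Pairs\n'
    '100,101,102|90|"100-101,40","100-102,80","101-102,90"\n'
)
# A fixed date keeps the example reproducible; omit `today` to use the system date.
records = convert(sample, today=datetime.date(2024, 1, 1))
print(json.dumps(records, indent=2))
```

Building a fresh dictionary per row, rather than mutating the row returned by DictReader, also fixes the key order so that Date comes first, as in the expected output.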