Reading a CSV with multiple values in one cell



I have this CSV:

Type,ID,Value1,Value2,Name,Text
TypeA,1231,"value1,value2,value3","value7,value8, value9",name1,
TypeA,2123,,,name2,textA
TypeA,4242,,,name3,
TypeA,5135,,,name4,
TypeA,2123,,,name5,
TypeA,7525,,,name6,
TypeA,6869,value4,,name7,
TypeB,9654,"value5, value6",,name8,textB
TypeB,3225,,,name9,
TypeB,6545,,value10,name10,

How can I turn it into a dictionary that holds lists wherever a cell contains multiple values? I tried:

with open(csv_file,'r') as f:
    csv_list = [[val.strip() for val in r.split(",")] for r in f.readlines()]
(_, *header), *data = csv_list
print(csv_list)
csv_dict = {}
for row in data:
    key, *values = row
    if key not in csv_dict:
        csv_dict[key] = []
    csv_dict[key].append({key: value for key, value in zip(header, values)})

For example, I want csv_dict['TypeB'][0] to print:

{'ID': '9654', 'Value1': ["value5, value6"], 'Value2': [], 'Name': 'name8', 'Text': 'textB'}

But it prints:

{'ID': '9654', 'Value1': '"value5', 'Value2': 'value6"', 'Name': '', 'Text': 'name8'}

Read the file with csv.DictReader instead of splitting each line on commas by hand. csv.DictReader handles commas that appear inside quoted fields.

import csv

with open(csv_file, 'r') as f:
    reader = csv.DictReader(f)
    for data in reader:
        print(data)

This creates a dictionary for each row in the file. A quoted field is read as a single string, like this:

{'Type': 'TypeA', 'ID': '1231', 'Value1': 'value1,value2,value3', 'Value2': 'value7,value8, value9', 'Name': 'name1', 'Text': ''}
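
To see why the manual split in the question goes wrong, here is a small sketch that parses one row from the sample data both ways. The variable names `line`, `naive`, and `parsed` are only for illustration, and `io.StringIO` stands in for the real file.

```python
import csv
import io

# One row copied from the CSV in the question.
line = 'TypeB,9654,"value5, value6",,name8,textB'

# Splitting on every comma cuts the quoted field in two,
# which shifts every later column by one position.
naive = [val.strip() for val in line.split(",")]

# csv.reader honours the quotes and keeps the field together.
parsed = next(csv.reader(io.StringIO(line)))

print(len(naive))   # 7
print(len(parsed))  # 6
print(parsed[2])    # value5, value6
```

The extra field produced by the naive split is why 'Name' ended up empty and 'Text' received 'name8' in the question's output.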

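Now, since you want the Value1 and Value2 items to be lists, split each one on commas when it is non-empty, and use an empty list otherwise.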

import csv

csv_dict = {}
with open(csv_file, 'r') as f:
    reader = csv.DictReader(f)
    for data in reader:
        # Overwrite with split result if data["Value1"] is not an empty string
        # Else, make an empty list
        data["Value1"] = data["Value1"].split(",") if data["Value1"] else []
        data["Value2"] = data["Value2"].split(",") if data["Value2"] else []
        # Group the rows by their Type column
        if data["Type"] not in csv_dict:
            csv_dict[data["Type"]] = [data]
        else:
            csv_dict[data["Type"]].append(data)

Now csv_dict["TypeB"][0] is:

{'Type': 'TypeB',
'ID': '9654',
'Value1': ['value5', ' value6'],
'Value2': [],
'Name': 'name8',
'Text': 'textB'}

Try:

import csv

with open("your_file.csv", "r") as f_in:
    reader = csv.DictReader(f_in)
    data = list(reader)

for row in data:
    # Split on commas, strip whitespace, and drop empty pieces
    row["Value1"] = [
        ss for s in row["Value1"].split(",") if (ss := s.strip())
    ]
    row["Value2"] = [
        ss for s in row["Value2"].split(",") if (ss := s.strip())
    ]

print(data)

Prints:

[
{
"Type": "TypeA",
"ID": "1231",
"Value1": ["value1", "value2", "value3"],
"Value2": ["value7", "value8", "value9"],
"Name": "name1",
"Text": "",
},
{
"Type": "TypeA",
"ID": "2123",
"Value1": [],
"Value2": [],
"Name": "name2",
"Text": "textA",
},
{
"Type": "TypeA",
"ID": "4242",
"Value1": [],
"Value2": [],
"Name": "name3",
"Text": "",
},
...and so on.

EDIT: Without the := operator (for Python versions before 3.8):

import csv

with open("your_file.csv", "r") as f_in:
    reader = csv.DictReader(f_in)
    data = list(reader)

for row in data:
    # Strip each piece first, then keep only the non-empty ones
    row["Value1"] = [
        s for s in map(str.strip, row["Value1"].split(",")) if s
    ]
    row["Value2"] = [
        s for s in map(str.strip, row["Value2"].split(",")) if s
    ]

print(data)
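
The two answers can be combined if you want both the grouping by Type and the whitespace-stripped lists. The following is only a sketch under a few assumptions: the helper names `split_cell` and `load_grouped` are made up for this example, `SAMPLE` is a shortened copy of the CSV from the question, and `io.StringIO` replaces the real file so the snippet runs on its own.

```python
import csv
import io
from collections import defaultdict

# Shortened copy of the CSV from the question (assumption: same header).
SAMPLE = '''Type,ID,Value1,Value2,Name,Text
TypeA,1231,"value1,value2,value3","value7,value8, value9",name1,
TypeA,6869,value4,,name7,
TypeB,9654,"value5, value6",,name8,textB
TypeB,6545,,value10,name10,
'''

def split_cell(cell):
    """Split a cell on commas, strip whitespace, drop empty pieces."""
    return [part.strip() for part in cell.split(",") if part.strip()]

def load_grouped(f, list_columns=("Value1", "Value2"), key="Type"):
    """Group rows by `key`, turning the `list_columns` cells into lists."""
    grouped = defaultdict(list)
    for row in csv.DictReader(f):
        group = row.pop(key)  # remove Type from the row, use it as the group key
        for col in list_columns:
            row[col] = split_cell(row[col])
        grouped[group].append(row)
    return dict(grouped)

csv_dict = load_grouped(io.StringIO(SAMPLE))
print(csv_dict["TypeB"][0])
```

With a real file, pass the object returned by `open(csv_file, newline="")` instead of the `io.StringIO` wrapper. Note that this splits "value5, value6" into two list items, which differs from the single-string list shown in the question's desired output.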
