I have this CSV:
Type,ID,Value1,Value2,Name,Text
TypeA,1231,"value1,value2,value3","value7,value8, value9",name1,
TypeA,2123,,,name2,textA
TypeA,4242,,,name3,
TypeA,5135,,,name4,
TypeA,2123,,,name5,
TypeA,7525,,,name6,
TypeA,6869,value4,,name7,
TypeB,9654,"value5, value6",,name8,textB
TypeB,3225,,,name9,
TypeB,6545,,value10,name10,
How can I turn it into a dictionary in which a field that holds multiple values becomes a list? I tried:
with open(csv_file,'r') as f:
    csv_list = [[val.strip() for val in r.split(",")] for r in f.readlines()]
(_, *header), *data = csv_list
print(csv_list)
csv_dict = {}
for row in data:
    key, *values = row
    if key not in csv_dict:
        csv_dict[key] = []
    csv_dict[key].append({key: value for key, value in zip(header, values)})
For example, I want csv_dict['TypeB'][0] to print:
{'ID': '9654', 'Value1': ["value5, value6"], 'Value2': [], 'Name': 'name8', 'Text': 'textB'}
But it prints:
{'ID': '9654', 'Value1': '"value5', 'Value2': 'value6"', 'Name': '', 'Text': 'name8'}
Use csv.DictReader to read the file instead of splitting each line on commas by hand. csv.DictReader correctly handles commas that sit inside quoted fields.
import csv

with open(csv_file, 'r') as f:
    reader = csv.DictReader(f)
    for data in reader:
        print(data)
This creates a dictionary for each row in the file. A quoted field is read as a single string, like this:
{'Type': 'TypeA', 'ID': '1231', 'Value1': 'value1,value2,value3', 'Value2': 'value7,value8, value9', 'Name': 'name1', 'Text': ''}
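To see why the manual split goes wrong, here is a small side-by-side comparison. It is only a sketch: the `line` string is copied from the sample data, and `io.StringIO` stands in for the real file so the snippet runs on its own.

```python
import csv
import io

# One row from the sample data, with a quoted field that contains a comma.
line = 'TypeB,9654,"value5, value6",,name8,textB'

# Naive split: cuts the quoted field in two and keeps the quote characters.
naive = [val.strip() for val in line.split(",")]

# csv.reader: treats the quoted field as one value and drops the quotes.
parsed = next(csv.reader(io.StringIO(line)))

print(len(naive), naive)
print(len(parsed), parsed)
```

The naive split yields seven fields for a six-column header, which is why every value after Value1 ends up under the wrong key.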
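Now, since you want the Value1 item to be a list, you can split the value on commas whenever it is non-empty; an empty field becomes an empty list. Value2 is handled the same way.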
csv_dict = {}
with open(csv_file, 'r') as f:
    reader = csv.DictReader(f)
    for data in reader:
        # Overwrite with split result if data["Value1"] is not an empty string
        # Else, make an empty list
        data["Value1"] = data["Value1"].split(",") if data["Value1"] else []
        data["Value2"] = data["Value2"].split(",") if data["Value2"] else []
        if data["Type"] not in csv_dict:
            csv_dict[data["Type"]] = [data]
        else:
            csv_dict[data["Type"]].append(data)
Now csv_dict["TypeB"][0] is:
{'Type': 'TypeB',
 'ID': '9654',
 'Value1': ['value5', ' value6'],
 'Value2': [],
 'Name': 'name8',
 'Text': 'textB'}
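Two details of that result differ from what the question asked for: split(",") keeps the leading space in ' value6', and the 'Type' key is still inside each row. A possible variation is sketched below. It assumes you want the pieces stripped and the 'Type' column used only as the grouping key; the `sample` string is a two-row excerpt of the data, read through `io.StringIO` so the snippet is self-contained.

```python
import csv
import io
from collections import defaultdict

# Assumption: a two-row excerpt of the sample data, read from a string
# instead of a file so the example runs on its own.
sample = '''Type,ID,Value1,Value2,Name,Text
TypeA,6869,value4,,name7,
TypeB,9654,"value5, value6",,name8,textB
'''

csv_dict = defaultdict(list)  # missing keys start out as an empty list
for row in csv.DictReader(io.StringIO(sample)):
    for col in ("Value1", "Value2"):
        # Split on commas, strip whitespace, and drop empty pieces.
        row[col] = [part.strip() for part in row[col].split(",") if part.strip()]
    # Remove "Type" from the row and use it as the grouping key.
    csv_dict[row.pop("Type")].append(row)

print(csv_dict["TypeB"][0])
```

Using `defaultdict(list)` removes the need for the `if ... not in csv_dict` check, at the cost that looking up a missing type silently creates an empty list.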
Try:
import csv

with open("your_file.csv", "r") as f_in:
    reader = csv.DictReader(f_in)
    data = list(reader)

for row in data:
    row["Value1"] = [
        ss for s in row["Value1"].split(",") if (ss := s.strip())
    ]
    row["Value2"] = [
        ss for s in row["Value2"].split(",") if (ss := s.strip())
    ]

print(data)
Prints:
[
    {
        "Type": "TypeA",
        "ID": "1231",
        "Value1": ["value1", "value2", "value3"],
        "Value2": ["value7", "value8", "value9"],
        "Name": "name1",
        "Text": "",
    },
    {
        "Type": "TypeA",
        "ID": "2123",
        "Value1": [],
        "Value2": [],
        "Name": "name2",
        "Text": "textA",
    },
    {
        "Type": "TypeA",
        "ID": "4242",
        "Value1": [],
        "Value2": [],
        "Name": "name3",
        "Text": "",
    },
...and so on.
EDIT: Without the := operator:
import csv

with open("your_file.csv", "r") as f_in:
    reader = csv.DictReader(f_in)
    data = list(reader)

for row in data:
    row["Value1"] = [
        s for s in map(str.strip, row["Value1"].split(",")) if s
    ]
    row["Value2"] = [
        s for s in map(str.strip, row["Value2"].split(",")) if s
    ]

print(data)
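If more columns need the same treatment, the repeated comprehension could be pulled into a small helper. The sketch below is one way to do that; `split_columns` is a hypothetical name, not part of the csv module, and the `sample` string is a two-row excerpt of the data read through `io.StringIO` so the snippet runs without a file.

```python
import csv
import io


def split_columns(row, columns, sep=","):
    """Return a copy of row with the given columns turned into lists.

    Hypothetical helper: pieces are stripped and empty pieces are dropped.
    """
    result = dict(row)
    for col in columns:
        result[col] = [s for s in map(str.strip, result[col].split(sep)) if s]
    return result


# Assumption: a two-row excerpt of the sample data.
sample = '''Type,ID,Value1,Value2,Name,Text
TypeA,1231,"value1,value2,value3","value7,value8, value9",name1,
TypeB,6545,,value10,name10,
'''

data = [
    split_columns(row, ["Value1", "Value2"])
    for row in csv.DictReader(io.StringIO(sample))
]

print(data[0]["Value2"])
print(data[1]["Value1"])
```

Because the helper copies each row before changing it, the dictionaries produced by csv.DictReader are left untouched.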