My CSV data file looks like this:
"Date,""Time"",""Tags"",""Measurement"",""Info"",""GMT+01:00""";
"13.11.2022,""21:47:56"","""",""156"","""",""GMT+01:00""";
"29.05.2022,""09:00:00"","""",""Comment1,Comment2"","""",""GMT+01:00""";
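Each line starts with a double quote and ends with a double quote followed by a semicolon. The first column has no quotes; every other entry is wrapped in doubled double quotes.
The delimiter is a comma, but a value in a row can also be replaced by comments, which are themselves separated by commas. Some columns contain no data ("").
How can I read this file with Python pandas?
I tried several variants, for example: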
df = pd.read_csv('test.csv', sep=',', lineterminator=';', quotechar='"')
(test.csv):
ParserError: Error tokenizing data. C error: Expected 6 fields in line 3, saw 7
or (real.csv):
ParserError: Error tokenizing data. C error: Expected 26 fields in line 147, saw 27
It seems that the comma between the two quotes is also treated as a delimiter.
Thanks and regards, sts85
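import pandas as pd

# Drop the leading '"' and the trailing '";' from each line,
# then collapse the doubled quotes into ordinary CSV quoting.
with open('test.csv', 'r') as f:
    data = [line.strip()[1:-2].replace('""', '"') + '\n' for line in f]
# Note: this overwrites test.csv in place with the cleaned content.
with open('test.csv', 'w') as f:
    f.writelines(data)
df = pd.read_csv('test.csv')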
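
If you would rather not overwrite the original file, the same cleanup can be done in memory. The sketch below assumes that every line is an ordinary CSV row that was wrapped once more as a single quoted field and terminated with a semicolon, which is what the sample suggests. The helper name read_wrapped_csv and the SAMPLE string are made up for illustration; only the standard csv module and pd.read_csv are used.

```python
import csv
import io

import pandas as pd

# The three sample lines from the question, used here instead of a file.
SAMPLE = (
    '"Date,""Time"",""Tags"",""Measurement"",""Info"",""GMT+01:00""";\n'
    '"13.11.2022,""21:47:56"","""",""156"","""",""GMT+01:00""";\n'
    '"29.05.2022,""09:00:00"","""",""Comment1,Comment2"","""",""GMT+01:00""";\n'
)


def read_wrapped_csv(handle):
    """Hypothetical helper: unwrap the outer quoting, then parse the inner CSV."""
    # First pass: treat ';' as the delimiter, so the whole quoted line becomes
    # the first field and the csv module undoes the doubled quotes for us.
    outer = csv.reader(handle, delimiter=';')
    inner_lines = [row[0] for row in outer if row]
    # Second pass: the unwrapped lines are now a normal comma-separated CSV.
    return pd.read_csv(io.StringIO('\n'.join(inner_lines)))


df = read_wrapped_csv(io.StringIO(SAMPLE))
print(df)
```

For a real file, pass an open handle instead of the sample string, e.g. `with open('test.csv', newline='') as f: df = read_wrapped_csv(f)`. The comma inside "Comment1,Comment2" stays within a single cell because the inner quoting is restored before pandas sees the data.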