Python: unable to extract data from a CSV file correctly



I have to work with a CSV file in Python. It looks like this:

61979.521351 1 41 -91 2050 61979.521351 2 -10 -8 4 61979.526329 1 42 -96 2070
61979.526329 2 -17 -6 4 61979.531307 1 44 -88 2070 61979.531307 2 -12 -8 3
61979.536285 1 44 -101 2074 61979.536285 2 -13 -7 8 61979.541263 1 47 -99 2050

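I don't know how to extract the data from a CSV file when the values are not separated by commas and there is no header row. Can anyone help me?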

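Try specifying the delimiter and header parameters:

import pandas as pd
# header=None: the file has no header row, so column names come from `names`
data = pd.read_csv('path_to_file', delimiter=' ', header=None, names=colnames)

Here, colnames is a list containing the column names you want, one per column.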

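The general case:

d = ' '  # field delimiter
with open('data.dat', 'r') as f:
    data = [x.split(d) for x in f.read().splitlines()]

You get a nested list, with rows at the top level and fields (as strings) at the bottom level.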
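Splitting on a single space produces empty strings whenever two spaces appear in a row. As a minimal sketch of a more forgiving variant, the snippet below uses `str.split()` with no argument, which splits on any run of whitespace, and converts each field to `float`. The in-memory `sample` and the helper name `parse_rows` are made up for illustration; with a real file you would pass the open file object instead.

```python
import io

# Hypothetical stand-in for the data file; note the double space and the blank line.
sample = io.StringIO(
    "61979.521351 1 41 -91 2050\n"
    "61979.521351  2 -10 -8 4\n"
    "\n"
)

def parse_rows(lines):
    """Split each non-empty line on runs of whitespace and convert fields to float."""
    rows = []
    for line in lines:
        fields = line.split()  # no argument: any whitespace, consecutive runs collapsed
        if fields:             # skip blank lines
            rows.append([float(x) for x in fields])
    return rows

rows = parse_rows(sample)
print(rows[1])
```

If some fields should stay as integers or strings, the conversion would need to be done per column rather than with a blanket `float`.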

You can use the csv module like this:

import csv
with open('test.csv', newline='') as csvfile:
    rdr = csv.reader(csvfile, delimiter=' ')
    for row in rdr:
        # Remove or comment out the following line to keep each row element as a string:
        row = [float(elem) for elem in row]
        print(row)

Prints:

[61979.521351, 1.0, 41.0, -91.0, 2050.0, 61979.521351, 2.0, -10.0, -8.0, 4.0, 61979.526329, 1.0, 42.0, -96.0, 2070.0]
[61979.526329, 2.0, -17.0, -6.0, 4.0, 61979.531307, 1.0, 44.0, -88.0, 2070.0, 61979.531307, 2.0, -12.0, -8.0, 3.0]
[61979.536285, 1.0, 44.0, -101.0, 2074.0, 61979.536285, 2.0, -13.0, -7.0, 8.0, 61979.541263, 1.0, 47.0, -99.0, 2050.0]
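Each printed row appears to hold three consecutive groups of five values, each starting with a timestamp. Assuming that reading is right (the question does not confirm it), a row can be regrouped into 5-field records as sketched below. The helper name `chunk` and the record size of 5 are assumptions, not part of the original answer.

```python
def chunk(row, size=5):
    """Split a flat list into consecutive sub-lists of `size` items."""
    if len(row) % size:
        raise ValueError(f"row length {len(row)} is not a multiple of {size}")
    return [row[i:i + size] for i in range(0, len(row), size)]

# First row of the output printed above.
row = [61979.521351, 1.0, 41.0, -91.0, 2050.0,
       61979.521351, 2.0, -10.0, -8.0, 4.0,
       61979.526329, 1.0, 42.0, -96.0, 2070.0]

records = chunk(row)
print(records[1])
```

Raising on a length mismatch is a deliberate choice here: a row that does not divide evenly most likely means a value is missing, and silently misaligning the remaining fields would be worse than stopping.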

If you want to keep all the data as rows in a single list, with each element of a row accessible by column name, then:

import csv
column_names = list('ABCDEFGHIJKLMNO')  # 'A', 'B', ... 'O'
data = []
with open('test.csv', newline='') as csvfile:
    rdr = csv.DictReader(csvfile, fieldnames=column_names, delimiter=' ')
    data = [{k: float(v) for k, v in row.items()} for row in rdr]
    # Or: data = [row for row in rdr]  # to keep everything as strings
print(data[2]['F'])

Prints:

61979.536285

The above provides a lightweight alternative to pandas.
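One caveat with the `DictReader` approach: when a row has fewer fields than `fieldnames`, the missing keys are filled with `None` (the default `restval`), and `float(None)` raises a `TypeError`. A minimal sketch of a guarded conversion follows. The five-letter column names, the in-memory `sample`, and the helper `to_float` are all hypothetical, chosen only to keep the example short.

```python
import csv
import io

column_names = list('ABCDE')  # hypothetical names for a 5-field record

# Hypothetical sample: the second row is missing its last field.
sample = io.StringIO(
    "61979.521351 1 41 -91 2050\n"
    "61979.521351 2 -10 -8\n"
)

def to_float(value):
    """Convert a field to float, passing missing or empty fields through as None."""
    return float(value) if value not in (None, '') else None

rdr = csv.DictReader(sample, fieldnames=column_names, delimiter=' ')
data = [{k: to_float(v) for k, v in row.items()} for row in rdr]

print(data[0]['E'])
print(data[1]['E'])
```

Whether a missing field should become `None`, a default number, or an error depends on what the data means, so treat this as one option rather than the recommended behaviour.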