I have a large text file containing data from spectroscopy.
The first few lines look like this:
397.451 -48.38
397.585 -48.38
397.719 -48.38
397.853 -18.38
397.987 -3.38
398.121 6.62
398.256 -0.38
398.39 -1.38
398.524 7.62
398.658 4.62
398.792 -4.38
398.926 12.62
399.06 5.62
399.194 -6.38
399.328 -6.38
399.463 0.62
399.597 -6.38
399.731 -12.38
399.865 1.62
399.999 2.62
What I want to do is create two lists, one containing e.g. [397.451, 397.585, 397.719, ...]
and the other [-48.38, -48.38, -48.38, -18.38, -3.38, ...]
Sticking to the basics:
fil = open("big_text_file.txt")
list1 = []
list2 = []
text = fil.readline()
while text:
    try:
        nums = text.split()
        list1.append(float(nums[0]))
        list2.append(float(nums[1]))
    except (IndexError, ValueError):
        # blank line (IndexError) or non-numeric text (ValueError): skip it
        pass
    text = fil.readline()
fil.close()
print(list1)
print(list2)
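Explanation:
- Create two lists.
- As you said, it is a large text file, so read it line by line.
- Split each line on whitespace (called with no arguments, split() splits on any run of whitespace).
- If the above fails, the line is blank or malformed (that is what try and except are for).
- Update both lists (if there was no error).
- Read the next line.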
Output:
[397.451, 397.585, 397.719, 397.853, 397.987, 398.121, 398.256, 398.39, 398.524, 398.658, 398.792, 398.926, 399.06, 399.194, 399.328, 399.463, 399.597, 399.731, 399.865, 399.999]
[-48.38, -48.38, -48.38, -18.38, -3.38, 6.62, -0.38, -1.38, 7.62, 4.62, -4.38, 12.62, 5.62, -6.38, -6.38, 0.62, -6.38, -12.38, 1.62, 2.62]
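The same steps can also be wrapped in a small function. This is only a sketch under a few assumptions: the name read_columns is hypothetical, io.StringIO stands in for the real file so the example runs on its own, and both values are converted before either list is updated so the two lists cannot end up with different lengths.

```python
import io


def read_columns(lines):
    """Return two lists of floats from an iterable of 'x y' text lines."""
    xs, ys = [], []
    for line in lines:
        parts = line.split()
        try:
            # Convert both values first so a bad line never updates only one list.
            x, y = float(parts[0]), float(parts[1])
        except (IndexError, ValueError):
            continue  # blank or malformed line: skip it
        xs.append(x)
        ys.append(y)
    return xs, ys


# io.StringIO stands in for open("big_text_file.txt") in this sketch.
sample = io.StringIO("397.451 -48.38\n\n397.585 -48.38\nnot a number\n")
first_column, second_column = read_columns(sample)
print(first_column)   # [397.451, 397.585]
print(second_column)  # [-48.38, -48.38]
```

Because the function only iterates over its argument, an open file object can be passed in directly and is still read one line at a time.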
Using the csv library: https://docs.python.org/3/library/csv.html
Solution:
import csv
with open("spectroscopy.txt", newline="") as csvfile:
    reader = csv.reader(csvfile, delimiter=" ")
    column_A = []
    column_B = []
    for row in reader:
        try:
            column_A.append(float(row[0]))
            column_B.append(float(row[1]))
        except (IndexError, ValueError):
            # blank rows come back as [] (IndexError); non-numeric fields raise ValueError
            pass
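One caveat with a single-space delimiter, assuming the file has some lines with two spaces between the values (the sample shown in the question does not): the extra space produces an empty field, float('') raises ValueError, and the row is silently skipped. A minimal sketch of the effect, with the variable names chosen here for illustration:

```python
import csv

line = ["398.39  -1.38"]  # two spaces between the values

# With a plain single-space delimiter, the second space creates an empty field.
plain_fields = next(csv.reader(line, delimiter=" "))
print(plain_fields)    # ['398.39', '', '-1.38']

# skipinitialspace=True ignores whitespace that directly follows a delimiter.
skipped_fields = next(csv.reader(line, delimiter=" ", skipinitialspace=True))
print(skipped_fields)  # ['398.39', '-1.38']
```

Passing skipinitialspace=True to csv.reader in the solution above would keep such rows instead of dropping them.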
Alternative with pandas:
import pandas as pd
data = pd.read_csv("spectroscopy.txt", sep=" ", header=None, index_col=0)
spect_list = []
spect_list_a = []
spect_list_b = []
with open('spect.txt') as f:
    for i in f.readlines():  # read entire file as lines
        i = i.rstrip('\n')  # remove newline character
        if i:  # discard blank lines
            spect_list.append(i)
            spect_list_a.append(i.split()[0])
            spect_list_b.append(i.split()[1])
print(spect_list)
print(spect_list_a)
print(spect_list_b)
The elements of the resulting Python lists are strings such as 'element' (with quotes); I am not sure whether that is the answer you want.
Got it:
Use
spect_list_a.append(float(i.split()[0]))
spect_list_b.append(float(i.split()[1]))
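To make the difference concrete, here is a small sketch on a single line from the question; the variable names are illustrative only:

```python
line = "397.451 -48.38\n"

as_text = line.split()                    # split() returns strings
as_floats = [float(p) for p in as_text]   # float() turns each one into a number

print(as_text)    # ['397.451', '-48.38']
print(as_floats)  # [397.451, -48.38]
```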
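Use the transpose trick with zip and the quoting parameter to convert the columns to float automatically. In addition, skipinitialspace handles the few lines that have two spaces between the values.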
import csv
# The quoting value auto-converts numeric columns to float.
with open('input.csv', newline='') as f:
    r = csv.reader(f, delimiter=' ', quoting=csv.QUOTE_NONNUMERIC, skipinitialspace=True)
    data = list(r)
# transpose row/col data and convert to list (otherwise, it would be tuple)
col1,col2 = [list(col) for col in zip(*data)]
print(col1)
print(col2)
[397.451, 397.585, 397.719, 397.853, 397.987, 398.121, 398.256, 398.39, 398.524, 398.658, 398.792, 398.926, 399.06, 399.194, 399.328, 399.463, 399.597, 399.731, 399.865, 399.999]
[-48.38, -48.38, -48.38, -18.38, -3.38, 6.62, -0.38, -1.38, 7.62, 4.62, -4.38, 12.62, 5.62, -6.38, -6.38, 0.62, -6.38, -12.38, 1.62, 2.62]
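The transpose step may be easier to follow on a tiny example. In this sketch the rows are hard-coded floats (standing in for what csv.reader returns with QUOTE_NONNUMERIC); zip(*rows) passes each row as a separate argument, so zip pairs up the first elements, then the second elements:

```python
rows = [[397.451, -48.38], [397.585, -48.38], [397.719, -48.38]]

# zip(*rows) is the same as zip(rows[0], rows[1], rows[2]);
# it yields one tuple per column, which list() turns into a list.
col1, col2 = [list(col) for col in zip(*rows)]

print(col1)  # [397.451, 397.585, 397.719]
print(col2)  # [-48.38, -48.38, -48.38]
```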
Using pandas:
import pandas as pd
data = pd.read_csv('input.csv',sep=' ',skipinitialspace=True,header=None)
col1 = list(data[0])
col2 = list(data[1])
print(col1)
print(col2)
Without any imports:
with open('input.csv') as f:
    data = [[float(n) for n in row.split()] for row in f]
col1, col2 = [list(n) for n in zip(*data)]
print(col1)
print(col2)
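Note that the import-free version assumes the file has no blank lines: a blank line becomes an empty row, zip stops at the shortest row and yields no columns, and the unpacking into col1, col2 raises ValueError. A possible variant that filters such lines out, sketched with a list of strings standing in for the open file:

```python
# A list of strings stands in for the open file in this sketch.
lines = ["397.451 -48.38\n", "\n", "397.585 -48.38\n"]

# Keep only rows that contain something besides whitespace.
data = [[float(n) for n in row.split()] for row in lines if row.strip()]
col1, col2 = [list(col) for col in zip(*data)]

print(col1)  # [397.451, 397.585]
print(col2)  # [-48.38, -48.38]
```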