Import individual columns from a txt file in Python, skipping the header

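I am new to Python, so I apologize for the beginner question. I have a txt file like the one shown below, and I would like to import three columns (the 1st, 2nd and 6th) and store the data in three separate vectors, skipping all the header lines.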



Name: Z1836_Tb10-TbCoTb_DL_MzDown_FS_Phi-90∞_I-665uA_Offs-0uA_Avg-5s_(0)_S1.dat
Date: Samstag, 1. Juni 2019 - new scaling
Scan Type: Field Scan
Angle [°]: 90
Current [uA]: 665
Frequency [Hz]: 10
Offset [uA]: 0
Sampling Rate [Hz]: 204800
Averaging Duration [s]: 5
Measurement Duration: 00:30:44
Field   R+(1f) Real    R+(1f) Img R+(1f) Mag R+(1f) Phase   R+(2f) Real    R+(2f) Img R+(2f) Mag R+(2f) Phase   Field Set
mT  g(W)   g(W)   g(W)   °   g(W)   g(W)   g(W)   °   A
1019.14 -0.135007229    0.015354704 -0.135877588    173.51149082    -2.776103401E-6 -2.996982259E-6 -4.085174752E-6 -132.808926217     20
1000.95 -0.134959525    0.015398131 -0.135835105    173.491016631   -1.41565391E-5  4.583223348E-6  -1.487997096E-5 162.060479267    19.6
982.67  -0.134951253    0.015396305 -0.13582668 173.491386228   1.196964996E-5  -3.522605161E-6 1.247722995E-5  -16.398879321    19.2
964.17  -0.134935857    0.015381909 -0.135809751    173.496684402   5.150366012E-7  -2.854084284E-5 2.854548953E-5  -88.966175556    18.8
945.27  -0.134941957    0.015372557 -0.135814754    173.500895888   -7.177408364E-6 -2.703168168E-5 -2.796832146E-5 -104.869975658   18.4
926.12  -0.134916606    0.01535581  -0.13578767 173.506706039   -1.599523307E-5 1.704176103E-5  -2.337240039E-5 133.18562213       18
906.81  -0.134895719    0.015356654 -0.135767013    173.505355181   -7.367897986E-6 2.807593732E-6  -7.884700584E-6 159.14027614     17.6
887.36  -0.134877099    0.015409203 -0.135754468    173.482430298   -1.011942317E-5 -1.588290362E-5 -1.883266717E-5 -122.502303207   17.2

TXT file:


col1 col2 col3 col4
1 2 3 4 
1 2 3 4 
1 2 3 4   
1 2 3 4  
1 2 3 4 
1 2 3 4


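import pandas as pd
# Set skiprows according to your text file (number of lines before the column-name row)
# sep=r'\s+' splits on any run of whitespace (replaces the deprecated delim_whitespace=True)
df = pd.read_csv('sample.txt', sep=r'\s+', skiprows=5)
# .tolist() converts each column to a plain Python list
vector_col_2 = df.iloc[:, 1].tolist()
vector_col_4 = df.iloc[:, 3].tolist()
print('V2: ', vector_col_2)
print('V4: ', vector_col_4)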


V2:  [2, 2, 2, 2, 2, 2]
V4:  [4, 4, 4, 4, 4, 4]
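
For the measurement file in the question, the same pandas call can pick out just the three wanted columns. The following is a minimal sketch rather than something tested on the real file: the inline `sample` text is a shortened, made-up stand-in with only two header lines, and the names `field`, `real_1f` and `real_2f` are hypothetical. `header=None` and `usecols` are standard `read_csv` parameters.

```python
import io
import pandas as pd

# Made-up stand-in for the measurement file: two header lines, then data rows.
sample = io.StringIO(
    "Name: demo.dat\n"
    "Field R1 R2 R3 R4 R5\n"
    "1019.14 -0.135 0.015 -0.136 173.5 -2.776E-6\n"
    "1000.95 -0.134 0.016 -0.135 173.4 -1.415E-5\n"
)

df = pd.read_csv(
    sample,
    sep=r"\s+",          # any run of whitespace separates columns
    skiprows=2,          # header lines in this sample (assumption; count them in your file)
    header=None,         # everything that remains is data, not column names
    usecols=[0, 1, 5],   # zero-based positions of columns 1, 2 and 6
)

# With header=None the columns keep their integer positions as labels.
field, real_1f, real_2f = (df[c].tolist() for c in (0, 1, 5))
print(field)
print(real_1f)
print(real_2f)
```

Because pandas parses the values, the three lists contain floats rather than strings.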


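import re
with open("file.txt", "r") as f:
    lines = f.readlines()
data = []
for line in lines:
    if not re.search(r'[A-DF-Za-df-z]', line):  # Don't allow any letter except E or e
        if re.search(r'\d', line):  # The line must contain at least one digit
            data.append(line.split())  # Split the row into its columns
data = list(zip(*data))  # Transpose the rows into columns
print(data[0])  # column 1
print(data[1])  # column 2
print(data[5])  # column 6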


('1019.14', '1000.95', '982.67', '964.17', '945.27', '926.12', '906.81', '887.36')
('-0.135007229', '-0.134959525', '-0.134951253', '-0.134935857', '-0.134941957', '-0.134916606', '-0.134895719', '-0.134877099')
('-2.776103401E-6', '-1.41565391E-5', '1.196964996E-5', '5.150366012E-7', '-7.177408364E-6', '-1.599523307E-5', '-7.367897986E-6', '-1.011942317E-5')


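We go through the file line by line. If a line contains any letter other than E/e, we skip it; that is how the header is ignored. (E is allowed because of scientific notation.)


If a line has no letters other than E and contains at least one digit, we split it and append the result to data.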
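
To make the two checks concrete, here is a small sketch that applies them to a few representative lines. The helper name `is_data_line` and the example strings are made up for illustration; only the two regular expressions come from the approach described above.

```python
import re

LETTER_EXCEPT_E = re.compile(r"[A-DF-Za-df-z]")  # any ASCII letter other than E/e
DIGIT = re.compile(r"\d")                         # any decimal digit

def is_data_line(line):
    """Return True if the line has no letter except E/e and at least one digit."""
    return not LETTER_EXCEPT_E.search(line) and bool(DIGIT.search(line))

for line in [
    "Frequency [Hz]: 10",                    # header: contains letters
    "mT  g(W)  A",                           # units row: contains letters
    "",                                      # empty line: no digit
    "1019.14 -0.135007229 -2.776103401E-6",  # data row
    "2019 06 01",                            # digits-only header line
]:
    print(is_data_line(line), repr(line))
```

The last example is accepted even though it is not a data row, which is the kind of header line this test cannot tell apart from data.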



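  1. This method will not work if any column contains text. It works in your case because you only have numbers.

  2. It will also fail if a header line contains no letters other than E and has at least one digit.

  3. This is a flexible approach, since you do not need to know the number of header lines in advance, but you have to be aware of constraints 1 and 2. If you cannot guarantee them, you can skip a fixed number of lines to achieve the same result. Change the for-loop like this: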

#This might change according to your need, in your example it's 13.
skip_lines = 13
for line in lines[skip_lines:]:
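
If neither the letter test nor a fixed line count feels safe, one alternative is to keep a line only when every whitespace-separated token parses as a float. This is a different technique from the regex above, shown only as a sketch: `parse_numeric_rows` and the sample lines are hypothetical, and a header line made up purely of numbers (or of tokens such as `nan` or `inf`, which `float` also accepts) would still slip through.

```python
def parse_numeric_rows(lines):
    """Return a list of rows (lists of floats), skipping non-numeric lines."""
    rows = []
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # blank line
        try:
            rows.append([float(token) for token in tokens])
        except ValueError:
            continue  # at least one token is not a number: treat as header
    return rows

# Made-up sample standing in for the lines read from the file.
lines = [
    "Name: demo.dat",
    "Frequency [Hz]: 10",
    "Field R+(1f) Real",
    "",
    "1019.14 -0.135007229 0.015354704 -0.135877588 173.51149082 -2.776103401E-6",
    "1000.95 -0.134959525 0.015398131 -0.135835105 173.491016631 -1.41565391E-5",
]

rows = parse_numeric_rows(lines)
columns = list(zip(*rows))  # transpose rows into columns
col1, col2, col6 = columns[0], columns[1], columns[5]
print(col1)
print(col2)
print(col6)
```

A side effect of this variant is that the values come out as floats instead of strings.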



# Open the related file.
with open("ci/common/python_utils/test_text.txt", "r") as opened_file:
    # Read all lines of the file. readlines() returns a list.
    lines = opened_file.readlines()
# Cut the unrelated lines (the header).
related_content = lines[13:]
# Init the vectors (lists).
col1, col2, col6 = [], [], []
for row in related_content:
    # Split the row on whitespace to get its columns.
    fields = row.split()
    # Use the expected column number minus 1 for the list indexing.
    col1.append(fields[0])
    col2.append(fields[1])
    col6.append(fields[5])
# Print the content of the columns.
print("First column content: {}".format(col1))
print("Second column content: {}".format(col2))
print("Sixth column content: {}".format(col6))

Output with the txt file from the question:

>>> python ci/common/python_utils/test_file.py 
First column content: ['1019.14', '1000.95', '982.67', '964.17', '945.27', '926.12', '906.81', '887.36']
Second column content: ['-0.135007229', '-0.134959525', '-0.134951253', '-0.134935857', '-0.134941957', '-0.134916606', '-0.134895719', '-0.134877099']
Sixth column content: ['-2.776103401E-6', '-1.41565391E-5', '1.196964996E-5', '5.150366012E-7', '-7.177408364E-6', '-1.599523307E-5', '-7.367897986E-6', '-1.011942317E-5']
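
The lists above hold strings. If numeric vectors are wanted, one option (assuming NumPy is available) is `numpy.loadtxt`, which can skip the header, select columns and convert to floats in one call. The sketch below uses a made-up two-line header, so `skiprows=2` only matches this sample; for the real file it would be the header line count used above. The variable names are hypothetical.

```python
import io
import numpy as np

# Made-up stand-in for the measurement file: two header lines, then data rows.
sample = io.StringIO(
    "Name: demo.dat\n"
    "Field R1 R2 R3 R4 R5\n"
    "1019.14 -0.135 0.015 -0.136 173.5 -2.776E-6\n"
    "1000.95 -0.134 0.016 -0.135 173.4 -1.415E-5\n"
)

# usecols is zero-based; unpack=True returns one array per selected column.
field, real_1f, real_2f = np.loadtxt(
    sample, skiprows=2, usecols=(0, 1, 5), unpack=True
)
print(field)
print(real_1f)
print(real_2f)
```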
