I am trying to read selected information from a file. The file is structured as follows:
Component1:
Detail1
Detail2
Detail3
Component2:
Detail1
Detail2
Detail3
Component3:
Detail1
Detail2
Detail3
Component4:
Detail1
Detail2
Detail3
The file has a finite number of lines, and I am reading it into a list of lines.
file_lines_list = []
with open('/tmp/filename.txt', 'r') as openf:
    for line_no, line in enumerate(openf):
        file_lines_list.append(line)
I want to read only the information for Component2, so I wrote the following code.
with open('/tmp/filename.txt', 'r') as f:
    for line_no, line in enumerate(f):
        if "Component2" in line:
            x = line_no
            print(x)
            for item in file_lines_list[x:]:
                if item != "\n":
                    tmp_file.write(item)
                else:
                    break
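But it writes lines all the way to the end of the list (the file's lines). It does not break at the first newline, which ideally should happen just before Component3. (There are no blank lines between a component's details.) Can someone point out what I am doing wrong here?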
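On the cause: every item read from the file keeps its trailing newline (for example "Detail1\n"), so it equals "\n" only on a truly blank line, and the question states there are none. The sketch below illustrates this with an in-memory stand-in for the file and stops at the next header instead. The sample text and the rule "a header is any line ending in a colon" are assumptions made for illustration, not part of the original code.

```python
import io

# Stand-in for the file from the question; no blank lines between components.
sample = io.StringIO(
    "Component1:\nDetail1\nComponent2:\nDetail1\nDetail2\nComponent3:\nDetail1\n"
)
file_lines = sample.readlines()

# Every item keeps its trailing newline, so none of them equals "\n".
blank_lines = [item for item in file_lines if item == "\n"]

# Stop at the next header instead (assumed: any later line ending in ":").
start = next(i for i, item in enumerate(file_lines) if "Component2" in item)
selected = [file_lines[start]]
for item in file_lines[start + 1:]:
    if item.rstrip().endswith(":"):
        break
    selected.append(item)

print(blank_lines)
print(selected)
```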
Use str.startswith() and a boolean flag:
list.txt:
Component1:
C1_Detail1
C1_Detail2
C1_Detail3
Component2:
C2_Detail1
C2_Detail2
C2_Detail3
Component3:
C3_Detail1
C3_Detail2
C3_Detail3
Component4:
C4_Detail1
C4_Detail2
C4_Detail3
Then:
with open('list.txt', 'r') as f:
    content = f.readlines()
# you may also want to remove empty lines
content = [line.strip() for line in content if line.strip()]
bFlag = False
for line in content:
    # switch the flag on when the wanted header is reached
    if line.startswith('Component2'):
        bFlag = not bFlag
    if bFlag:
        # stop as soon as the next component starts
        if 'Component3' in line:
            break
        else:
            print(line)
Output:
Component2:
C2_Detail1
C2_Detail2
C2_Detail3
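The loop above hard-codes both the wanted header and the one that follows it. A minimal sketch of the same flag idea, generalized so that it stops at whichever header comes next, might look like this. The function name read_component and the assumption that every header starts with "Component" are hypothetical and not taken from the answer.

```python
def read_component(lines, name):
    """Return the header line for `name` and the lines up to the next header."""
    block = []
    inside = False
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # skip blank lines
        # Assumption: header lines are exactly those starting with "Component".
        is_header = stripped.startswith("Component")
        if inside and is_header:
            break  # the next component begins here
        if is_header and stripped.rstrip(":") == name:
            inside = True
        if inside:
            block.append(stripped)
    return block


sample_lines = [
    "Component1:\n", "C1_Detail1\n",
    "Component2:\n", "C2_Detail1\n", "C2_Detail2\n",
    "Component3:\n", "C3_Detail1\n",
]
print(read_component(sample_lines, "Component2"))
```

Because the stop condition is "any later header", the same call also works for the last component in the file, where there is no following header to break on.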
This task is simpler if you load the text as one string (using read) rather than as a list of lines (using readlines). I would do it as follows:
with open('input_file.txt', 'r') as openf:
    data = openf.read()
# components are assumed to be separated by a blank line
components = data.split('\n\n')
components = [i for i in components if i.startswith('Component2')]
print(len(components))  # prints 1 as expected
with open('out_file.txt', 'w') as f:
    f.write(components[0])
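I assume that exactly one component satisfies the condition. This solution accomplishes the described task, but it may not be the best choice if you need the list of lines anyway, so feel free to pick whichever solution best fits your use case.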
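Splitting on a blank line only works if the components are separated by one, and the question says they are not. A whole-text variant that does not need blank lines could slice between the wanted header and the next line that starts a component. This is only a sketch: the helper name extract_block and the "Component" prefix for the following header are assumptions.

```python
def extract_block(text, header, next_prefix="Component"):
    """Return the text from `header` up to the next line starting with `next_prefix`."""
    start = text.find(header)
    if start == -1:
        return ""  # header not present
    # Look for the next header only at the start of a later line.
    end = text.find("\n" + next_prefix, start + len(header))
    if end == -1:
        return text[start:]  # wanted block is the last one
    return text[start:end + 1]  # keep the newline that ends the block


data = "Component1:\nD1\nComponent2:\nC2_D1\nC2_D2\nComponent3:\nD3\n"
print(extract_block(data, "Component2:"))
```

Passing the header with its colon ("Component2:") avoids matching a longer name such as Component20.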
with open('file') as file:
    # remove empty lines (each line still ends with '\n', so test the stripped text)
    lines = [line for line in file.readlines() if line.strip()]
# holds all our components
components = {}
# holds the last component
comp_name = None
for line in lines:
    # header lines are assumed to start at column 0; detail lines are indented
    if not line.startswith(' '):
        # remove the newline and the trailing ':' for easy reference
        comp_name = line.strip().rstrip(':')
        # add the new component to our map
        components[comp_name] = []
    else:
        # add the detail to the component that already exists
        components[comp_name].append(line.strip())
# now we just find our component
print(components['Component2'])
This prints:
['Detail1', 'Detail2', 'Detail3']