Python: Search a text file and write a block of lines, including the previous line, to another file



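I am searching a text file and want to copy the block of lines associated with each match and write it to another text file. Once I find the search criteria, I want to copy the previous line and the next 9 lines (10 lines in total) for each match and write them out to a file.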

Sample input file to search:

Line 1: File sent to xyz blah blah:
Line 2: Search Criteria here
Line 3
Line 4
Line 5
Line 6
Line 7
Line 8
Line 9
Line 10
Line 1: File sent to xyz blah blah:
Line 2: Search Criteria here
Line 3
Line 4
Line 5
Line 6
Line 7
Line 8
Line 9
Line 10

The code I have so far:

searchList = []
searchStr = "Search Criteria here"
with open('', 'rt') as fInput:
    previous = next(fInput)
    for line in fInput:
        if line.find(searchStr) != -1:
            searchList.append(previous)
            searchList.append(line.lstrip('\n'))

with open('Output.txt','a') as fOutput:
    fOutput.write("\n".join(searchList))

The code above saves output like this to the file, with a blank line between the first and second lines:

mm/dd/yyy  hh:mm:ss.MMM File sent to xyz:
Line 2: Search Criteria here
mm/dd/yyy  hh:mm:ss.MMM File sent to xyz:
Line 2: Search Criteria here

I want to save all 10 lines exactly as they appear in the input file.
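A likely cause of the extra blank lines: lines read from a file keep their trailing newline, so joining them with "\n" adds a second one, and lstrip('\n') only removes leading newlines, not trailing ones. The following is a minimal sketch of that effect; the two sample strings are hypothetical stand-ins for lines read from the real file.

```python
# Lines read from a file keep their trailing "\n".
# These sample strings are hypothetical stand-ins for real file lines.
lines = ["File sent to xyz:\n", "Search Criteria here\n"]

joined_with_newline = "\n".join(lines)  # adds an extra "\n" between items
joined_plain = "".join(lines)           # keeps the lines exactly as read

print(repr(joined_with_newline))
print(repr(joined_plain))
```

Writing the lines with "".join(...) or writelines(...) avoids the doubled newline.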

First, read the file and find the line numbers of the matching lines. Record those line numbers so they can be used later.

all_lines = []
match_lines = []
with open('in_file.txt', 'r') as fInput:
    for number, line in enumerate(fInput):
        all_lines.append(line)
        if searchStr in line:
            match_lines.append(number)  # 0-based index of the matching line

Then, loop over the match_lines list and write out the lines you care about from all_lines:

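num_lines_before = 1
num_lines_after = 8  # previous line + matching line + 8 following lines = 10 lines
with open('out_file.txt', 'w') as fOutput:
    for line_number in match_lines:
        # Get a slice containing the lines to write out;
        # max() keeps the start index from going negative for a match on the first line
        start = max(0, line_number - num_lines_before)
        output_lines = all_lines[start:line_number + num_lines_after + 1]
        fOutput.writelines(output_lines)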
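The two steps above can also be folded into one helper. This is only a sketch: the function name collect_blocks and its defaults (1 line before, 8 after, which gives the 10-line block from the question's sample) are my own assumptions, and the sample data is generated to mimic the question's input.

```python
def collect_blocks(lines, search_str, before=1, after=8):
    """Return one list of lines per match: the match plus its surrounding lines."""
    blocks = []
    for number, line in enumerate(lines):
        if search_str in line:
            # Clamp the start so a match near the top does not wrap around
            start = max(0, number - before)
            blocks.append(lines[start:number + after + 1])
    return blocks


# Hypothetical sample data shaped like the question's input: two 10-line blocks
sample = [f"Line {i}\n" for i in range(1, 11)]
sample[1] = "Line 2: Search Criteria here\n"
blocks = collect_blocks(sample * 2, "Search Criteria here")

for block in blocks:
    print("".join(block), end="")
```

Each block can then be written with fOutput.writelines(block), which keeps the lines exactly as they were read.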

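To test this, I will create io.StringIO objects so that strings can be read and written like files, and ask for one line before and two lines after each match: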

import io
strIn = """This is some text
12345
2 searchforthis
34567
45678
5 searchforthis
63r23tf
7pr9e2380
89spver894
949erc8m9
100948rm42"""
all_lines = []
match_lines = []
searchStr = "searchforthis"
# with open('in_file.txt', 'r') as fInput:
with io.StringIO(strIn) as fInput:
    for number, line in enumerate(fInput):
        all_lines.append(line)
        if searchStr in line:
            match_lines.append(number)
num_lines_before = 1
num_lines_after = 2

# with open('out_file.txt', 'w') as fOutput:
with io.StringIO("") as fOutput:
    for line_number in match_lines:
        # Get a slice containing the lines to write out
        output_lines = all_lines[max(0, line_number-num_lines_before):line_number+num_lines_after+1]
        fOutput.writelines(output_lines)
        fOutput.write("----------\n") # Just to distinguish matches when we test

    # Rewind and read back while the StringIO object is still open
    fOutput.seek(0)
    print(fOutput.read())

This gives the following output:

12345
2 searchforthis
34567
45678
----------
45678
5 searchforthis
63r23tf
7pr9e2380
----------
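Note that the line 45678 appears in both blocks above because the two slices overlap. If the input is too large to hold in memory, or repeated lines are not wanted, a single-pass variant is possible. The sketch below is an alternative technique, not the approach above: it keeps only the last few lines in a collections.deque and emits each line at most once, so overlapping blocks are merged. The name iter_context and its parameters are hypothetical.

```python
from collections import deque
import io


def iter_context(lines, search_str, before=1, after=2):
    """Yield lines around each match in one pass, each line at most once."""
    history = deque(maxlen=before)  # recent lines that have not been emitted
    remaining = 0                   # lines still owed after the latest match
    for line in lines:
        if search_str in line:
            yield from history      # lines before the match, not yet written
            history.clear()
            yield line
            remaining = after
        elif remaining > 0:
            yield line
            remaining -= 1
        else:
            history.append(line)


text = """This is some text
12345
2 searchforthis
34567
45678
5 searchforthis
63r23tf
7pr9e2380
89spver894
949erc8m9
100948rm42"""

result = list(iter_context(io.StringIO(text), "searchforthis"))
print("".join(result), end="")
```

With a real file, the same generator can be passed straight to fOutput.writelines(iter_context(fInput, searchStr)).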
