Parsing a text file with Python: extracting lines that match a unique keyword pattern



I'm trying to parse a series of messages in a text file and save them to txt files using Python (2.7.3, or any other Python version).

I have a txt file like this (file.txt):

[#11:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
INFO isn't NULL
[#12:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]
[#13:3][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
PERFECT isn't NULL
[#4:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
Time is here [Tick:135055] , Time:  17, index: 608, CastedType:20002, area :0
[#15:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
[#16:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]
[#17:3][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
[#8:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
time is here [Tick:135055] , Time:  17, index: 608, CastedType:20002, area :0
[#16:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
[#14:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]
[#18:3][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
[#6:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
Time is here [Tick:135055] , Time:  17, index: 608, CastedType:20002, area :0

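This is the format of every line in the txt file. These line types repeat throughout the given file, and each follows its own unique pattern, as shown above, marked by a keyword such as [INFO] or [PERFECT]. I'm trying to implement a Python function that reads the txt file line by line and dumps every line of this specific type: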

[#12:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]

to another txt file. So if I open that other txt file, every line in it will be a message of this type:

[#12:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]

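Now, after sniffing out this type of message from the given txt file (the input txt), I need to read the newly generated txt file, which contains only that message type, line by line, take the load index value from each line, and dump those values into another txt file that contains only load index values.

So for my example above, I would get the following:

Given txt file (this .txt file is the input):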

[#11:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
INFO isn't NULL
[#12:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]
[#13:3][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
PERFECT isn't NULL
[#4:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
Time is here [Tick:135055] , Time:  17, index: 608, CastedType:20002, area :0
[#15:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
[#16:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]
[#17:3][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
[#8:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
time is here [Tick:135055] , Time:  17, index: 608, CastedType:20002, area :0
[#16:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
[#14:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]
[#18:3][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
[#6:23][INFO][0x0015a] it's here and it's optimally required start index[1] , length[15]
Time is here [Tick:135055] , Time:  17, index: 608, CastedType:20002, area :0

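Result/output of the function:

  1. A generated txt file containing every line that has the specific pattern explained above (every line containing the word [PERFECT]), so the generated file should contain all messages/lines that have [PERFECT]: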

    [#12:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]
    [#16:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]
    [#14:25][PERFECT][0x0015a] process returned as NULL load index[1] , length[20] , type[0]

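  2. Then another new txt file for the load index values. In my case the load index value is found inside the [] that follows the words load index (load index[value]), so the function should dump the load index values as a column into a newly created .txt file: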

    1
    1
    1

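How can I parse a text file in Python that contains the patterns and message lines shown above?

Simply put, I want to run over the given txt file line by line (message by message) and copy every message that has the keyword [PERFECT], brackets included, into a new txt file, so that the new file contains only messages with the keyword [PERFECT]. Then, once that file has been generated, I want to loop over each message in it and get the value that appears in load index[value]. In my case that is 1 1 1, because load index[1] appears with the value 1 in all three messages. The load index values should be dumped into another new txt file as a single column.

Thanks a lot for your cooperation!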

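import re

def get_statuses(s, t):
    """Return one dict per line of text s whose status keyword equals t."""
    statuses = []
    for line in s.splitlines():
        # Only lines that start with "[#" carry the [#time][STATUS][code] header.
        if line.startswith("[#"):
            meta, content = line.split(" ", 1)
            time, status, code = meta.split("][")
            # Strip the leading "[#" from the time and the trailing "]" from the code.
            time, code = time[2:], code[:-1]
            # Capture the digits inside the first "index[...]" in the message body.
            index = re.search(r'(index\[)(\d+)(\])', content).group(2)
            if status == t:
                statuses.append({
                    'time': time, 'code': code, 'content': content, 'index': index
                })
    return statuses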

For the sample input, with t set to "PERFECT", it will output:

[{'time': '12:25',
'code': '0x0015a',
'content': 'process returned as NULL load index[1] , length[20] , type[0]',
'index': '1'},
{'time': '16:25',
'code': '0x0015a',
'content': 'process returned as NULL load index[1] , length[20] , type[0]',
'index': '1'},
{'time': '14:25',
'code': '0x0015a',
'content': 'process returned as NULL load index[1] , length[20] , type[0]',
'index': '1'}]
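
To get the two text files the question asks for, one possible sketch is shown below. It does not reuse the function above; it matches each line with a single regular expression instead. The names `split_perfect_lines`, `write_lines` and `dump_perfect_files`, the `keyword` parameter and the file paths are all assumptions made for illustration, and the pattern assumes every matching line contains `load index[<digits>]`.

```python
import re


def split_perfect_lines(text, keyword="PERFECT"):
    """Return (matching_lines, load_index_values) for lines tagged [keyword]."""
    # Assumed layout: [#<time>][<KEYWORD>][<code>] ... load index[<digits>] ...
    pattern = re.compile(
        r'^\[#[^\]]*\]\[' + re.escape(keyword) + r'\]\[[^\]]*\].*?load index\[(\d+)\]'
    )
    lines, indexes = [], []
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            lines.append(line)
            indexes.append(match.group(1))
    return lines, indexes


def write_lines(path, items):
    """Write each item on its own line."""
    with open(path, "w") as handle:
        handle.write("".join(item + "\n" for item in items))


def dump_perfect_files(input_path, lines_path, indexes_path, keyword="PERFECT"):
    """Read input_path, then write the matching lines and the index column."""
    with open(input_path) as handle:
        lines, indexes = split_perfect_lines(handle.read(), keyword)
    write_lines(lines_path, lines)
    write_lines(indexes_path, indexes)
```

A call such as `dump_perfect_files("file.txt", "perfect_lines.txt", "load_indexes.txt")` would then produce both files in one pass. Lines without the `[#` header, such as `PERFECT isn't NULL`, are skipped because the pattern is anchored at the start of the line.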

You can write the function's output to a file with csv.DictWriter().
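
As a rough sketch of that suggestion, assuming Python 3 and a list of dicts shaped like the output shown above: the helper name `write_statuses_csv` and the sample row are illustrative only, and the demo writes to an in-memory buffer rather than a real file.

```python
import csv
import io


def write_statuses_csv(handle, statuses):
    """Write the status dicts as CSV rows to an open text handle."""
    fieldnames = ["time", "code", "content", "index"]
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(statuses)


# In-memory demo with one hypothetical row in the same shape as the output above.
sample_statuses = [
    {
        "time": "12:25",
        "code": "0x0015a",
        "content": "process returned as NULL load index[1] , length[20] , type[0]",
        "index": "1",
    }
]
buffer = io.StringIO()
write_statuses_csv(buffer, sample_statuses)
print(buffer.getvalue())
```

To write a real file under Python 3, pass a handle opened with `open("statuses.csv", "w", newline="")` instead of the buffer. The `content` field contains commas, so the writer quotes it automatically.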
