Regex for python request



Hi, I'm looking for a way to write a function that returns a list of dictionaries with the following structure.

Example:

example_dict = {"host":"146.204.224.152", 
"user_name":"feest6811", 
"time":"21/Jun/2019:15:45:24 -0700",
"request":"POST /incentivize HTTP/1.1"}

The data looks like this:

146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] "POST /incentivize HTTP/1.1" 302 4622
197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] "DELETE /virtual/solutions/target/web+services HTTP/2.0" 203 26554 
156.127.178.177 - okuneva5222 [21/Jun/2019:15:45:27 -0700] "DELETE /interactive/transparent/niches/revolutionize HTTP/1.1" 416 14701
*(more entries follow...)*

My function is as follows:

import re
def logs():
    with open("assets/logdata.txt", "r") as file:
        logdata = file.read()
    pattern="""
    (?P<host>.[\d.]*\s?)         #host
    (?P<user_name>[\s\w-]*\s?)    #user_name
    (?P<time>[\w/:.[\s-]*[\]\s])           #time
    (?P<request>[\w/"\s.]*"?)     #request"""
    group=[]
    for item in re.finditer(pattern,logdata,re.VERBOSE):
        group.append(item.groupdict())
    return group
    raise NotImplementedError()

And it returns something like this:

[{'host': '146.204.224.152 ',
'user_name': '- feest6811 ',
'time': '[21/Jun/2019:15:45:24 -0700]',
'request': ' "POST /incentivize HTTP/1.1" 302 4622\n197.109.77.178 '},
{'host': '- ',
'user_name': 'kertzmann3129 ',
'time': '[21/Jun/2019:15:45:25 -0700]',
'request': ' "DELETE /virtual/solutions/target/web'},
{'host': '+',
'user_name': 'services',
'time': ' ',
'request': 'HTTP/2.0" 203 26554\n156.127.178.177 '}]

What can I change to fix this?

Try this:

pattern=r"""
(?P<host>\d{1,3}(?:\.\d{1,3}){3})\s-\s  #host (IPv4 only)
(?P<user_name>[\s\w-]*?)\s?             #user_name
\[(?P<time>[\w/:.\s-]*)\]\s?            #time
"(?P<request>.*?)"\s?                   #request
(?P<code>\d{3})\s?                      #response code
(?P<bytes>\d+)\s?                       #bytes sent or received
"""

https://regex101.com/r/l7N52c/1
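
To show how this pattern might slot into the function from the question, here is a minimal sketch. It assumes the pattern is compiled with re.VERBOSE and that only the four keys from the example dictionary should be returned. The names SAMPLE_LOG, LOG_PATTERN, WANTED_KEYS and parse_logs are made up for illustration, and the sample text is the three lines quoted in the question rather than the contents of assets/logdata.txt.

```python
import re

# Sample lines copied from the question; a real run would read the log file instead.
SAMPLE_LOG = (
    '146.204.224.152 - feest6811 [21/Jun/2019:15:45:24 -0700] '
    '"POST /incentivize HTTP/1.1" 302 4622\n'
    '197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] '
    '"DELETE /virtual/solutions/target/web+services HTTP/2.0" 203 26554\n'
    '156.127.178.177 - okuneva5222 [21/Jun/2019:15:45:27 -0700] '
    '"DELETE /interactive/transparent/niches/revolutionize HTTP/1.1" 416 14701\n'
)

# Raw string so the backslashes reach the regex engine unchanged.
LOG_PATTERN = re.compile(r"""
    (?P<host>\d{1,3}(?:\.\d{1,3}){3})\s-\s   # host (IPv4 only)
    (?P<user_name>[\s\w-]*?)\s?              # user_name
    \[(?P<time>[\w/:.\s-]*)\]\s?             # time, brackets left out of the group
    "(?P<request>.*?)"\s?                    # request, quotes left out of the group
    (?P<code>\d{3})\s?                       # response code
    (?P<bytes>\d+)\s?                        # bytes sent or received
""", re.VERBOSE)

# Hypothetical filter: the question's example dict has only these four keys.
WANTED_KEYS = ("host", "user_name", "time", "request")


def parse_logs(logdata):
    """Return one dict per matched log entry, keeping only WANTED_KEYS."""
    entries = []
    for match in LOG_PATTERN.finditer(logdata):
        fields = match.groupdict()
        entries.append({key: fields[key] for key in WANTED_KEYS})
    return entries


parsed = parse_logs(SAMPLE_LOG)
print(parsed[0])
```

Because the brackets and quotes sit outside the named groups, the time and request values come back without them, which matches the structure asked for in the question. If the code and bytes fields are also wanted, returning match.groupdict() directly should be enough.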

You can try the following regex.

(?P<host>[\d.]+)(?:\s*-\s*)(?P<user_name>\w+)(?:\s*\[)(?P<time>.*?)(?:\])(?:\s*)(?P<request>".*?")
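
As a rough sketch of how this one-line regex behaves, the snippet below applies it to one of the sample lines from the question. The names PATTERN, line and entry are illustrative only. Note that here the request group includes the surrounding double quotes, so the sketch strips them afterwards to match the requested structure.

```python
import re

# The regex above, split across two raw strings only for readability.
PATTERN = re.compile(
    r'(?P<host>[\d.]+)(?:\s*-\s*)(?P<user_name>\w+)(?:\s*\[)'
    r'(?P<time>.*?)(?:\])(?:\s*)(?P<request>".*?")'
)

# One sample line copied from the question.
line = (
    '197.109.77.178 - kertzmann3129 [21/Jun/2019:15:45:25 -0700] '
    '"DELETE /virtual/solutions/target/web+services HTTP/2.0" 203 26554'
)

match = PATTERN.search(line)
entry = match.groupdict()

# The quotes are inside the request group, so remove them after matching.
entry["request"] = entry["request"].strip('"')
print(entry)
```

One thing to keep in mind: \w+ does not match a hyphen, so an entry that had - in place of the user name would be skipped by this pattern.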

Demo
