Regex to extract KEY=VALUE pairs from a Java log string



I have a log string like this:

String s0 = "DC696,\"/xi/ajax/remoting/call/plaincall/adhocReportBuilderControllerProxy.getRortList.dwr\",\"2222-11-10 08:32:22,351               PLV=REQ CIP=9.9.9.7 CMID=syairp CMN=\"\"Dub Airport Corporation Limited\"\" SN=sfv4_APM180885. DPN=dbPool66HFT01 UID=3862D04108 UN=91F6025D47F01D IUID=1931 LOC=en_GB EID=\"\"EVENT-UNKNOWN-UNKNOWN-ob55abe0118-201110083217-396080\"\" AGN=\"\"[Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.35]\"\" RID=REQ-[7274545]  MTD=POST URL=\"\"/xi/ajax/remoting/call/plaincall/adhocRrtBuilderCoollerProxy.getRtList.dwr\"\" RQT=2835 MID=ADIN PID=ADMIN PQ=ADIN_PAGE SUB=0 MEM=2331036 CPU=2410 UCPU=2300 SCPU=110 FRE=10 FWR=0 NRE=2281 NWR=218 SQLC=43 SQLT=142 RPS=200 SID=60826A3FAB005A8A9B930177C5******.pc6bc1029 GID=e262dde6d0e040070b58afd4c8 HSID=ddc665538db779508d3213c0bb63bcb1c49fe8236d5f0884ae975915728e61 CSL=CRITICAL CCON=0 CSUP=0 CLOC=0 CEXT=0 CREM=0 STK={\"\"n\"\":\"\"/xi/ajax/remoting/call/plaincall/adhocReportBuilderControllerProxy.getrtList.dwr\"\",\"\"i\"\":1,\"\"t\"\":2835,\"\"slft\"\":2679,\"\"sub\"\":[{\"\"n\"\":\"\"SQL:select * from sfv4_HOUA180885.REPORT_DEF WHERE REPORT_DEF_ID IN (SELECT REPORT_DEF_ID FROM sfv4_HA80885.REPORT_DTASET WHERE REPORT_ID=?) AND DELETED=? ORDER BY REPORT_DEF_ID asc NULLS LAST\"\",\"\"i\"\":17,\"\"t\"\":40,\"\"slft\"\":40,\"\"st\"\":337,\"\"m\"\":220958,\"\"nr\"\":154,\"\"rt\"\":0,\"\"rn\"\":22,\"\"fs\"\":0}]}   \",\"2022-11-09T21:32:22.351+0000\",p66cf1029,\"dc606_ss_application\",1,\"/app/tomcat/logs/pef.log\",\"perf_log_yxx\",swsskix13";

I want to extract the KEY=VALUE pairs, such as {PLV=REQ, CIP=9.9.9.7, CMN="Dub Airport Corporation Limited", STK={...}}, into a Map<String,String>.

I tried this, but it does not work:

String[] str1 = str.split("\\s(?=(([^\"]*\"))*[^\"]*$)\\s*");
System.out.println("Value of split string is "+ Arrays.toString(str1));

Any suggestions would be very helpful.

You can use the following solution:

String s0 = "DC696,\"/xi/ajax/remoting/call/plaincall/adhocReportBuilderControllerProxy.getRortList.dwr\",\"2222-11-10 08:32:22,351               PLV=REQ CIP=9.9.9.7 CMID=syairp CMN=\"\"Dub Airport Corporation Limited\"\" SN=sfv4_APM180885. DPN=dbPool66HFT01 UID=3862D04108 UN=91F6025D47F01D IUID=1931 LOC=en_GB EID=\"\"EVENT-UNKNOWN-UNKNOWN-ob55abe0118-201110083217-396080\"\" AGN=\"\"[Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.35]\"\" RID=REQ-[7274545]  MTD=POST URL=\"\"/xi/ajax/remoting/call/plaincall/adhocRrtBuilderCoollerProxy.getRtList.dwr\"\" RQT=2835 MID=ADIN PID=ADMIN PQ=ADIN_PAGE SUB=0 MEM=2331036 CPU=2410 UCPU=2300 SCPU=110 FRE=10 FWR=0 NRE=2281 NWR=218 SQLC=43 SQLT=142 RPS=200 SID=60826A3FAB005A8A9B930177C5******.pc6bc1029 GID=e262dde6d0e040070b58afd4c8 HSID=ddc665538db779508d3213c0bb63bcb1c49fe8236d5f0884ae975915728e61 CSL=CRITICAL CCON=0 CSUP=0 CLOC=0 CEXT=0 CREM=0 STK={\"\"n\"\":\"\"/xi/ajax/remoting/call/plaincall/adhocReportBuilderControllerProxy.getrtList.dwr\"\",\"\"i\"\":1,\"\"t\"\":2835,\"\"slft\"\":2679,\"\"sub\"\":[{\"\"n\"\":\"\"SQL:select * from sfv4_HOUA180885.REPORT_DEF WHERE REPORT_DEF_ID IN (SELECT REPORT_DEF_ID FROM sfv4_HA80885.REPORT_DTASET WHERE REPORT_ID=?) AND DELETED=? ORDER BY REPORT_DEF_ID asc NULLS LAST\"\",\"\"i\"\":17,\"\"t\"\":40,\"\"slft\"\":40,\"\"st\"\":337,\"\"m\"\":220958,\"\"nr\"\":154,\"\"rt\"\":0,\"\"rn\"\":22,\"\"fs\"\":0}]}   \",\"2022-11-09T21:32:22.351+0000\",p66cf1029,\"dc606_ss_application\",1,\"/app/tomcat/logs/pef.log\",\"perf_log_yxx\",swsskix13";
// Needs: java.util.HashMap, java.util.Map, java.util.regex.Matcher, java.util.regex.Pattern
String regex = "(\\w+)=((?=\\{)(?:(?=.*?\\{(?!.*?\\3)(.*\\}(?!.*\\4).*))(?=.*?\\}(?!.*?\\4)(.*)).)+?.*?(?=\\3)[^{]*(?=\\4$)|\"{2}(.*?)\"{2}|(\\S+))";
Pattern p = Pattern.compile(regex);
Matcher m = p.matcher(s0);
Map<String, String> res = new HashMap<String, String>();
while (m.find()) {
    String val = m.group(2);      // whole value; kept as-is for {...} values
    if (m.group(5) != null) {
        val = m.group(5);         // text between the doubled quotes
    }
    if (m.group(6) != null) {
        val = m.group(6);         // bare, whitespace-delimited token
    }
    res.put(m.group(1), val);
    System.out.println(m.group(1) + " => " + val + "\n----");
}

Output:

PLV => REQ
----
CIP => 9.9.9.7
----
CMID => syairp
----
CMN => Dub Airport Corporation Limited
----
SN => sfv4_APM180885.
----
DPN => dbPool66HFT01
----
UID => 3862D04108
----
UN => 91F6025D47F01D
----
IUID => 1931
----
LOC => en_GB
----
EID => EVENT-UNKNOWN-UNKNOWN-ob55abe0118-201110083217-396080
----
AGN => [Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36 Edg/107.0.1418.35]
----
RID => REQ-[7274545]
----
MTD => POST
----
URL => /xi/ajax/remoting/call/plaincall/adhocRrtBuilderCoollerProxy.getRtList.dwr
----
RQT => 2835
----
MID => ADIN
----
PID => ADMIN
----
PQ => ADIN_PAGE
----
SUB => 0
----
MEM => 2331036
----
CPU => 2410
----
UCPU => 2300
----
SCPU => 110
----
FRE => 10
----
FWR => 0
----
NRE => 2281
----
NWR => 218
----
SQLC => 43
----
SQLT => 142
----
RPS => 200
----
SID => 60826A3FAB005A8A9B930177C5******.pc6bc1029
----
GID => e262dde6d0e040070b58afd4c8
----
HSID => ddc665538db779508d3213c0bb63bcb1c49fe8236d5f0884ae975915728e61
----
CSL => CRITICAL
----
CCON => 0
----
CSUP => 0
----
CLOC => 0
----
CEXT => 0
----
CREM => 0
----
STK => {""n"":""/xi/ajax/remoting/call/plaincall/adhocReportBuilderControllerProxy.getrtList.dwr"",""i"":1,""t"":2835,""slft"":2679,""sub"":[{""n"":""SQL:select * from sfv4_HOUA180885.REPORT_DEF WHERE REPORT_DEF_ID IN (SELECT REPORT_DEF_ID FROM sfv4_HA80885.REPORT_DTASET WHERE REPORT_ID=?) AND DELETED=? ORDER BY REPORT_DEF_ID asc NULLS LAST"",""i"":17,""t"":40,""slft"":40,""st"":337,""m"":220958,""nr"":154,""rt"":0,""rn"":22,""fs"":0}]}
----
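
How the loop picks a value from the capture groups may be easier to see with a reduced pattern. The sketch below is not the pattern from the answer: it keeps only the doubled-quote and bare-token alternatives, so it cannot handle {...} values, and its groups are numbered 2 and 3 rather than 5 and 6. The class name ReducedPairDemo and the sample line are made up for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ReducedPairDemo {
    // Group 1: key. Group 2: text between doubled quotes. Group 3: bare token.
    private static final Pattern PAIR =
            Pattern.compile("(\\w+)=(?:\"{2}(.*?)\"{2}|(\\S+))");

    static Map<String, String> parse(String line) {
        // LinkedHashMap keeps the pairs in the order they appear in the line.
        Map<String, String> result = new LinkedHashMap<>();
        Matcher m = PAIR.matcher(line);
        while (m.find()) {
            // Only one of the two alternatives can have matched,
            // so exactly one of group 2 and group 3 is non-null.
            String value = m.group(2) != null ? m.group(2) : m.group(3);
            result.put(m.group(1), value);
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "PLV=REQ CIP=9.9.9.7 CMN=\"\"Dub Airport Corporation Limited\"\" RQT=2835";
        parse(sample).forEach((k, v) -> System.out.println(k + " => " + v));
    }
}
```

The quoted alternative is listed first on purpose: if (\S+) came first, it would match ""Dub and stop at the first space.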

See the regex demo.

Regex details:

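  • (\w+) - Group 1: one or more word characters
  • = - a = character
  • ((?=\{)(?:(?=.*?\{(?!.*?\3)(.*\}(?!.*\4).*))(?=.*?\}(?!.*?\4)(.*)).)+?.*?(?=\3)[^{]*(?=\4$)|"{2}(.*?)"{2}|(\S+)) - Group 2, which matches one of:
    • (?=\{)(?:(?=.*?\{(?!.*?\3)(.*\}(?!.*\4).*))(?=.*?\}(?!.*?\4)(.*)).)+?.*?(?=\3)[^{]*(?=\4$) - a substring between two paired curly braces (taken from "Is it possible to match nested brackets with a regex without using recursion or balancing groups?")
    • | - or
    • "{2}(.*?)"{2} - two " characters, then any zero or more characters other than line break characters, as few as possible (captured into Group 5), then two " characters
    • | - or
    • (\S+) - one or more non-whitespace characters (captured into Group 6)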
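
If the lookahead-based brace matching is hard to maintain, the same three cases (brace-delimited value, doubled-quote value, bare token) can also be handled by a small hand-written scanner that counts brace depth. This is a different technique from the regex above, shown only as a minimal sketch. It assumes single-line input and that braces inside the {...} value are balanced. The names KeyValueScanner, parse and findClosingBrace are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class KeyValueScanner {

    static Map<String, String> parse(String s) {
        Map<String, String> pairs = new LinkedHashMap<>();
        int n = s.length();
        int pos = 0;
        while (pos < n) {
            int eq = s.indexOf('=', pos);
            if (eq < 0) {
                break; // no more '=' signs, so no more keys
            }
            // Walk left from '=' over word characters to find the key start.
            int keyStart = eq;
            while (keyStart > pos && isWordChar(s.charAt(keyStart - 1))) {
                keyStart--;
            }
            int valueStart = eq + 1;
            int valueEnd = -1; // exclusive end of the stored value
            int next = -1;     // index where scanning resumes
            if (keyStart < eq && valueStart < n) {
                if (s.charAt(valueStart) == '{') {
                    int close = findClosingBrace(s, valueStart);
                    if (close >= 0) {
                        valueEnd = close + 1; // keep the braces in the value
                        next = close + 1;
                    }
                } else if (s.startsWith("\"\"", valueStart)) {
                    int close = s.indexOf("\"\"", valueStart + 2);
                    if (close >= 0) {
                        valueStart += 2;      // drop the doubled quotes
                        valueEnd = close;
                        next = close + 2;
                    }
                }
                if (valueEnd < 0) {
                    // Bare token, also the fallback for unterminated values.
                    int end = valueStart;
                    while (end < n && !Character.isWhitespace(s.charAt(end))) {
                        end++;
                    }
                    if (end > valueStart) {
                        valueEnd = end;
                        next = end;
                    }
                }
            }
            if (valueEnd < 0) {
                pos = eq + 1; // no key or no value here, skip this '='
                continue;
            }
            pairs.put(s.substring(keyStart, eq), s.substring(valueStart, valueEnd));
            pos = next;
        }
        return pairs;
    }

    // ASCII-only, like \w in a Java regex without UNICODE_CHARACTER_CLASS.
    private static boolean isWordChar(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                || (c >= '0' && c <= '9') || c == '_';
    }

    // Returns the index of the '}' that closes the '{' at 'open', or -1.
    private static int findClosingBrace(String s, int open) {
        int depth = 0;
        for (int i = open; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}' && --depth == 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String sample = "PLV=REQ CMN=\"\"Dub Airport Corporation Limited\"\" "
                + "STK={\"\"n\"\":1,\"\"sub\"\":[{\"\"t\"\":40}]}   \",\"perf_log_yxx\"";
        parse(sample).forEach((k, v) -> System.out.println(k + " => " + v));
    }
}
```

Because the whole {...} block is consumed in one step, a KEY=VALUE lookalike inside it (such as REPORT_ID=? in the SQL text) is not reported as a separate pair.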
