Extracting a field embedded in log data



I want to extract a field from a string in Splunk. Below is a sample event; the field I want to pull out of it is vin.

  {"timestamp":"2147483647","time":"2019-07-12T07:12:30Z","source_type":"APP/PROC/WEB","source_instance":"3","origin":"rep","msg":"2019-07-12 07:12:30.840  INFO 15 --- [ XNIO-2 task-95] f.c.g.m.c.m.r.r.GetCurrentLiteController : {"transaction_summary":{"vin":"3FA6P0LU8JR126702","service":"moduleinfo","api_call":"getcurrentlite","requesting_system":"CVFMA","start_time":"1562915550829","end_time":"1562915550840","response_time":"11","http_response_code":"200","app_status_code":"200","trace_id":"62b2e776-fd02-44c1-8f49-01930fc667db","userid":"GVMS","x_b3_traceid":null,"x_b3_spanid":null,"x_span_export":"true"}}","message_type":"OUT","level":"info","job_index":"0bfbe359-fe76-43e0-9a19-cea5dfd80856","job":"diego_cell","ip":"10.68.80.94","event_type":"LogMessage","cf_space_name":"Ford-GVMS_ECC_PROD","cf_org_name":"Ford-GVMS_FMCC_PROD_ECC_Prod","cf_app_name":"gvms-moduleinfo-api"}

What is the right way to do this?

Check the help documentation for your Splunk version for the exact details. I think you want something like this:

| rex "vin\":\"(?<vin>[^\"]+)"

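The exact solution depends on what is actually escaped in the raw data. If the data really has all of its quotes escaped, as in your sample, then what I have above may work. When I do this kind of thing, I usually start with a regular expression that captures more than I want, then move the start of the match forward a little at a time, and finally pin down where the capture should stop. For example, you could start with: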

| rex "vin(?<bigvin>.*)"

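to confirm that you really are on the right track. Then add characters in front of the capture group until the captured text begins with 3F (in your example). At that point you should be able to replace .* with [^\"]+ (the quote is escaped because the pattern sits inside a quoted rex string) and get what you need.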
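If you want to try the same step-by-step refinement outside Splunk first, here is a rough Python sketch. A few assumptions to be aware of: Python's re module writes named groups as (?P<name>...) rather than the (?<name>...) form rex uses; the two sample strings are shortened fragments of the event above (one with plain quotes, one with backslash-escaped quotes), not real raw events; and extract_vin with its escape-tolerant pattern is a hypothetical helper, not something Splunk provides.

```python
import re
from typing import Optional

# Shortened fragment of the sample event, with plain inner quotes.
plain = ('GetCurrentLiteController : {"transaction_summary":'
         '{"vin":"3FA6P0LU8JR126702","service":"moduleinfo"}}')
# Same fragment with every quote backslash-escaped (assumed raw form).
escaped = plain.replace('"', '\\"')

# Step 1: capture far too much, just to confirm the anchor text is found.
step1 = re.search(r'vin(?P<bigvin>.*)', plain)

# Step 2: add characters before the group until the capture starts at the value.
step2 = re.search(r'vin":"(?P<bigvin>.*)', plain)

# Step 3: swap .* for a class that stops at the closing quote.
step3 = re.search(r'vin":"(?P<vin>[^"]+)', plain)

# Hypothetical variant that tolerates an optional backslash before each quote
# and stops the capture at either a quote or a backslash.
TOLERANT = re.compile(r'vin\\?":\\?"(?P<vin>[^"\\]+)')


def extract_vin(text: str) -> Optional[str]:
    """Return the vin value if the pattern matches, otherwise None."""
    match = TOLERANT.search(text)
    return match.group("vin") if match else None


if __name__ == "__main__":
    print(step1.group("bigvin"))
    print(step2.group("bigvin"))
    print(step3.group("vin"))
    print(extract_vin(plain), extract_vin(escaped))
```

Once a pattern behaves the way you expect here, port it back to rex, remembering to switch the group syntax back and to escape the double quotes inside the rex string.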
