Config file, logstash ruby filter event.get( "message" ).match() Error



In my Logstash config file, I am trying to extract only the XML data so that it can be parsed.

This is my config file:


input {
  file {
    path => "/home/elastic-stack/logstash-7.3.2/event-data/telmetry.log"
    start_position => "beginning"
    type => "sandbox-out"
    codec => multiline {
      pattern => "^</datastore-contents-xml>"
      negate => "true"
      what => "next"
    }
  }
  http {
    host => "127.0.0.1"
    port => 8080
    type => "sandbox-out"
  }
}
filter {
  grok {
    match => { "message" => "[%{USER:host_name} %{IP:ip_address} %{USER:session-id} %{NUMBER:session-id-num}]"}
  }
  grok {
    match => { "message" => "Subscription Id     : %{BASE16NUM:subcription-id:int}"}
  }
  grok {
    match => { "message" => "Event time      : %{TIMESTAMP_ISO8601:event-time}"}
  }
  grok {
    match => {"message" => "<%{USERNAME:Statistic}>"}
  }
  mutate {
    remove_field => ["headers", "host_name", "session-id","message"]
  }
  date {
    match => ["timestamp","dd/MMM/yyyy:HH:mm:ss Z"]
  }
  ruby { code => 'event.set("justXml", event.get("message").match(/.+(<datastore-contents-xml.*)/m)[1])' }
  xml {
    #remove_namespaces => "true"
    #not even the namespace option is working to access the http link
    source => "justXml"
    target => "xml-content"
    #force_array => "false"
    xpath => [
      "//*[name()='datastore-contents-xml']/*[name()='memory-statistics']/*[name()='memory-statistic'][1]/*[name()='name']/text()" , "name" ,
      "//*[name()='datastore-contents-xml']/*[name()='memory-statistics']/*[name()='memory-statistic'][1]/*[name()='total-memory']/text()" , "total-memory",
      "//*[name()='datastore-contents-xml']/*[name()='memory-statistics']/*[name()='memory-statistic'][1]/*[name()='used-memory']/text()" , "used-memory",
      "//*[name()='datastore-contents-xml']/*[name()='memory-statistics']/*[name()='memory-statistic'][1]/*[name()='free-memory']/text()" , "free-memory" ,
      "//*[name()='datastore-contents-xml']/*[name()='memory-statistics']/*[name()='memory-statistic'][1]/*[name()='lowest-memory']/text()" , "lowest-memory" ,
      "//*[name()='datastore-contents-xml']/*[name()='memory-statistics']/*[name()='memory-statistic'][1]/*[name()='highest-memory']/text()" , "highest-memory"
    ]
    #logstash is not detecting any of these xpaths in the config
  }
  mutate {
    convert => {
      "total-memory" => "integer"
      "used-memory" => "integer"
      "free-memory" => "integer"
      "lowest-memory" => "integer"
      "highest-memory" => "integer"
    }
  }
}
output {
  stdout {
    codec => rubydebug
  }
  file {
    path => "%{type}_%{+dd_MM_yyyy}.log"
  }
}

Expected output:

{
"ip_address" => "10.10.20.30",
"subcription-id" => 2147483650,
"event-time" => "2019-09-12 13:13:30.290000+00:00",
"host" => "127.0.0.1",
"Statistic" => "memory-statistic",
"type" => "sandbox-out",
"@version" => "1",
"@timestamp" => 2019-09-26T10:03:00.620Z,
"session-id-num" => "35"
"yang-model" => "http://cisco.com/ns/yang/Cisco-IOS-XE-memory-oper"
"name" => "Processor"
"total-memory" => 2238677360
"used-memory" => 340449924
"free-memory" => 1898227436
"lowest-usage" => 1897220640
"highest-usage" => 1264110388
}

Error:

[2019-09-27T09:18:55,622][ERROR][logstash.filters.ruby    ] Ruby exception occurred: undefined method `match' for nil:NilClass
/home/elastic-stack/logstash-7.3.2/vendor/bundle/jruby/2.5.0/gems/awesome_print-1.7.0/lib/awesome_print/formatters/base_formatter.rb:31: warning: constant ::Fixnum is deprecated
{
"ip_address" => "10.10.20.30",
"subcription-id" => 2147483650,
"session-id-num" => "35",
"tags" => [
[0] "_rubyexception"
],
"Statistic" => "memory-statistic",
"event-time" => "2019-09-12 13:13:30.290000+00:00",
"type" => "sandbox-out",
"@version" => "1",
"host" => "127.0.0.1",
"@timestamp" => 2019-09-27T07:18:54.868Z

From the error I can tell that the problem is in the ruby filter, but I don't know how to fix it.

This data is generated by Cisco telemetry, and I am trying to ingest it with the Elastic Stack.

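The error appears to be that the event has no message field, so event.get("message") returns nil and you cannot call match on nil. I see that you call match on the message field in this ruby code: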

ruby { code => 'event.set("justXml", event.get("message").match(/.+(<datastore-contents-xml.*)/m)[1])' }

However, you remove the message field from the event a few lines earlier:

mutate {
  remove_field => ["headers", "host_name", "session-id","message"]
}

The solution is to remove the message field only once you no longer need it. I would move the remove_field mutate to the end of the filter block.
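As a minimal sketch of why the original call fails and how a guard might behave, here is some plain Ruby that runs outside Logstash. The helper name extract_xml and the sample message are assumptions made up for illustration; they do not come from the original config. Note that even when the message field is present, match returns nil if the pattern does not match, so indexing with [1] would raise the same kind of exception.

```ruby
# Hypothetical helper (not part of Logstash): returns the XML portion of a
# message, or nil when the message is missing or contains no XML.
def extract_xml(message)
  return nil if message.nil?

  # String#[] with a regexp and a capture index returns nil when nothing
  # matches, instead of raising like `match(...)[1]` would on a nil result.
  message[/.+(<datastore-contents-xml.*)/m, 1]
end

# Invented sample message, only for demonstration.
sample = "Event time : 2019-09-12\n<datastore-contents-xml><name>Processor</name></datastore-contents-xml>"

puts extract_xml(sample).inspect        # the XML part of the sample
puts extract_xml(nil).inspect           # nil: no message field
puts extract_xml("no xml here").inspect # nil: message without XML
```

Inside the ruby filter, the same idea would mean reading event.get("message") into a variable and calling event.set("justXml", ...) only when the extracted value is not nil.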

If I may, one more suggestion. You are running multiple grok filters on the same message field:

grok {
  match => { "message" => "[%{USER:host_name} %{IP:ip_address} %{USER:session-id} %{NUMBER:session-id-num}]"}
}
grok {
  match => { "message" => "Subscription Id     : %{BASE16NUM:subcription-id:int}"}
}
grok {
  match => { "message" => "Event time      : %{TIMESTAMP_ISO8601:event-time}"}
}
grok {
  match => {"message" => "<%{USERNAME:Statistic}>"}
}

This can be simplified as follows (see the Grok filter documentation):

grok {
  break_on_match => false
  match => {
    "message" => [
      "[%{USER:host_name} %{IP:ip_address} %{USER:session-id} %{NUMBER:session-id-num}]",
      "Subscription Id     : %{BASE16NUM:subcription-id:int}",
      "Event time      : %{TIMESTAMP_ISO8601:event-time}",
      "<%{USERNAME:Statistic}>"
    ]
  }
}

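This way you need only one instance of the grok filter. It iterates over the patterns in the list, and because of break_on_match => false it does not stop after the first successful match but tries every pattern, so all fields are extracted.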
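To illustrate the difference the break_on_match setting makes, here is a rough Ruby sketch of the semantics. It is not how grok is implemented: plain Ruby regexps stand in for the grok patterns, and the names PATTERNS and extract_fields as well as the two-line sample message are assumptions chosen for the example.

```ruby
# Illustrative only: plain regexps with named captures stand in for grok patterns.
PATTERNS = [
  /Subscription Id\s*: (?<subscription_id>\d+)/,
  /Event time\s*: (?<event_time>\S+ \S+)/
].freeze

# Hypothetical helper: tries each pattern in order and collects named captures.
# With break_on_match: true it stops after the first pattern that matches.
def extract_fields(message, break_on_match:)
  fields = {}
  PATTERNS.each do |pattern|
    match = pattern.match(message)
    next unless match

    fields.merge!(match.named_captures)
    break if break_on_match
  end
  fields
end

# Assumed sample message, built from the values shown in the question's output.
message = "Subscription Id     : 2147483650\nEvent time      : 2019-09-12 13:13:30.290000+00:00"

p extract_fields(message, break_on_match: true)  # only subscription_id
p extract_fields(message, break_on_match: false) # subscription_id and event_time
```

With break_on_match: true only the first matching pattern contributes fields, which is why the setting has to be false when several patterns should each extract something from the same message.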
