grok:我在哪里定义自定义的grok标签



我需要使用Logstash解析nginx日志,我发现了这个问题:

logstash 的Nginx grok模式

我想尝试问题中的模式,所以我创建了包含以下内容的配置文件:

input {
file {
path => "/var/log/nginx/access.log"
type => "access"
}
}
filter {
grok {
match => {
"message" => '%{IPORHOST:clientip} %{NGUSER:ident} %{NGUSER:auth} [%{HTTPDATE:timestamp}] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|-)"%{NUMBER:response} (?:%{NUMBER:bytes}|-) %{NOTSPACE:querystring} (?:"(?:%{URI:referrer}|-)"|%{QS:referrer})%{QS:agent}  %{IPORHOST:forwardedfor} %{IPORHOST:host} %{NUMBER:upstreamresponse} (?:-|%{NUMBER:cache})'
}
}
}
output {
stdout {}
}

但我得到了错误:

[2021-03-24T13:25:53,424][ERROR][logstash.javapipeline    ][main] Pipeline error {:pipeline_id=>"main", :exception=>#<Grok::PatternError: pattern %{NGUSER:ident} not defined>, :backtrace=>["/home/ubuntu/logstash-7.12.0/vendor/bundle/jruby/2.5.0/gems/jls-grok-0.11.5/lib/grok-pure.rb:123:in `block in compile'", "org/jruby/RubyKernel.java:1442:in `loop'", "/home/ubuntu/logstash-7.12.0/vendor/bundle/jruby/2.5.0/gems/jls-grok-0.11.5/lib/grok-pure.rb:93:in `compile'", "/home/ubuntu/logstash-7.12.0/vendor/bundle/jruby/2.5.0/gems/logstash-filter-grok-4.4.0/lib/logstash/filters/grok.rb:282:in `block in register'", "org/jruby/RubyArray.java:1809:in `each'", "/home/ubuntu/logstash-7.12.0/vendor/bundle/jruby/2.5.0/gems/logstash-filter-grok-4.4.0/lib/logstash/filters/grok.rb:276:in `block in register'", "org/jruby/RubyHash.java:1415:in `each'", "/home/ubuntu/logstash-7.12.0/vendor/bundle/jruby/2.5.0/gems/logstash-filter-grok-4.4.0/lib/logstash/filters/grok.rb:271:in `register'", "org/logstash/config/ir/compiler/AbstractFilterDelegatorExt.java:75:in `register'", "/home/ubuntu/logstash-7.12.0/logstash-core/lib/logstash/java_pipeline.rb:228:in `block in register_plugins'", "org/jruby/RubyArray.java:1809:in `each'", "/home/ubuntu/logstash-7.12.0/logstash-core/lib/logstash/java_pipeline.rb:227:in `register_plugins'", "/home/ubuntu/logstash-7.12.0/logstash-core/lib/logstash/java_pipeline.rb:586:in `maybe_setup_out_plugins'", "/home/ubuntu/logstash-7.12.0/logstash-core/lib/logstash/java_pipeline.rb:240:in `start_workers'", "/home/ubuntu/logstash-7.12.0/logstash-core/lib/logstash/java_pipeline.rb:185:in `run'", "/home/ubuntu/logstash-7.12.0/logstash-core/lib/logstash/java_pipeline.rb:137:in `block in start'"], "pipeline.sources"=>["/home/ubuntu/logstash-7.12.0/config/pipeline.conf"], :thread=>"#<Thread:0x175b972c run>"}

在这种特殊情况下,我想使NGUSER等于[a-zA-Z.@-+_%]+

所以我的问题来了:我在哪里定义自定义grok标签?

您可以将pattern_definitions选项用于grok过滤器

grok {
pattern_definition => {
"NGUSER" => "[a-zA-Z.@-+_%]+"
}
match => { "message" => '%{IPORHOST:clientip} %{NGUSER:ident} %{NGUSER:auth} [%{HTTPDATE:timestamp}] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|-)"%{NUMBER:response} (?:%{NUMBER:bytes}|-) %{NOTSPACE:querystring} (?:"(?:%{URI:referrer}|-)"|%{QS:referrer})%{QS:agent}  %{IPORHOST:forwardedfor} %{IPORHOST:host} %{NUMBER:upstreamresponse} (?:-|%{NUMBER:cache})' }
}

或者,如果你认为你可能想在多个实例中共享模式,你可以使用patterns_dir选项

grok {
patterns_dir => [ "/home/user/patterns/" ]
match => { "message" => '%{IPORHOST:clientip} %{NGUSER:ident} %{NGUSER:auth} [%{HTTPDATE:timestamp}] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|-)"%{NUMBER:response} (?:%{NUMBER:bytes}|-) %{NOTSPACE:querystring} (?:"(?:%{URI:referrer}|-)"|%{QS:referrer})%{QS:agent}  %{IPORHOST:forwardedfor} %{IPORHOST:host} %{NUMBER:upstreamresponse} (?:-|%{NUMBER:cache})' }
}

并在该目录中创建一个包含的文件

NGUSER [a-zA-Z.@-+_%]+