Grok filter to select specific words from the message field

input {
  file {
    path => "C:Datadata.log"
    start_position => "beginning"
    sincedb_path => "NUL"
  }
}
filter {
  if [type] == "apache" {
    grok {
      match => ["message", "%{COMBINEDAPACHELOG} "]
    }
  }
  mutate {
    remove_field => ["@timestamp", "host", "@version", "path"]
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "logdata2"
    document_type => "logs"
  }
  stdout { codec => rubydebug }
}

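Here is my problem:

I only want to pick out a few words from each message, but I can't get it to work.

All I want is the timestamp string that appears inside the message string, plus one other word, for example OrderCreated.

Is it possible to select specific strings/words from the message field this way?

Update: dissect works well, but now I have run into a problem I did not have before.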

dissect filter

input {
  file {
    path => "C:DataLogstestrunning.log"
    start_position => "beginning"
    sincedb_path => "NUL"
  }
}
filter {
  dissect {
    mapping => {
      "message" => "%{ts} %{+ts} %{+ts} %{src} %{} : %{msg}"
    }
  }
  mutate {
    remove_field => ["@timestamp", "pid", "prog", "@version", "host", "path", "src"]
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "logdata12"
    document_type => "logs"
  }
  stdout { codec => rubydebug }
}

The output is shown below. The "\r" part is new to me; it was not there before. Does this look familiar to anyone? How can I fix it?

{
    "message" => "General 2018-05-17 15:47:33.149 : StatusInformationSomeData.Unsubscribe() \r",
    "msg" => "StatusInformationSomeData.Unsubscribe() \r",
    "ts" => "General 2018-05-17 15:47:33.149"
}
{
    "message" => "\r",
    "msg" => "\r",
    "ts" => "  "
}
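
A trailing "\r" like this usually comes from Windows (CRLF) line endings: the file input splits lines on "\n" by default, so the carriage return stays at the end of each message, and blank lines turn into events that contain only "\r". That is an assumption here, since the log file itself is not shown. A minimal sketch of one possible fix is to strip the carriage return with mutate's gsub and drop the empty events before dissect runs:

```
filter {
  # Strip a trailing carriage return left over from CRLF line endings.
  # gsub takes a regular expression, so "\r$" matches a CR at the end of the field.
  mutate {
    gsub => ["message", "\r$", ""]
  }

  # Drop events that are empty (or whitespace only) once the CR is gone.
  if [message] =~ /^\s*$/ {
    drop { }
  }

  # Same mapping as in the question, now applied to the cleaned message.
  dissect {
    mapping => {
      "message" => "%{ts} %{+ts} %{+ts} %{src} %{} : %{msg}"
    }
  }
}
```

The mutate block has to come before dissect so that msg is built from the cleaned message. If the blank lines in the log are meaningful, leave out the drop block.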

You can use dissect {}. For example, if you have a log line like this:

198.41.30.203 - - [21/May/2018:14:36:35 -0500] "GET /tag/eclipse/feed/ HTTP/1.1" 404 5232 "-" "UniversalFeedParser/4.2-pre-308-svn +http://feedparser.org/"

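Your dissect could look like this, and you can even convert data types. Dissect generally performs much better than grok, because it splits on delimiters instead of evaluating regular expressions: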

dissect {
  mapping => {
    "message" => '%{source_ip} %{} %{username} [%{raw_timestamp}] "%{http_verb} %{http_path} %{http_version}" %{http_response} %{http_bytes} "%{site}" "%{useragent}"'
  }
  # convert_datatype is a sibling of mapping, not a key inside it
  convert_datatype => {
    "http_bytes" => "int"
  }
}
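
If you would rather stay with grok for the original question (a timestamp plus one specific word), a rough sketch could look like the one below. It assumes the timestamp is in the 2018-05-17 15:47:33.149 style shown in the question's output and that it appears before the word in the message. The field names ts and event_name are only illustrative, and the exact line containing OrderCreated is not shown in the question, so the pattern may need adjusting.

```
filter {
  grok {
    # grok patterns are not anchored, so this searches anywhere in the message.
    # TIMESTAMP_ISO8601 matches "2018-05-17 15:47:33.149";
    # (?<event_name>...) is a named capture for the literal word.
    match => { "message" => "%{TIMESTAMP_ISO8601:ts}.*?(?<event_name>OrderCreated)" }
  }
}
```

Lines that do not contain both the timestamp and the word are tagged with _grokparsefailure by default, so they can be filtered out or handled separately afterwards.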
