I am having trouble accessing a nested JSON field in Logstash (latest version).
My config file looks like this:
input {
  http {
    port => 5001
    codec => "json"
  }
}
filter {
  mutate {
    add_field => {"es_index" => "%{[statements][authority][name]}"}
  }
  mutate {
    gsub => [
      "es_index", " ", "_"
    ]
  }
  mutate {
    lowercase => ["es_index"]
  }
  ruby {
    init => "
      def remove_dots hash
        new = Hash.new
        hash.each { |k,v|
          if v.is_a? Hash
            v = remove_dots(v)
          end
          new[ k.gsub('.','_') ] = v
          if v.is_a? Array
            v.each { |elem|
              if elem.is_a? Hash
                elem = remove_dots(elem)
              end
              new[ k.gsub('.','_') ] = elem
            } unless v.nil?
          end
        } unless hash.nil?
        return new
      end
    "
    code => "
      event.instance_variable_set(:@data,remove_dots(event.to_hash))
    "
  }
}
output {
  stdout {
    codec => rubydebug
  }
  elasticsearch {
    hosts => "elasticsearch:9200"
    index => "golab-%{+YYYY.MM.dd}"
  }
}
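A side note on the remove_dots helper in the ruby filter above: for an array value it assigns each element to the same key in turn, so only the last element survives, and a one-element array is replaced by its single element. That can hide the fact that a field was originally an array. Below is a minimal sketch of a variant that keeps arrays intact, assuming that is the intended behavior. The name remove_dots_deep and the sample data are hypothetical and not part of Logstash.

```ruby
# Sketch of a dot-replacing helper that keeps arrays as arrays.
# remove_dots_deep is a hypothetical name chosen for this example.
def remove_dots_deep(value)
  case value
  when Hash
    # Rebuild the hash with '.' replaced by '_' in every key.
    value.each_with_object({}) do |(k, v), out|
      out[k.to_s.gsub('.', '_')] = remove_dots_deep(v)
    end
  when Array
    # Keep the array and clean each element recursively.
    value.map { |elem| remove_dots_deep(elem) }
  else
    # Strings, numbers, nil and so on pass through unchanged.
    value
  end
end

# Made-up sample data, only to exercise the helper.
sample = { 'a.b' => [{ 'c.d' => 1 }, { 'e.f' => 2 }], 'g' => nil }
cleaned = remove_dots_deep(sample)
puts cleaned.inspect
```

Handling the Hash and Array cases in one recursive function avoids the repeated assignment to the same key.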
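I have a filter with mutate. I want to add a field that I can use as part of the index name. When I use "%{[statements][authority][name]}", the content in the brackets is used as a literal string: the text %{[statements][authority][name]} is saved in the es_index field. Logstash seems to treat it as a plain string, but why?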
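I also tried the expression "%{statements}". It works as expected: everything in the statements field is passed to es_index. If I use "%{[statements][authority]}", something strange happens: es_index is filled with exactly the same output that "%{statements}" produces. What am I missing?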
Logstash output when using "%{[statements][authority]}":
{
"statements" => {
"verb" => {
"id" => "http://adlnet.gov/expapi/verbs/answered",
"display" => {
"en-US" => "answered"
}
},
"version" => "1.0.1",
"timestamp" => "2016-07-21T07:41:18.013880+00:00",
"object" => {
"definition" => {
"name" => {
"en-US" => "Example Activity"
},
"description" => {
"en-US" => "Example activity description"
}
},
"id" => "http://adlnet.gov/expapi/activities/example"
},
"actor" => {
"account" => {
"homePage" => "http://example.com",
"name" => "xapiguy"
},
"objectType" => "Agent"
},
"stored" => "2016-07-21T07:41:18.013880+00:00",
"authority" => {
"mbox" => "mailto:info@golab.eu",
"name" => "GoLab",
"objectType" => "Agent"
},
"id" => "0771b9bc-b1b8-4cb7-898e-93e8e5a9c550"
},
"id" => "a7e31874-780e-438a-874c-964373d219af",
"@version" => "1",
"@timestamp" => "2016-07-21T07:41:19.061Z",
"host" => "172.23.0.3",
"headers" => {
"request_method" => "POST",
"request_path" => "/",
"request_uri" => "/",
"http_version" => "HTTP/1.1",
"http_host" => "logstasher:5001",
"content_length" => "709",
"http_accept_encoding" => "gzip, deflate",
"http_accept" => "*/*",
"http_user_agent" => "python-requests/2.9.1",
"http_connection" => "close",
"content_type" => "application/json"
},
"es_index" => "{"verb":{"id":"http://adlnet.gov/expapi/verbs/answered","display":{"en-us":"answered"}},"version":"1.0.1","timestamp":"2016-07-21t07:41:18.013880+00:00","object":{"definition":{"name":{"en-us":"example_activity"},"description":{"en-us":"example_activity_description"}},"id":"http://adlnet.gov/expapi/activities/example","objecttype":"activity"},"actor":{"account":{"homepage":"http://example.com","name":"xapiguy"},"objecttype":"agent"},"stored":"2016-07-21t07:41:18.013880+00:00","authority":{"mbox":"mailto:info@golab.eu","name":"golab","objecttype":"agent"},"id":"0771b9bc-b1b8-4cb7-898e-93e8e5a9c550"}"
}
You can see that authority is part of es_index, so it was not selected as a field.
Thanks in advance.
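I found a solution. Credit goes to jpcarey (Elasticsearch forum).
I had to remove codec => "json". That results in a different data structure: statements is now an array rather than an object. So I needed to change %{[statements][authority][name]} to %{[statements][0][authority][name]}. This works without any problems.
If you follow the given link, you will also find a better implementation of my mutate filters.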
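The difference between the two field references can be illustrated in plain Ruby. This is only a sketch: lookup is a hypothetical helper that walks a path one segment at a time, it is not Logstash's actual resolver, and the event shape is an assumption based on the description above (statements as an array holding one object).

```ruby
# Assumed event shape: "statements" is an array with one object.
event = {
  'statements' => [
    { 'authority' => { 'name' => 'GoLab', 'objectType' => 'Agent' } }
  ]
}

# Hypothetical helper: follow the path segment by segment.
# A string key only resolves on a Hash, an integer index only on an Array.
def lookup(data, path)
  path.reduce(data) do |node, key|
    case node
    when Hash  then node[key]
    when Array then key.is_a?(Integer) ? node[key] : nil
    else nil
    end
  end
end

# Without the index the path stops at the array and nothing is found.
missing = lookup(event, ['statements', 'authority', 'name'])

# With the index the path reaches the nested name.
found = lookup(event, ['statements', 0, 'authority', 'name'])

# Same normalisation as the mutate filters: spaces to underscores, then lowercase.
es_index = found.gsub(' ', '_').downcase

puts missing.inspect
puts es_index
```

When a reference inside "%{...}" does not resolve, Logstash leaves the placeholder text in place, which matches the literal string that ended up in es_index in the question.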