Ruby filter plugin creates two records for a single input JSON



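I have two conf files that load data from two JSON files, testOrders and testItems, into the same index; each file contains only one document. I am trying to create a parent-child relationship between the two documents.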

Here is my conf for testOrders:

input {
  file {
    path => ["/path_data/testOrders.json"]
    type => "json"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  json {
    source => "message"
    target => "testorders_collection"
    remove_field => [ "message" ]
  }
  ruby {
    code => "
      event.set('[my_join_field][name]', 'testorders')
    "
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "testorder"
    document_id => "%{[testorders_collection][eId]}"
    routing => "%{[testorders_collection][eId]}"
  }
}

Here is the conf for testItems:

input {
  file {
    path => ["/path_to_data/testItems.json"]
    type => "json"
    start_position => "beginning"
    sincedb_path => "/dev/null"
  }
}
filter {
  json {
    source => "message"
    target => "test_collection"
    remove_field => [ "message" ]
  }
}
filter {
  ruby {
    code => "
      event.set('[my_join_field][name]', 'testItems')
      event.set('[my_join_field][parent]', event.get('[test_collection][foreignKeyId]'))
    "
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "testorder"
    document_id => "%{[test_collection][eId]}"
    routing => "%{[test_collection][foreignKeyId]}"
  }
}
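
For reference, the body of the ruby filter above could be guarded so that an event without a foreign key is tagged instead of receiving a null parent. This is only a sketch under assumptions: `FakeEvent` is a hypothetical stand-in for the Logstash event API (only `get`, `set` and `tag` are imitated) so that the logic can be run outside Logstash, and `join_item` and the `_missing_parent` tag are illustrative names that do not appear in the original conf.

```ruby
# Hypothetical stand-in for a Logstash event; not part of Logstash itself.
class FakeEvent
  def initialize(data = {})
    @data = data
  end

  # Turn a field reference such as '[a][b]' into ['a', 'b'].
  def path(ref)
    keys = ref.scan(/\[([^\]]+)\]/).flatten
    keys.empty? ? [ref] : keys
  end

  def get(ref)
    path(ref).reduce(@data) { |node, key| node.is_a?(Hash) ? node[key] : nil }
  end

  def set(ref, value)
    keys = path(ref)
    leaf = keys.pop
    node = keys.reduce(@data) do |current, key|
      current[key] = {} unless current[key].is_a?(Hash)
      current[key]
    end
    node[leaf] = value
  end

  def tag(name)
    (@data['tags'] ||= []) << name
  end
end

# Same logic as the ruby filter in the testItems conf, plus a nil check.
def join_item(event)
  parent_id = event.get('[test_collection][foreignKeyId]')
  if parent_id.nil?
    # No parsed item data: mark the event rather than writing a null parent.
    event.tag('_missing_parent')
    return event
  end
  event.set('[my_join_field][name]', 'testItems')
  event.set('[my_join_field][parent]', parent_id)
  event
end

parsed = join_item(FakeEvent.new('test_collection' => { 'eId' => 'i1', 'foreignKeyId' => 'o1' }))
unparsed = join_item(FakeEvent.new('test_collection' => nil))
puts parsed.get('[my_join_field][parent]').inspect
puts unparsed.get('[tags]').inspect
```

Inside Logstash, only the body of `join_item` would go into the `code` option, and tagged events could then be routed away from the elasticsearch output with a conditional.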

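As expected, Logstash creates 1 record for testOrders, but it creates 2 records for testItems, even though testOrders and testItems each supply only one JSON document. One document is created correctly with its data, while the other is a duplicate that appears to contain no data. The document created with unparsed data looks like this: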

{
  "_index": "testorder",
  "_type": "doc",
  "_id": "%{[test_collection][eId]}",
  "_score": 1,
  "_routing": "%{[test_collection][foreignKeyId]}",
  "_source": {
    "type": "json",
    "@timestamp": "2018-07-10T04:15:58.494Z",
    "host": "<hidden>",
    "test_collection": null,
    "my_join_field": {
      "name": "testItems",
      "parent": null
    },
    "path": "/path_to_data/testItems.json",
    "@version": "1"
  }
}
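
The literal `%{[test_collection][eId]}` in `_id` and `_routing` is consistent with how Logstash treats field references that cannot be resolved: the placeholder is left in the string as-is. The sketch below imitates that behavior in plain Ruby; `interpolate` is a hypothetical helper written for illustration, not the Logstash implementation.

```ruby
# Simplified imitation of %{[field][subfield]} interpolation:
# a reference that resolves to nil is left in the string unchanged.
def interpolate(template, data)
  template.gsub(/%\{((?:\[[^\]]+\])+)\}/) do |match|
    keys = Regexp.last_match(1).scan(/\[([^\]]+)\]/).flatten
    value = keys.reduce(data) { |node, key| node.is_a?(Hash) ? node[key] : nil }
    value.nil? ? match : value.to_s
  end
end

# An event whose test_collection was parsed, and one where it is null.
puts interpolate('%{[test_collection][eId]}', 'test_collection' => { 'eId' => 'i1' })
puts interpolate('%{[test_collection][eId]}', 'test_collection' => nil)
```

With a null `test_collection`, as in the document shown, the template comes back untouched, which is why the placeholder text itself ends up as the document id.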

Defining the join relation in the Elasticsearch mapping solved the problem. This is how the relation is defined:

PUT fulfillmentorder
{
  "mappings": {
    "doc": {
      "properties": {
        "my_join_field": {
          "type": "join",
          "relations": {
            "fulfillmentorders": "orderlineitems"
          }
        }
      }
    }
  }
}
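
The mapping above uses the names fulfillmentorder, fulfillmentorders and orderlineitems, while the confs in the question write to the testorder index with the join names testorders and testItems. Since Elasticsearch only accepts a join field `name` that is declared in `relations`, the equivalent request body for the test index would presumably use the names from the confs. A small sketch that builds such a body follows; `join_mapping` is a hypothetical helper, and the `doc` type is taken from the `_type` of the indexed document shown earlier.

```ruby
require 'json'

# Build a mapping body with a single parent => child join relation.
def join_mapping(parent, child, type: 'doc', field: 'my_join_field')
  {
    'mappings' => {
      type => {
        'properties' => {
          field => {
            'type' => 'join',
            'relations' => { parent => child }
          }
        }
      }
    }
  }
end

# Body that could be sent with PUT testorder (names taken from the confs).
puts JSON.pretty_generate(join_mapping('testorders', 'testItems'))
```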
