ElasticSearch: Max aggregation on the lowest value inside a list property of the document

I want to run a Max aggregation on a property of my documents. The property is a list of complex objects (key and value). Here is my data:

[
  {
    "id": "1",
    "listItems": [
      { "key": "li1", "value": 100 },
      { "key": "li2", "value": 5000 }
    ]
  },
  {
    "id": "2",
    "listItems": [
      { "key": "li3", "value": 200 },
      { "key": "li2", "value": 2000 }
    ]
  }
]

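When I run a nested Max aggregation on "listItems.value", I want the returned maximum to be 200 (and not 5000). The reason is that I want the logic to first compute the MIN value under listItems for each document, and then take the Max over those minimums. Is it possible to do something like this?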

Thanks.

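The search query performs the following aggregations:

Terms aggregation on the id field
Min aggregation on listItems.value
Max bucket aggregation, a sibling pipeline aggregation that identifies the bucket(s) with the maximum value of a specified metric in a sibling aggregation and outputs both the value and the key(s) of the bucket(s).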
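To make the intent of these three steps concrete, here is a minimal sketch that emulates them in plain Python on the sample data from the question. It does not talk to Elasticsearch; the function name `max_of_per_doc_min` is hypothetical and exists only for this illustration.

```python
# Sample documents copied from the question.
docs = [
    {"id": "1", "listItems": [{"key": "li1", "value": 100},
                              {"key": "li2", "value": 5000}]},
    {"id": "2", "listItems": [{"key": "li3", "value": 200},
                              {"key": "li2", "value": 2000}]},
]


def max_of_per_doc_min(documents):
    """Emulate terms(id) -> min(listItems.value) -> max_bucket (illustration only)."""
    # Terms + min: one bucket per id, holding the smallest nested value seen.
    min_per_bucket = {}
    for doc in documents:
        values = [item["value"] for item in doc["listItems"]]
        if not values:
            continue  # a document without list items contributes no minimum
        lowest = min(values)
        previous = min_per_bucket.get(doc["id"])
        min_per_bucket[doc["id"]] = lowest if previous is None else min(previous, lowest)

    # Max bucket: report the largest minimum and the key(s) of the bucket(s) holding it.
    best = max(min_per_bucket.values())
    keys = [key for key, value in min_per_bucket.items() if value == best]
    return {"value": best, "keys": keys}


# A plain max over all nested values, which is what the question wants to avoid.
plain_max = max(item["value"] for doc in docs for item in doc["listItems"])
result = max_of_per_doc_min(docs)

print(plain_max)  # 5000
print(result)     # {'value': 200, 'keys': ['2']}
```

One caveat when moving from this sketch to a real request: the terms aggregation returns only the top 10 buckets by default, so with more than 10 distinct ids its `size` would likely need to be raised for the max bucket step to see every document.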

See the nested aggregation documentation for a detailed explanation of it.

Adding a working example with index data, index mapping, search query, and search result.

Index Mapping:

{
  "mappings": {
    "properties": {
      "listItems": {
        "type": "nested"
      },
      "id": {
        "type": "text",
        "fielddata": "true"
      }
    }
  }
}

Index Data:

{
  "id": "1",
  "listItems": [
    { "key": "li1", "value": 100 },
    { "key": "li2", "value": 5000 }
  ]
}
{
  "id": "2",
  "listItems": [
    { "key": "li3", "value": 200 },
    { "key": "li2", "value": 2000 }
  ]
}

Search Query:

{
  "size": 0,
  "aggs": {
    "id_terms": {
      "terms": {
        "field": "id"
      },
      "aggs": {
        "nested_entries": {
          "nested": {
            "path": "listItems"
          },
          "aggs": {
            "min_position": {
              "min": {
                "field": "listItems.value"
              }
            }
          }
        }
      }
    },
    "maxValue": {
      "max_bucket": {
        "buckets_path": "id_terms>nested_entries>min_position"
      }
    }
  }
}

Search Result:

"aggregations": {
  "id_terms": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": "1",
        "doc_count": 1,
        "nested_entries": {
          "doc_count": 2,
          "min_position": {
            "value": 100.0
          }
        }
      },
      {
        "key": "2",
        "doc_count": 1,
        "nested_entries": {
          "doc_count": 2,
          "min_position": {
            "value": 200.0
          }
        }
      }
    ]
  },
  "maxValue": {
    "value": 200.0,
    "keys": [
      "2"
    ]
  }
}

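The original post mentioned a nested aggregation, so I was sure the question was about nested documents. Since I arrived at this solution before seeing the other answer, I am keeping the whole thing, but it really differs only in adding the nested aggregation.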

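The whole process can be explained like this:

Bucket each document into its own bucket
Use a nested aggregation to be able to aggregate on the nested documents
Use a min aggregation to find the minimum value among all of a document's nested documents, which gives the minimum for the document itself
Finally, use another aggregation to calculate the maximum value among the results of the previous aggregation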

Given this setup:

// PUT /index
{
  "mappings": {
    "properties": {
      "children": {
        "type": "nested",
        "properties": {
          "value": {
            "type": "integer"
          }
        }
      }
    }
  }
}

// POST /index/_doc
{
  "children": [
    { "value": 12 },
    { "value": 45 }
  ]
}

// POST /index/_doc
{
  "children": [
    { "value": 7 },
    { "value": 35 }
  ]
}

I can use these aggregations in the request to get the required value:

{
  "size": 0,
  "aggs": {
    "document": {
      "terms": { "field": "_id" },
      "aggs": {
        "children": {
          "nested": {
            "path": "children"
          },
          "aggs": {
            "minimum": {
              "min": {
                "field": "children.value"
              }
            }
          }
        }
      }
    },
    "result": {
      "max_bucket": {
        "buckets_path": "document>children>minimum"
      }
    }
  }
}

{
  "aggregations": {
    "document": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "O4QxyHQBK5VO9CW5xJGl",
          "doc_count": 1,
          "children": {
            "doc_count": 2,
            "minimum": {
              "value": 7.0
            }
          }
        },
        {
          "key": "OoQxyHQBK5VO9CW5kpEc",
          "doc_count": 1,
          "children": {
            "doc_count": 2,
            "minimum": {
              "value": 12.0
            }
          }
        }
      ]
    },
    "result": {
      "value": 12.0,
      "keys": [
        "OoQxyHQBK5VO9CW5kpEc"
      ]
    }
  }
}
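The detail that is easiest to get wrong in a request like this is `buckets_path`: it has to list the aggregation names in their nesting order, joined with `>`. As a rough sketch, the request body can be generated from a single tuple of names so the two cannot drift apart. The helper `build_max_of_min_request` and its parameters are hypothetical, and the code only builds the JSON body; it does not send anything to a cluster.

```python
import json


def build_max_of_min_request(nested_path, value_field, bucket_field="_id"):
    """Build a 'max over per-document minimum' request body (illustrative helper)."""
    # Aggregation names are arbitrary; buckets_path must repeat them in nesting order.
    names = ("document", "children", "minimum")
    return {
        "size": 0,
        "aggs": {
            names[0]: {
                "terms": {"field": bucket_field},
                "aggs": {
                    names[1]: {
                        "nested": {"path": nested_path},
                        "aggs": {
                            names[2]: {
                                "min": {"field": f"{nested_path}.{value_field}"}
                            }
                        },
                    }
                },
            },
            # Sibling pipeline aggregation reading the per-bucket minimum.
            "result": {"max_bucket": {"buckets_path": ">".join(names)}},
        },
    }


body = build_max_of_min_request("children", "value")
print(json.dumps(body, indent=2))
```

Called with `"children"` and `"value"`, this yields the same structure as the request above; passing `"listItems"`, `"value"`, and `bucket_field="id"` would adapt it to the data from the question.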

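There should also be a workaround that uses a script to calculate the max: all you would need to do in such a script is find and return the smallest value in the document.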
