ES aggregation query: fetch parent documents based on a condition over aggregated child-document results



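I'm using ES v7.3 and my index has a parent-child (join) mapping. I want to select parent documents based on the result of an aggregation applied to their child documents, but I haven't been able to do it. Here is what I have for reference: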

Index mapping:

PUT example
{
  "mappings": {
    "properties": {
      "join": {
        "type": "join",
        "relations": {
          "user": "session"
        }
      }
    }
  }
}

Parent document:

PUT example/_doc/1
{
  "join": {
    "name": "user"
  },
  "type": "identify",
  "device": "xx",
  "profileId": "1052210",
  "updatedAt": "2020-12-30T17:06:22.851Z"
}

First child document:

PUT example/_doc/2?routing=1
{
  "join": {
    "name": "session",
    "parent": "1"
  },
  "page_view_count": 10,
  "creation_date": "2020-12-30T13:45:37.851Z"
}

Second child document:

PUT example/_doc/3?routing=1
{
  "join": {
    "name": "session",
    "parent": "1"
  },
  "page_view_count": 20,
  "creation_date": "2020-12-30T13:45:37.851Z"
}

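Requirement:

  1. Users with page_view_count > 25: we want to sum page_view_count across each user's child documents and check whether the total is greater than 25. If it is, the parent document should be returned in the response; otherwise it should not.

Note: a child document is created for each user session, so the count is maintained in the session documents.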
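For comparison, the requirement can also be written as a query instead of an aggregation, so that the parent documents come back as ordinary hits. The following is an untested sketch, not a confirmed solution: it assumes page_view_count is never negative, it ignores any time window, and it relies on has_child summing child scores (score_mode "sum") after function_score replaces each child's score with its page_view_count. The top-level min_score is inclusive, so 25 means "sum >= 25"; for a strict "> 25" on integer counts, 26 would be the equivalent threshold.

```json
POST example/_search
{
  "min_score": 25,
  "query": {
    "has_child": {
      "type": "session",
      "score_mode": "sum",
      "query": {
        "function_score": {
          "query": { "match_all": {} },
          "field_value_factor": {
            "field": "page_view_count",
            "missing": 0
          },
          "boost_mode": "replace"
        }
      }
    }
  }
}
```

With the two sample sessions above (10 and 20), parent document 1 would get a score of 30 and pass the threshold. The trade-off is that the summed count is only available as _score, not as a named field.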

Here is a solution that returns only the profileId:

POST example/_search
{
  "size": 0,
  "aggs": {
    "users": {
      "terms": {
        "field": "profileId.keyword",
        "size": 10
      },
      "aggs": {
        "sessions": {
          "children": {
            "type": "session"
          },
          "aggs": {
            "last_7_days": {
              "filter": {
                "range": {
                  "creation_date": {
                    "gte": "now-7d",
                    "lte": "now"
                  }
                }
              },
              "aggs": {
                "page_view_count": {
                  "sum": {
                    "field": "page_view_count"
                  }
                }
              }
            }
          }
        },
        "page_view_count_filter": {
          "bucket_selector": {
            "buckets_path": {
              "viewCount": "sessions > last_7_days > page_view_count"
            },
            "script": "params.viewCount >= 25"
          }
        }
      }
    }
  }
}
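One possible way to get the parent document itself out of this aggregation, rather than only the profileId key, is to add a top_hits sub-aggregation inside each users bucket. This is a sketch under assumptions: the name user_docs is made up for illustration, and it assumes each profileId belongs to exactly one parent document, so "size": 1 is enough. Everything else is the same aggregation as above, written more compactly.

```json
POST example/_search
{
  "size": 0,
  "aggs": {
    "users": {
      "terms": { "field": "profileId.keyword", "size": 10 },
      "aggs": {
        "user_docs": {
          "top_hits": { "size": 1 }
        },
        "sessions": {
          "children": { "type": "session" },
          "aggs": {
            "last_7_days": {
              "filter": {
                "range": {
                  "creation_date": { "gte": "now-7d", "lte": "now" }
                }
              },
              "aggs": {
                "page_view_count": {
                  "sum": { "field": "page_view_count" }
                }
              }
            }
          }
        },
        "page_view_count_filter": {
          "bucket_selector": {
            "buckets_path": {
              "viewCount": "sessions>last_7_days>page_view_count"
            },
            "script": "params.viewCount >= 25"
          }
        }
      }
    }
  }
}
```

Because only parent documents carry profileId, the hits under user_docs should be the user documents. Buckets that fail the bucket_selector condition are dropped together with their user_docs hits, so only qualifying parents remain in the response.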
