Approach for a terms aggregation across multiple fields



I have an index of documents like this:

[
    {
        "name": "Marco",
        "city_id": 45,
        "city": "Rome"
    },
    {
        "name": "John",
        "city_id": 46,
        "city": "London"
    },
    {
        "name": "Ann",
        "city_id": 47,
        "city": "New York"
    },
    ...
]

and this aggregation:

"aggs": {
    "city": {
        "terms": {
            "field": "city"
        }
    }
}

which returns a response like this:

{
    "aggregations": {    
        "city": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 694,
            "buckets": [
                {
                    "key": "Rome",
                    "doc_count": 15126
                },
                {
                    "key": "London",
                    "doc_count": 11395
                },
                {
                    "key": "New York",
                    "doc_count": 14836
                },
                ...
          ]
        },
        ...
    }
}

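My problem is that I also need the city_id in my aggregation results. I have read that a terms aggregation cannot run on multiple fields, but I don't need to aggregate on two fields; I simply need to return another field whose value is always the same for each bucket (essentially a city/city_id pair). What is the best way to achieve this without losing performance?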

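I could create a field named city_with_id with values such as "Rome;45", "London;46", and so on, and aggregate on that field. That would work for me, because I can simply split the keys in the backend and get the IDs I need, but is it the best approach?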
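
For illustration, the backend split described above could look roughly like the following Python sketch. The function name split_city_buckets and the sample buckets are hypothetical; only the "city;id" key format comes from the question.

```python
def split_city_buckets(buckets, sep=";"):
    """Turn buckets keyed as "city;id" into rows with separate fields."""
    rows = []
    for bucket in buckets:
        # rpartition splits on the LAST separator, so a city name that
        # itself contains ";" would still parse correctly.
        city, _, city_id = bucket["key"].rpartition(sep)
        rows.append({
            "city": city,
            "city_id": int(city_id),
            "doc_count": bucket["doc_count"],
        })
    return rows


# Hypothetical buckets, shaped like a terms aggregation on city_with_id.
sample_buckets = [
    {"key": "Rome;45", "doc_count": 15126},
    {"key": "London;46", "doc_count": 11395},
]
rows = split_city_buckets(sample_buckets)
```

The cost of this approach is an extra indexed field, and it only works while each city maps to exactly one city_id.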

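One approach is to use a top_hits sub-aggregation with source filtering so that only city_id is returned, as in the example below. I don't think this will hurt performance noticeably, but you can try it on your index and measure the impact before falling back to the combined city_with_id field described in the question.

Example: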

    POST <index>/_search
    {
        "size" : 0,
        "aggs": {
            "city": {
                "terms": {
                    "field": "city"
                },
                "aggs" : {
                    "id" : {
                        "top_hits" : {
                            "_source": {
                                "include": [
                                    "city_id"
                                ]
                            },
                            "size" : 1
                        }
                    }
                }
            }
        }
    }

Result (bucket excerpt):

 {
               "key": "London",
               "doc_count": 2,
               "id": {
                  "hits": {
                     "total": 2,
                     "max_score": 1,
                     "hits": [
                        {
                           "_index": "country",
                           "_type": "city",
                           "_id": "2",
                           "_score": 1,
                           "_source": {
                              "city_id": 46
                           }
                        }
                     ]
                  }
               }
            },
            {
               "key": "New York",
               "doc_count": 1,
               "id": {
                  "hits": {
                     "total": 1,
                     "max_score": 1,
                     "hits": [
                        {
                           "_index": "country",
                           "_type": "city",
                           "_id": "3",
                           "_score": 1,
                           "_source": {
                              "city_id": 47
                           }
                        }
                     ]
                  }
               }
            },
            {
               "key": "Rome",
               "doc_count": 1,
               "id": {
                  "hits": {
                     "total": 1,
                     "max_score": 1,
                     "hits": [
                        {
                           "_index": "country",
                           "_type": "city",
                           "_id": "1",
                           "_score": 1,
                           "_source": {
                              "city_id": 45
                           }
                        }
                     ]
                  }
               }
            }
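
To get from those buckets back to a simple city to city_id mapping, a minimal Python sketch could look like this. It assumes the response has already been parsed into a dict and that the sub-aggregation is named "id" as in the request above. The function name city_ids_from_buckets is hypothetical, and the sample data is a trimmed copy of the result shown.

```python
def city_ids_from_buckets(buckets, sub_agg="id"):
    """Map each bucket key (city) to the city_id of its single top hit."""
    city_ids = {}
    for bucket in buckets:
        hits = bucket[sub_agg]["hits"]["hits"]
        if hits:  # skip buckets that returned no hit at all
            city_ids[bucket["key"]] = hits[0]["_source"]["city_id"]
    return city_ids


# Trimmed copy of the buckets above, keeping only the fields that are read.
sample_buckets = [
    {"key": "London", "doc_count": 2,
     "id": {"hits": {"hits": [{"_source": {"city_id": 46}}]}}},
    {"key": "New York", "doc_count": 1,
     "id": {"hits": {"hits": [{"_source": {"city_id": 47}}]}}},
    {"key": "Rome", "doc_count": 1,
     "id": {"hits": {"hits": [{"_source": {"city_id": 45}}]}}},
]
city_ids = city_ids_from_buckets(sample_buckets)
```

Since only one hit per bucket is requested, this relies on every document in a city sharing the same city_id, as stated in the question.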
