Elasticsearch: group items by a field, except where the field value is null



For example, I have a collection:

[
  {
    id: 1,
    title: 'photo_1',
    photo_id: 10
  },
  {
    id: 2,
    title: 'photo_2',
    photo_id: 10
  },
  {
    id: 3,
    title: 'photo_3',
    photo_id: null
  },
  {
    id: 4,
    title: 'photo_4',
    photo_id: null
  }
]

I want to group by photo_id, but leave items whose photo_id is null ungrouped, to get the following collection:

[
  {
    id: 1,
    title: 'photo_1',
    photo_id: 10,
    inner_hits: ...
  },
  {
    id: 3,
    title: 'photo_3',
    photo_id: null
  },
  {
    id: 4,
    title: 'photo_4',
    photo_id: null
  }
]

Does anyone know how to do this? (Collapse groups the null values too.) I also need to use pagination and sorting.

Update:

Currently I am using this query:

query = {
  from: 0,
  size: 50,
  sort: [{created_at: :desc}],
  query: {},
  collapse: {
    field: 'photo_id',
    inner_hits: {
      name: "items"
    },
  }
}

But it also collapses the items whose photo_id is null. Is there a way to collapse only the results that have a photo_id?

I would like to get a result like this:

[
  {
    id: 1,
    title: 'photo_1',
    photo_id: 10,
    inner_hits: ...
  },
  {
    id: 3,
    title: 'photo_3',
    photo_id: null
  },
  {
    id: 4,
    title: 'photo_4',
    photo_id: null
  }
]

As I understand it, you need to group the documents by their photo_id value and also include in the result the documents whose photo_id field is null.

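To achieve this, you can combine a terms aggregation with a filter aggregation, and use a top hits aggregation inside each of them to get the matching documents back in the response.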

Search query:

{
  "size": 0,
  "aggs": {
    "agg1": {
      "terms": {
        "field": "photo_id"
      },
      "aggs": {
        "unique_not_null_photo_id": {
          "top_hits": {
            "_source": {
              "includes": [
                "photo_id",
                "title"
              ]
            },
            "size": 1
          }
        }
      }
    },
    "aggs": {
      "filter": {
        "bool": {
          "must_not": {
            "exists": {
              "field": "photo_id"
            }
          }
        }
      },
      "aggs": {
        "null_photo_id": {
          "top_hits": {
            "_source": {
              "includes": [
                "photo_id",
                "title"
              ]
            }
          }
        }
      }
    }
  }
}
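If you build the request in application code, the same body can be put together as a plain dictionary. The following is only a sketch: the helper name build_group_query and its parameters are made up for illustration and are not part of any Elasticsearch client. It also sets the sizes explicitly, because by default a terms aggregation returns only 10 buckets and a top hits aggregation returns only 3 hits, which may be fewer than you expect on a real index.

```python
import json


def build_group_query(field="photo_id", source_fields=("photo_id", "title"),
                      max_groups=10, max_null_docs=3):
    """Build the aggregation request body shown above as a Python dict.

    Hypothetical helper: the name and parameters are assumptions.
    """
    source = {"includes": list(source_fields)}
    return {
        "size": 0,  # suppress ordinary hits; only the aggregations are needed
        "aggs": {
            # One bucket per distinct non-null value of `field`.
            "agg1": {
                "terms": {"field": field, "size": max_groups},
                "aggs": {
                    "unique_not_null_photo_id": {
                        "top_hits": {"_source": source, "size": 1}
                    }
                },
            },
            # A filter aggregation (named "aggs", as in the query above)
            # that keeps only documents where `field` is missing or null.
            "aggs": {
                "filter": {"bool": {"must_not": {"exists": {"field": field}}}},
                "aggs": {
                    "null_photo_id": {
                        "top_hits": {"_source": source, "size": max_null_docs}
                    }
                },
            },
        },
    }


if __name__ == "__main__":
    # Print the body so it can be pasted into a search request.
    print(json.dumps(build_group_query(max_groups=50, max_null_docs=50), indent=2))
```

With the default arguments the sizes are just the Elasticsearch defaults spelled out, so the request behaves like the query above.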

Search result:

"aggregations" : {
  "agg1" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 0,
    "buckets" : [
      {
        "key" : 10,
        "doc_count" : 2,
        "unique_not_null_photo_id" : {
          "hits" : {
            "total" : {
              "value" : 2,
              "relation" : "eq"
            },
            "max_score" : 1.0,
            "hits" : [
              {
                "_index" : "index1",
                "_type" : "_doc",
                "_id" : "1",
                "_score" : 1.0,
                "_source" : {
                  "photo_id" : 10,
                  "title" : "photo_1"
                }
              }
            ]
          }
        }
      }
    ]
  },
  "aggs" : {
    "doc_count" : 2,
    "null_photo_id" : {
      "hits" : {
        "total" : {
          "value" : 2,
          "relation" : "eq"
        },
        "max_score" : 1.0,
        "hits" : [
          {
            "_index" : "index1",
            "_type" : "_doc",
            "_id" : "3",
            "_score" : 1.0,
            "_source" : {
              "photo_id" : null,
              "title" : "photo_3"
            }
          },
          {
            "_index" : "index1",
            "_type" : "_doc",
            "_id" : "4",
            "_score" : 1.0,
            "_source" : {
              "photo_id" : null,
              "title" : "photo_4"
            }
          }
        ]
      }
    }
  }
}

If you combine the hits from the unique_not_null_photo_id buckets with the hits from the null_photo_id bucket on the application side, you will get the desired search result.
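
Combining the two parts on the application side could look roughly like this. It is a minimal sketch that assumes the response has exactly the shape shown in the search result above. The function name merge_grouped_hits and the extra group_size key are my own additions for illustration. It does not handle sorting or pagination across the two parts, so that would still have to be done on the merged list.

```python
def merge_grouped_hits(aggregations):
    """Flatten both aggregations into one list of documents.

    Hypothetical helper: expects the "aggregations" object of the response.
    """
    merged = []
    # One representative document per non-null photo_id bucket.
    for bucket in aggregations["agg1"]["buckets"]:
        for hit in bucket["unique_not_null_photo_id"]["hits"]["hits"]:
            doc = {"id": hit["_id"], **hit["_source"]}
            doc["group_size"] = bucket["doc_count"]  # documents in this group
            merged.append(doc)
    # Every returned document whose photo_id is missing or null.
    for hit in aggregations["aggs"]["null_photo_id"]["hits"]["hits"]:
        merged.append({"id": hit["_id"], **hit["_source"]})
    return merged


# Trimmed copy of the response above, with JSON null written as None.
sample = {
    "agg1": {"buckets": [{
        "key": 10,
        "doc_count": 2,
        "unique_not_null_photo_id": {"hits": {"hits": [
            {"_id": "1", "_source": {"photo_id": 10, "title": "photo_1"}},
        ]}},
    }]},
    "aggs": {
        "doc_count": 2,
        "null_photo_id": {"hits": {"hits": [
            {"_id": "3", "_source": {"photo_id": None, "title": "photo_3"}},
            {"_id": "4", "_source": {"photo_id": None, "title": "photo_4"}},
        ]}},
    },
}

if __name__ == "__main__":
    for doc in merge_grouped_hits(sample):
        print(doc)
```

Note that the ids come back as strings ("1", "3", "4") because they are taken from the _id metadata field of each hit.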