I have a document with the following structure:
{
"id": 1,
leaves:[
{
"reason":"",
"date":"2019-01-01"
},
{
"reason":"",
"date":"2019-04-30"
}
]
}
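`leaves` is a nested field. The document structure can be changed if needed. I need to select the employees who took fewer than 10 leaves in a given date range, 2019-01-01 to 2019-05-30.
I tried a bucket selector aggregation, but the "min_bucket" buckets path does not pick up empty buckets (which I need for employees with no leaves in the range). I get the response below, and no records are returned.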
"max_hourly_inner" : {
"value" : null,
"keys" : [ ]
}
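I came up with the query below. Aggregating on nested fields is a bit tricky, but it can be done by combining the following aggregations:
- Terms aggregation
- Nested aggregation
- Date range aggregation
- Bucket selector aggregation
The use case I'm solving is: show me the list of students who have at most 2 leaves in the specified date range, i.e. from 2019-04-01 to 2019-05-30.
Sample documents: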
// This student has 3 leaves overall, all 3 in the specified date range
POST myleaves/_doc/1
{
"id": 1001,
"leaves" : [
{
"reason" : "",
"date" : "2019-04-01"
},
{
"reason" : "",
"date" : "2019-04-29"
},
{
"reason" : "",
"date" : "2019-04-30"
}
]
}
// This student has 4 leaves, 2 of which are in the specified date range
POST myleaves/_doc/2
{
"id": 1002,
"leaves" : [
{
"reason" : "",
"date" : "2019-04-01"
},
{
"reason" : "",
"date" : "2019-04-04"
},
{
"reason" : "",
"date" : "2019-07-29"
},
{
"reason" : "",
"date" : "2019-07-30"
}
]
}
// This student has 1 leave, which is outside the specified date range
POST myleaves/_doc/3
{
"id": 1003,
"leaves":[
{
"reason" : "",
"date" : "2019-07-29"
}
]
}
// This student has no leaves at all
POST myleaves/_doc/4
{
"id": 1004,
"leaves":[
]
}
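The nested aggregation used in this answer only works if `leaves` is mapped as a `nested` field, and the post does not show the mapping. The sketch below builds one possible mapping body and prints it as JSON, so it can be sent with `PUT myleaves` before indexing the sample documents. It assumes Elasticsearch 7.x (typeless mappings); the types chosen for `id` and `reason` are assumptions, not something stated in the original post.

```python
import json

# Assumed mapping for the `myleaves` index; the original post does not show one.
# `leaves` has to be of type "nested" so that each leave is indexed as its own
# hidden document, which is what the nested aggregation iterates over.
mapping = {
    "mappings": {
        "properties": {
            "id": {"type": "long"},                  # assumption: numeric student id
            "leaves": {
                "type": "nested",
                "properties": {
                    "reason": {"type": "keyword"},   # assumption: short free-form label
                    "date": {"type": "date"},
                },
            },
        }
    }
}

# Print the body to send with `PUT myleaves`.
print(json.dumps(mapping, indent=2))
```

If the index already exists with `leaves` mapped as a plain object, the mapping cannot be changed in place; the data would need to be reindexed into a new index.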
Below is the structure of the aggregation query:
- Terms Aggregation on `id` field
- Nested Aggregation on `leaves` field
- Date Range aggregation on `leaves.date` field
- Bucket Selector Aggregation on `count`. This is the part where we specify our condition
- Bucket Selector Aggregation on `_bucket_count` to keep only the students whose date range bucket survived the previous selector. (This drops students whose `valid_dates` bucket list is empty.)
Aggregation query:
POST <your_index_name>/_search
{
"size":0,
"aggs":{
"mystudents":{
"terms":{
"field":"id",
"size":10
},
"aggs":{
"mycount":{
"nested":{
"path":"leaves"
},
"aggs": {
"valid_dates": {
"date_range": {
"field": "leaves.date",
"ranges": [
{
"from": "2019-04-01",
"to": "2019-05-30"
}
]
},
"aggs": {
"myselector": {
"bucket_selector": {
"buckets_path": {
"myparams": "_count"
},
"script": "params.myparams <= 2" <---- For "fewer than 10 leaves", change this to params.myparams < 10
}
}
}
}
}
},
"mybucket_selector":{
"bucket_selector":{
"buckets_path":{
"my_bucket_count":"mycount>valid_dates._bucket_count"
},
"script":"params.my_bucket_count == 1"
}
}
}
}
}
}
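Note the inline annotation (marked `<----`) in the aggregation query above; it is not part of the request and must be removed before running it.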
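To adapt the query to the original question (fewer than 10 leaves between 2019-01-01 and 2019-05-30), only the date range and the selector script need to change. The sketch below wraps the same request body in a hypothetical helper, `build_leave_query`, whose name and parameters are illustrative and not part of any Elasticsearch client. Two points to keep in mind: the `to` bound of a `date_range` aggregation is exclusive, so the example uses 2019-05-31 on the assumption that 2019-05-30 should be counted, and the `size` of the terms aggregation caps how many students are returned.

```python
import json

def build_leave_query(date_from, date_to, condition, max_students=10):
    """Build the search body shown above with a configurable range and condition.

    `condition` is a Painless expression over `params.myparams`, i.e. the
    number of leaves that fall inside the date range.
    """
    return {
        "size": 0,
        "aggs": {
            "mystudents": {
                "terms": {"field": "id", "size": max_students},
                "aggs": {
                    "mycount": {
                        "nested": {"path": "leaves"},
                        "aggs": {
                            "valid_dates": {
                                "date_range": {
                                    "field": "leaves.date",
                                    # `from` is inclusive, `to` is exclusive.
                                    "ranges": [{"from": date_from, "to": date_to}],
                                },
                                "aggs": {
                                    "myselector": {
                                        "bucket_selector": {
                                            "buckets_path": {"myparams": "_count"},
                                            "script": condition,
                                        }
                                    }
                                },
                            }
                        },
                    },
                    "mybucket_selector": {
                        "bucket_selector": {
                            "buckets_path": {
                                "my_bucket_count": "mycount>valid_dates._bucket_count"
                            },
                            "script": "params.my_bucket_count == 1",
                        }
                    },
                },
            }
        },
    }

# Fewer than 10 leaves from 2019-01-01 through 2019-05-30 (assumed inclusive,
# hence the exclusive upper bound of 2019-05-31).
body = build_leave_query("2019-01-01", "2019-05-31", "params.myparams < 10")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of `POST <your_index_name>/_search`. If there are more than 10 distinct ids, `max_students` would have to be raised accordingly.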
Aggregation response:
{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"mystudents" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 1002,
"doc_count" : 1,
"mycount" : {
"doc_count" : 4, <----- Total Count of Leaves
"valid_dates" : {
"buckets" : [
{
"key" : "2019-04-01T00:00:00.000Z-2019-05-30T00:00:00.000Z",
"from" : 1.5540768E12,
"from_as_string" : "2019-04-01T00:00:00.000Z",
"to" : 1.5591744E12,
"to_as_string" : "2019-05-30T00:00:00.000Z",
"doc_count" : 2 <------ Count of leaves in specified range
}
]
}
}
},
{
"key" : 1003,
"doc_count" : 1,
"mycount" : {
"doc_count" : 1,
"valid_dates" : {
"buckets" : [
{
"key" : "2019-04-01T00:00:00.000Z-2019-05-30T00:00:00.000Z",
"from" : 1.5540768E12,
"from_as_string" : "2019-04-01T00:00:00.000Z",
"to" : 1.5591744E12,
"to_as_string" : "2019-05-30T00:00:00.000Z",
"doc_count" : 0
}
]
}
}
},
{
"key" : 1004,
"doc_count" : 1,
"mycount" : {
"doc_count" : 0,
"valid_dates" : {
"buckets" : [
{
"key" : "2019-04-01T00:00:00.000Z-2019-05-30T00:00:00.000Z",
"from" : 1.5540768E12,
"from_as_string" : "2019-04-01T00:00:00.000Z",
"to" : 1.5591744E12,
"to_as_string" : "2019-05-30T00:00:00.000Z",
"doc_count" : 0
}
]
}
}
}
]
}
}
}
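If you look at the response:
- `1001` does not appear because he has more than 2 leaves in the specified date range.
- `1002` appears because exactly 2 of his 4 leaves fall in the specified date range.
- `1003` and `1004` appear because they took no leaves in the specified range.
In other words, the condition selects the students who took at most 2 leaves in the specified date range, including students who took no leaves at all.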
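As a sanity check, the same selection rule can be reproduced outside Elasticsearch. The sketch below is plain Python over the four sample documents and only mirrors the logic of the aggregation (count the leaves per student inside the range, keep the students at or below the limit); the function names are illustrative.

```python
from datetime import date

# Leave dates of the four sample documents, keyed by student id.
students = {
    1001: ["2019-04-01", "2019-04-29", "2019-04-30"],
    1002: ["2019-04-01", "2019-04-04", "2019-07-29", "2019-07-30"],
    1003: ["2019-07-29"],
    1004: [],
}

def leaves_in_range(dates, start, end):
    """Count leaves with start <= date < end (same bounds as date_range)."""
    return sum(1 for d in dates if start <= date.fromisoformat(d) < end)

def select_students(students, start, end, max_leaves):
    """Return the ids of students with at most `max_leaves` leaves in the range."""
    return sorted(
        sid
        for sid, dates in students.items()
        if leaves_in_range(dates, start, end) <= max_leaves
    )

selected = select_students(students, date(2019, 4, 1), date(2019, 5, 30), 2)
print(selected)  # [1002, 1003, 1004]
```

The result matches the buckets in the aggregation response: `1001` is excluded with 3 leaves in range, while `1003` and `1004` are kept with a count of 0.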
Hope this helps!