Select employees with fewer than 10 leaves in a given date range



I have a document with the following structure:

{
id:1,
leaves:[
{
"reason":"",
"date":"2019-01-01"
},
{
"reason":"",
"date":"2019-04-30"
}
]
}

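`leaves` is a nested document. The document structure can be changed. I need to select employees who took fewer than 10 leaves in a given range: 2019-01-01 to 2019-05-30.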

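I tried a bucket selector aggregation, but the "min_bucket" buckets path does not point to empty buckets (which is needed for employees with no leaves in the range). I get the response below, and no records are returned.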

"max_hourly_inner" : {
"value" : null,
"keys" : [ ]
}

I came up with the query below. Aggregating on nested documents is a bit tricky, but it can be done by combining the following aggregations:

  • Terms aggregation
  • Nested aggregation
  • Date range aggregation
  • Bucket selector aggregation

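The problem I'm solving here: show the list of students who have at most 2 leaves (the script in the query uses `<= 2`) in the specified date range, i.e. from 2019-04-01 to 2019-05-30.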
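The nested aggregation only sees the individual leaves if `leaves` is mapped as a `nested` field. The question says it is, but the mapping itself is not shown, so the following is a minimal sketch of what it might look like, written as a Python dictionary. The field types for `id`, `reason`, and `date` are assumptions, as is the name `MYLEAVES_MAPPING`.

```python
import json

# Hypothetical mapping for the `myleaves` index. Only the `nested` type on
# `leaves` is implied by the post; the other field types are assumptions.
MYLEAVES_MAPPING = {
    "mappings": {
        "properties": {
            "id": {"type": "long"},
            "leaves": {
                "type": "nested",
                "properties": {
                    "reason": {"type": "text"},
                    "date": {"type": "date"},
                },
            },
        }
    }
}

# Print the JSON body that would be sent with `PUT myleaves`
# before indexing any documents.
print(json.dumps(MYLEAVES_MAPPING, indent=2))
```

The printed JSON can be pasted as the body of a `PUT myleaves` request before the sample documents are indexed.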

Sample documents:

// This student has 3 leaves overall, and all 3 are in the specified date range
POST myleaves/_doc/1
{
"id": 1001,
"leaves" : [
{
"reason" : "",
"date" : "2019-04-01"
},
{
"reason" : "",
"date" : "2019-04-29"
},
{
"reason" : "",
"date" : "2019-04-30"
}
]
}
// This student has 4 leaves, 2 of which are in the specified date range
POST myleaves/_doc/2
{
"id": 1002,
"leaves" : [
{
"reason" : "",
"date" : "2019-04-01"
},
{
"reason" : "",
"date" : "2019-04-04"
},
{
"reason" : "",
"date" : "2019-07-29"
},
{
"reason" : "",
"date" : "2019-07-30"
}
]
}
// This student has one leave, but none in the specified date range
POST myleaves/_doc/3
{
"id": 1003,
"leaves":[
{
"reason" : "",
"date" : "2019-07-29"
}
]
}
//This student has no leaves at all
POST myleaves/_doc/4
{
"id": 1004,
"leaves":[
]
}

Below is the structure of the aggregation query:

- Terms Aggregation on `id` field
- Nested Aggregation on `leaves` field
- Date Range aggregation on `leaves.date` field
- Bucket Selector Aggregation on `count`. This is the part where we specify our condition 
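- Bucket Selector Aggregation on the number of `valid_dates` buckets, keeping only students that still have exactly one bucket. (Students whose date-range bucket was dropped by the previous selector are removed from the result.)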

Aggregation query:

POST <your_index_name>/_search
{  
"size":0,
"aggs":{  
"mystudents":{  
"terms":{  
"field":"id",
"size":10
},
"aggs":{  
"mycount":{  
"nested":{  
"path":"leaves"
},
"aggs": {
"valid_dates": {
"date_range": {
"field": "leaves.date",
"ranges": [
{
"from": "2019-04-01",
"to": "2019-05-30"
}
]
},
"aggs": {
"myselector": {
"bucket_selector": {
"buckets_path": {
"myparams": "_count"
},
"script": "params.myparams <= 2"    <---- For fewer than 10 leaves, change this to params.myparams < 10
}
}
}
}
}
},
"mybucket_selector":{  
"bucket_selector":{  
"buckets_path":{  
"my_bucket_count":"mycount>valid_dates._bucket_count"
},
"script":"params.my_bucket_count == 1"
}
}
}
}
}
}

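Note the annotation in the aggregation query above. The `<----` text is not part of the request and must be removed before running it.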
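Since the threshold and the date range are the only parts that change between the 2-leave example and the 10-leave question, the request body can also be generated instead of edited by hand. The sketch below builds the same aggregation as a Python dictionary; the function name `build_leave_query` and its parameters are hypothetical, and `max_leaves` is treated as an inclusive upper bound (pass 9 for "fewer than 10").

```python
import json


def build_leave_query(start, end, max_leaves, terms_size=10):
    """Build the search body for students with at most `max_leaves` leaves
    between `start` and `end` (hypothetical helper, not an Elasticsearch API)."""
    return {
        "size": 0,
        "aggs": {
            "mystudents": {
                "terms": {"field": "id", "size": terms_size},
                "aggs": {
                    "mycount": {
                        "nested": {"path": "leaves"},
                        "aggs": {
                            "valid_dates": {
                                "date_range": {
                                    "field": "leaves.date",
                                    "ranges": [{"from": start, "to": end}],
                                },
                                "aggs": {
                                    # Drops the range bucket when it holds too many leaves.
                                    "myselector": {
                                        "bucket_selector": {
                                            "buckets_path": {"myparams": "_count"},
                                            "script": f"params.myparams <= {int(max_leaves)}",
                                        }
                                    }
                                },
                            }
                        },
                    },
                    # Keeps only students whose range bucket survived the selector above.
                    "mybucket_selector": {
                        "bucket_selector": {
                            "buckets_path": {
                                "my_bucket_count": "mycount>valid_dates._bucket_count"
                            },
                            "script": "params.my_bucket_count == 1",
                        }
                    },
                },
            }
        },
    }


# Body for the question's case: fewer than 10 leaves, 2019-01-01 to 2019-05-30.
body = build_leave_query("2019-01-01", "2019-05-30", max_leaves=9)
print(json.dumps(body, indent=2))
```

The printed JSON can be used as the body of the `_search` request above. Note that the terms aggregation still returns at most `terms_size` students, as in the original query.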

Aggregation response:

{
"took" : 2,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"mystudents" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : 1002,
"doc_count" : 1,
"mycount" : {
"doc_count" : 4,                                 <----- Total Count of Leaves 
"valid_dates" : {
"buckets" : [
{
"key" : "2019-04-01T00:00:00.000Z-2019-05-30T00:00:00.000Z",
"from" : 1.5540768E12,
"from_as_string" : "2019-04-01T00:00:00.000Z",
"to" : 1.5591744E12,
"to_as_string" : "2019-05-30T00:00:00.000Z",
"doc_count" : 2                            <------ Count of leaves in specified range
}
]
}
}
},
{
"key" : 1003,
"doc_count" : 1,
"mycount" : {
"doc_count" : 1,
"valid_dates" : {
"buckets" : [
{
"key" : "2019-04-01T00:00:00.000Z-2019-05-30T00:00:00.000Z",
"from" : 1.5540768E12,
"from_as_string" : "2019-04-01T00:00:00.000Z",
"to" : 1.5591744E12,
"to_as_string" : "2019-05-30T00:00:00.000Z",
"doc_count" : 0
}
]
}
}
},
{
"key" : 1004,
"doc_count" : 1,
"mycount" : {
"doc_count" : 0,
"valid_dates" : {
"buckets" : [
{
"key" : "2019-04-01T00:00:00.000Z-2019-05-30T00:00:00.000Z",
"from" : 1.5540768E12,
"from_as_string" : "2019-04-01T00:00:00.000Z",
"to" : 1.5591744E12,
"to_as_string" : "2019-05-30T00:00:00.000Z",
"doc_count" : 0
}
]
}
}
}
]
}
}
}

If you look at the response:

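  • 1001 does not appear, because he has more than 2 leaves in the specified date range.
  • 1002 appears, because exactly 2 of the 4 leaves he has taken fall in the specified date range.
  • 1003 and 1004 appear, because they did not take any leaves in the specified range.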

The condition selects students who took at most 2 leaves in the specified date range (including students who took no leaves at all).
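The selection logic can be checked outside Elasticsearch with a small simulation over the four sample documents. This is only a sketch of what the aggregation computes, not how Elasticsearch executes it; the names `DOCS`, `leaves_in_range`, and `select_students` are made up for illustration, and the range is treated as including the start date and excluding the end date, which is how `date_range` handles `from` and `to`.

```python
from datetime import date

# Sample documents from the answer, reduced to id and leave dates.
DOCS = [
    {"id": 1001, "leaves": ["2019-04-01", "2019-04-29", "2019-04-30"]},
    {"id": 1002, "leaves": ["2019-04-01", "2019-04-04", "2019-07-29", "2019-07-30"]},
    {"id": 1003, "leaves": ["2019-07-29"]},
    {"id": 1004, "leaves": []},
]


def leaves_in_range(doc, start, end):
    """Count leave dates d with start <= d < end."""
    return sum(start <= date.fromisoformat(d) < end for d in doc["leaves"])


def select_students(docs, start, end, max_leaves):
    """Return the ids whose in-range leave count is at most max_leaves."""
    return [
        doc["id"]
        for doc in docs
        if leaves_in_range(doc, start, end) <= max_leaves
    ]


selected = select_students(DOCS, date(2019, 4, 1), date(2019, 5, 30), max_leaves=2)
print(selected)  # [1002, 1003, 1004]
```

The result matches the buckets in the aggregation response: 1001 is excluded with 3 leaves in range, while 1002 (2 leaves), 1003 and 1004 (0 leaves each) are kept.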

Hope this helps!
