我研究了elasticsearch聚合查询,但找不到它是否支持多个聚合函数。另一个单词,我想知道Elasticsearch是否可以生成此SQL聚合查询的等效:
SELECT account_no, transaction_type, count(account_no), sum(amount), max(amount) FROM index_name GROUP BY account_no, transaction_type Having count(account_no) > 10
如果是,如何?谢谢。
有两种可能的方法可以在ES中做您要寻找的事情,而我在下面都提到了它们。
我还添加了示例映射和示例文档以供您参考。
映射:
PUT index_name
{
"mappings": {
"mydocs":{
"properties":{
"account_no":{
"type": "keyword"
},
"transaction_type":{
"type": "keyword"
},
"amount":{
"type":"double"
}
}
}
}
}
示例文档:
仔细注意,我只为1个客户创建4个交易列表。
POST index_name/mydocs/1
{
"account_no": "1011",
"transaction_type":"credit",
"amount": 200
}
POST index_name/mydocs/2
{
"account_no": "1011",
"transaction_type":"credit",
"amount": 400
}
POST index_name/mydocs/3
{
"account_no": "1011",
"transaction_type":"cheque",
"amount": 100
}
POST index_name/mydocs/4
{
"account_no": "1011",
"transaction_type":"cheque",
"amount": 100
}
有两种方法可以获取您要寻找的东西:
解决方案1:使用Elasticsearch查询DSL
聚合查询:
对于聚合查询DSL,我使用以下集合查询来解决您的需求。
- 术语汇总
- 总和查询(度量集合(
- 最大聚合查询(度量集合(
以下是查询的查询版本的汇总版本,因此您可以清楚地了解 sibling parters 。。
- Terms Aggregation (For Every Account)
- Terms Aggregation (For Every Transaction_type)
- Sum Amount
- Max Amount
以下是实际查询:
POST index_name/_search
{
"size": 0,
"aggs": {
"account_no_agg": {
"terms": {
"field": "account_no"
},
"aggs": {
"transaction_type_agg": {
"terms": {
"field": "transaction_type",
"min_doc_count": 2
},
"aggs": {
"sum_amount": {
"sum": {
"field": "amount"
}
},
"max_amount":{
"max": {
"field": "amount"
}
}
}
}
}
}
}
}
重要的是要提及的是 min_doc_count
它不过是 having count(account_no)>10
,在我的查询中,我仅过滤having count(account_no) > 2
查询响应
{
"took" : 5,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : 4,
"max_score" : 0.0,
"hits" : [ ]
},
"aggregations" : {
"account_no_agg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "1011", <---- account_no
"doc_count" : 4, <---- count(account_no)
"transaction_type_agg" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "cheque", <---- transaction_type
"doc_count" : 2,
"sum_amount" : { <---- sum(amount)
"value" : 200.0
},
"max_amount" : { <---- max(amount)
"value" : 100.0
}
},
{
"key" : "credit", <---- another transaction_type
"doc_count" : 2,
"sum_amount" : { <---- sum(amount)
"value" : 600.0
},
"max_amount" : { <---- max(amount)
"value" : 400.0
}
}
]
}
}
]
}
}
}
请注意以上结果,我在需要的任何地方都添加了评论,以帮助您想要的SQL查询部分。
解决方案2:使用Elasticsearch SQL(_xpack解决方案(
如果您使用Elasticsearch的SQL访问的XPACK功能,则可以简单地复制粘贴查询如下所述的映射和文档:
:elasticsearch sql:
POST /_xpack/sql?format=txt
{
"query": "SELECT account_no, transaction_type, sum(amount), max(amount), count(account_no) FROM index_name GROUP BY account_no, transaction_type HAVING count(account_no) > 1"
}
Elasticsearch SQL结果:
account_no |transaction_type| SUM(amount) | MAX(amount) |COUNT(account_no)
---------------+----------------+---------------+---------------+-----------------
1011 |cheque |200.0 |100.0 |2
1011 |credit |600.0 |400.0 |2
请注意,我已经在ES 6.5.4中测试了查询。
希望这会有所帮助!