Druid - descending timestamps with a groupBy query



What I'm asking for should be very simple, but the Druid documentation has almost no information about it.

I'm building a groupBy query, and the data is very large, so I'm "paginating" it by increasing limitSpec.limit on each subsequent query.

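By default, the returned array starts at the start timestamp and moves forward in time. I want the results to start at the end timestamp and move backward from there.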

Does anyone know how to do this?

In other words, by default the result of a groupBy query looks like this:

[
  {
    "version" : "v1",
    "timestamp" : "2012-01-01T00:00:00.000Z",
    "event" : {
      "total_usage" : <some_value_one>
    }
  },
  {
    "version" : "v1",
    "timestamp" : "2012-01-02T00:00:00.000Z",
    "event" : {
      "total_usage" : <some_value_two>
    }
  }
]

Whereas I would like it to look like this:

[
  {
    "version" : "v1",
    "timestamp" : "2012-01-02T00:00:00.000Z",
    "event" : {
      "total_usage" : <some_value_two>
    }
  },
  {
    "version" : "v1",
    "timestamp" : "2012-01-01T00:00:00.000Z",
    "event" : {
      "total_usage" : <some_value_one>
    }
  }
]

You can use the "columns" property of the limitSpec to control the sort order. See the following example.

{
  "type"    : "default",
  "limit"   : <integer_value>,
  "columns" : [list of OrderByColumnSpec]
}
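
To make the "columns" placeholder above more concrete, here is a minimal Python sketch that fills it in and prints the resulting limitSpec as JSON. The helper names order_by and limit_spec are hypothetical, and the total_usage column is borrowed from the question. The field names (dimension, direction, dimensionOrder) are the ones used elsewhere on this page, and the "lexicographic" default is an assumption about a reasonable ordering.

```python
import json


def order_by(dimension, direction="descending", dimension_order="lexicographic"):
    """Build one OrderByColumnSpec entry as a plain dict (hypothetical helper)."""
    if direction not in ("ascending", "descending"):
        raise ValueError(f"unsupported direction: {direction!r}")
    return {
        "dimension": dimension,
        "direction": direction,
        "dimensionOrder": dimension_order,
    }


def limit_spec(limit, columns):
    """Build a limitSpec of type "default" from a list of order-by entries."""
    return {"type": "default", "limit": int(limit), "columns": list(columns)}


# Sort by the aggregated value from the question, largest first.
spec = limit_spec(1000, [order_by("total_usage", dimension_order="numeric")])
print(json.dumps(spec, indent=2))
```

The printed object can be pasted into the "limitSpec" field of a groupBy query.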

For more details, see the Druid documentation: http://druid.io/docs/latest/querying/limitspec.html

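You can add the timestamp as a dimension, truncated to the date (assuming you use day granularity in the query), and force Druid to sort the results by dimension values first and then by timestamp.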

Example query:

{
  "dataSource": "your_datasource",
  "queryType": "groupBy",
  "dimensions": [
    {
      "type": "default",
      "dimension": "some_dimension_in",
      "outputName": "some_dimension_out",
      "outputType": "STRING"
    },
    {
      "type": "extraction",
      "dimension": "__time",
      "outputName": "__timestamp",
      "extractionFn": {
        "type": "timeFormat",
        "format" : "yyyy-MM-dd"
      }
    }
  ],
  "aggregations": [
    {
      "type": "doubleSum",
      "name": "some_metric",
      "fieldName": "some_metric_field"
    }
  ],
  "limitSpec": {
    "type": "default",
    "limit": 1000,
    "columns": [
      {
        "dimension": "__timestamp",
        "direction": "descending",
        "dimensionOrder": "numeric"
      },
      {
        "dimension": "some_metric",
        "direction": "descending",
        "dimensionOrder": "numeric"
      }
    ]
  },
  "intervals": [
    "2019-09-01/2019-10-01"
  ],
  "granularity": "day",
  "context": {
    "sortByDimsFirst": "true"
  }
}
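
The query above can be combined with the "grow the limit" pagination described in the question. The following Python sketch only builds the query bodies and slices the results; it does not talk to Druid. BASE_QUERY is a trimmed copy of the example query, the helpers query_for_page and rows_for_page are hypothetical names, and the integer list at the end is a stand-in for real result rows.

```python
import copy

# Trimmed copy of the example query; only the parts needed for ordering are kept.
BASE_QUERY = {
    "dataSource": "your_datasource",
    "queryType": "groupBy",
    "dimensions": [
        {
            "type": "extraction",
            "dimension": "__time",
            "outputName": "__timestamp",
            "extractionFn": {"type": "timeFormat", "format": "yyyy-MM-dd"},
        }
    ],
    "aggregations": [
        {"type": "doubleSum", "name": "some_metric", "fieldName": "some_metric_field"}
    ],
    "limitSpec": {
        "type": "default",
        "limit": 1000,
        "columns": [
            {"dimension": "__timestamp", "direction": "descending", "dimensionOrder": "numeric"}
        ],
    },
    "intervals": ["2019-09-01/2019-10-01"],
    "granularity": "day",
    "context": {"sortByDimsFirst": "true"},
}


def query_for_page(page, page_size=1000):
    """Return a copy of BASE_QUERY whose limit covers pages 0..page (0-based)."""
    query = copy.deepcopy(BASE_QUERY)
    query["limitSpec"]["limit"] = (page + 1) * page_size
    return query


def rows_for_page(rows, page, page_size=1000):
    """Drop the rows that earlier pages already returned and keep this page only."""
    return rows[page * page_size:(page + 1) * page_size]


# Stand-in for a response: 25 rows, page size 10, third page (index 2).
fake_rows = list(range(25))
print(query_for_page(2, page_size=10)["limitSpec"]["limit"])
print(rows_for_page(fake_rows, 2, page_size=10))
```

Because the limit grows with each request, the response for a given page also contains the rows of all earlier pages, which is why rows_for_page slices them off.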
