我有带钥匙(时间,设施(的桌子。对于每个密钥,存储从传感器接收的(值(。例如:
┌─date───────┬facility┬value┬
│ 2017-09-09 │ 10002 │ 10 │
│ 2017-09-10 │ 10001 │ 12 │
│ 2017-09-12 │ 10002 │ 15 │
│ 2017-09-15 │ 10001 │ 17 │
└────────────┴────────┘─────┘
我想计算当前值与前一个值相比的增加。像这样:
┌─date───────┬facility┬─value─┬
│ 2017-09-12 │ 10002 │ 15/10 │ // 15 current, 10 previous for facility 10002
│ 2017-09-15 │ 10001 │ 17/12 │ // 17 current, 12 previous for facility 10002
└────────────┴────────┘───────┘
为了得到这个结果,我需要使用 JOIN,但正如我从文档中了解到的那样,JOIN 操作中的比较条件仅用于相等,不能使用不等式。
当然,我可以做一个笛卡尔乘积,然后在 WHERE 部分进行日期比较(获取所有小于当前日期的日期(,然后从过滤后的日期中选择最大日期。但这非常耗时和内存!
请帮我创建一个最佳查询,因为这种情况是 IOT 的典型情况。
///问题扩展////////////////////////////////////////////
我还想计算与上一时间每行的值相比的增量值(例如,1000 行中的每一行(。如果我想跟踪传感器跳跃,这种情况是典型的。例:
┌─date───────┬facility┬value┬
│ 2017-09-01 │ 10002 │ 3 │
│ 2017-09-02 │ 10001 │ 4 │
│ 2017-09-04 │ 10002 │ 1 │
│ 2017-09-05 │ 10001 │ 2 │
│ 2017-09-09 │ 10002 │ 10 │
│ 2017-09-10 │ 10001 │ 12 │
│ 2017-09-12 │ 10002 │ 15 │
│ 2017-09-15 │ 10001 │ 17 │
└────────────┴────────┘─────┘
┌─date───────┬facility┬─value─┬
│ 2017-09-04 │ 10002 │ 1/3 │ // 1 current, 3 previous for facility 10002
│ 2017-09-05 │ 10001 │ 2/4 │ // 2 current, 4 previous for facility 10001
│ 2017-09-09 │ 10002 │ 10/1 │ // 10 current, 1 previous for facility 10002
│ 2017-09-10 │ 10001 │ 12/2 │ // 12 current, 2 previous for facility 10001
│ 2017-09-12 │ 10002 │ 15/10 │ // 15 current, 10 previous for facility 10002
│ 2017-09-15 │ 10001 │ 17/12 │ // 17 current, 12 previous for facility 10001
└────────────┴────────┘───────┘
我的表具有以下结构:
CREATE TABLE SensorsLogs(
facility_id UInt64,
parameter_id UInt64,
timeSec DateTime('Asia/Yekaterinburg'),
value Float32 )
ENGINE = MergeTree()
PARTITION BY toYYYYMM(timeSec)
ORDER BY (facility_id, parameter_id, timeSec);
select groupArray(1)(date)[1], facility, groupArray(2)(value) from
(select date, facility, value from
(select '2017-09-09' date, 10002 facility, 10 value union all
select '2017-09-10' , 10001, 12 union all
select '2017-09-12' , 10002, 15 union all
select '2017-09-15' , 10001, 17 )
order by facility, date desc)
group by facility
┌─arrayElement(groupArray(1)(date), 1)─┬─facility─┬─groupArray(2)(value)─┐
│ 2017-09-15 │ 10001 │ [17,12] │
│ 2017-09-12 │ 10002 │ [15,10] │
└──────────────────────────────────────┴──────────┴──────────────────────┘
SELECT
toDate(max(time)) AS date,
facility,
argMax(value, time) AS current_value,
argMin(value, time) AS previous_value,
toString(current_value) || '/' || toString(previous_value) AS values
FROM (
SELECT *
FROM (
/* emulate the original table */
SELECT toDateTime(test_set.1) as time, test_set.2 facility, test_set.3 value
FROM (
SELECT arrayJoin([
('2017-09-09 10:11:12', 10002, 10),
('2017-09-10 11:11:12', 10001, 12),
('2017-09-12 12:11:12', 10002, 15),
('2017-09-15 13:11:12', 10001, 17),
('2017-09-02 14:11:12', 10002, 5),
('2017-09-02 15:11:12', 10001, 7),
('2017-09-01 16:11:12', 10002, 7),
('2017-09-01 17:11:12', 10001, 9)]) test_set))
/*WHERE time BETWEEN .. AND ..*/
ORDER BY facility, time DESC
LIMIT 2 BY facility)
GROUP BY facility
ORDER BY date;
/* result
┌───────date─┬─facility─┬─current_value─┬─previous_value─┬─values─┐
│ 2017-09-12 │ 10002 │ 15 │ 10 │ 15/10 │
│ 2017-09-15 │ 10001 │ 17 │ 12 │ 17/12 │
└────────────┴──────────┴───────────────┴────────────────┴────────┘
*/
查询以查找每行的上一个值(将使用相邻窗口函数(:
SELECT time, facility, value,
neighbor(facility, -1, -1) previous_facility,
(facility != previous_facility ? -1 : neighbor(value, -1, -1)) previous_value,
toString(value) || '/' || (previous_value = -1 ? 'none' : toString(previous_value)) AS values
FROM (
SELECT *
FROM (
/* emulate the original table */
SELECT toDateTime(test_set.1) as time, test_set.2 facility, test_set.3 value
FROM (
SELECT arrayJoin([
('2017-09-01 01:02:03', 10002, 3 ),
('2017-09-02 01:02:03', 10001, 4 ),
('2017-09-04 01:02:03', 10002, 1 ),
('2017-09-05 01:02:03', 10001, 2 ),
('2017-09-09 01:02:03', 10002, 10),
('2017-09-10 01:02:03', 10001, 12),
('2017-09-12 01:02:03', 10002, 15),
('2017-09-15 01:02:03', 10001, 17)]) test_set))
/*WHERE time BETWEEN .. AND ..*/
ORDER BY facility, time)
ORDER BY time;
/* result:
┌────────────────time─┬─facility─┬─value─┬─previous_facility─┬─previous_value─┬─values─┐
│ 2017-09-01 01:02:03 │ 10002 │ 3 │ 10001 │ -1 │ 3/none │
│ 2017-09-02 01:02:03 │ 10001 │ 4 │ -1 │ -1 │ 4/none │
│ 2017-09-04 01:02:03 │ 10002 │ 1 │ 10002 │ 3 │ 1/3 │
│ 2017-09-05 01:02:03 │ 10001 │ 2 │ 10001 │ 4 │ 2/4 │
│ 2017-09-09 01:02:03 │ 10002 │ 10 │ 10002 │ 1 │ 10/1 │
│ 2017-09-10 01:02:03 │ 10001 │ 12 │ 10001 │ 2 │ 12/2 │
│ 2017-09-12 01:02:03 │ 10002 │ 15 │ 10002 │ 10 │ 15/10 │
│ 2017-09-15 01:02:03 │ 10001 │ 17 │ 10001 │ 12 │ 17/12 │
└─────────────────────┴──────────┴───────┴───────────────────┴────────────────┴────────┘
*/
类似的查询,但使用运行差异:
SELECT time, facility, value,
neighbor(facility, -1, -1) previous_facility,
runningDifference(value) delta,
(facility != previous_facility ? -1 : value - delta) AS previous_value,
toString(value) || '/' || (previous_value = -1 ? 'none' : toString(previous_value)) AS values
FROM (
SELECT *
FROM (
/* emulate the original table */
SELECT toDateTime(test_set.1) as time, test_set.2 facility, test_set.3 value
FROM (
SELECT arrayJoin([
('2017-09-01 01:02:03', 10002, 3 ),
('2017-09-02 01:02:03', 10001, 4 ),
('2017-09-04 01:02:03', 10002, 1 ),
('2017-09-05 01:02:03', 10001, 2 ),
('2017-09-09 01:02:03', 10002, 10),
('2017-09-10 01:02:03', 10001, 12),
('2017-09-12 01:02:03', 10002, 15),
('2017-09-15 01:02:03', 10001, 17)]) test_set))
/*WHERE time BETWEEN .. AND ..*/
ORDER BY facility, time)
ORDER BY time;
/*
┌────────────────time─┬─facility─┬─value─┬─previous_facility─┬─delta─┬─previous_value─┬─values─┐
│ 2017-09-01 01:02:03 │ 10002 │ 3 │ 10001 │ -14 │ -1 │ 3/none │
│ 2017-09-02 01:02:03 │ 10001 │ 4 │ -1 │ 0 │ -1 │ 4/none │
│ 2017-09-04 01:02:03 │ 10002 │ 1 │ 10002 │ -2 │ 3 │ 1/3 │
│ 2017-09-05 01:02:03 │ 10001 │ 2 │ 10001 │ -2 │ 4 │ 2/4 │
│ 2017-09-09 01:02:03 │ 10002 │ 10 │ 10002 │ 9 │ 1 │ 10/1 │
│ 2017-09-10 01:02:03 │ 10001 │ 12 │ 10001 │ 10 │ 2 │ 12/2 │
│ 2017-09-12 01:02:03 │ 10002 │ 15 │ 10002 │ 5 │ 10 │ 15/10 │
│ 2017-09-15 01:02:03 │ 10001 │ 17 │ 10001 │ 5 │ 12 │ 17/12 │
└─────────────────────┴──────────┴───────┴───────────────────┴───────┴────────────────┴────────┘
*/
键(时间、设施(不太适合如上所述的计算。
如果它是主要用例,并且在所需的日期范围内运行得很慢,请考虑将密钥更改为(设施、时间(或使用数据跳过索引。