获取上一个时间点的值.点击屋.物联网案例



我有带钥匙(时间,设施(的桌子。对于每个密钥,存储从传感器接收的(值(。例如:

┌─date───────┬facility┬value┬  
│ 2017-09-09 │ 10002  │ 10  │  
│ 2017-09-10 │ 10001  │ 12  │  
│ 2017-09-12 │ 10002  │ 15  │  
│ 2017-09-15 │ 10001  │ 17  │  
└────────────┴────────┘─────┘

我想计算当前值与前一个值相比的增加。像这样:

┌─date───────┬facility┬─value─┬  
│ 2017-09-12 │ 10002  │ 15/10 │ // 15 current, 10 previous for facility 10002  
│ 2017-09-15 │ 10001  │ 17/12 │ // 17 current, 12 previous for facility 10002  
└────────────┴────────┘───────┘

为了得到这个结果,我需要使用 JOIN,但正如我从文档中了解到的那样,JOIN 操作中的比较条件仅用于相等,不能使用不等式。

当然,我可以做一个笛卡尔乘积,然后在 WHERE 部分进行日期比较(获取所有小于当前日期的日期(,然后从过滤后的日期中选择最大日期。但这非常耗时和内存!

请帮我创建一个最佳查询,因为这种情况是 IOT 的典型情况。

///问题扩展////////////////////////////////////////////

我还想计算与上一时间每行的值相比的增量值(例如,1000 行中的每一行(。如果我想跟踪传感器跳跃,这种情况是典型的。例:

┌─date───────┬facility┬value┬  
│ 2017-09-01 │ 10002  │ 3   │  
│ 2017-09-02 │ 10001  │ 4   │  
│ 2017-09-04 │ 10002  │ 1   │  
│ 2017-09-05 │ 10001  │ 2   │  
│ 2017-09-09 │ 10002  │ 10  │  
│ 2017-09-10 │ 10001  │ 12  │  
│ 2017-09-12 │ 10002  │ 15  │  
│ 2017-09-15 │ 10001  │ 17  │  
└────────────┴────────┘─────┘
┌─date───────┬facility┬─value─┬  
│ 2017-09-04 │ 10002  │ 1/3   │ // 1 current,  3 previous for facility 10002  
│ 2017-09-05 │ 10001  │ 2/4   │ // 2 current,  4 previous for facility 10001  
│ 2017-09-09 │ 10002  │ 10/1  │ // 10 current, 1 previous for facility 10002  
│ 2017-09-10 │ 10001  │ 12/2  │ // 12 current, 2 previous for facility 10001  
│ 2017-09-12 │ 10002  │ 15/10 │ // 15 current, 10 previous for facility 10002  
│ 2017-09-15 │ 10001  │ 17/12 │ // 17 current, 12 previous for facility 10001  
└────────────┴────────┘───────┘

我的表具有以下结构:

CREATE TABLE SensorsLogs(
facility_id UInt64,
parameter_id UInt64,
timeSec DateTime('Asia/Yekaterinburg'),
value Float32 ) 
ENGINE = MergeTree()
PARTITION BY toYYYYMM(timeSec)
ORDER BY (facility_id, parameter_id, timeSec);
select groupArray(1)(date)[1], facility, groupArray(2)(value) from 
(select date,  facility, value from 
(select '2017-09-09' date, 10002 facility, 10 value union all
select '2017-09-10' , 10001, 12 union all
select '2017-09-12' , 10002, 15 union all
select '2017-09-15' , 10001, 17 ) 
order by facility, date desc)
group by facility
┌─arrayElement(groupArray(1)(date), 1)─┬─facility─┬─groupArray(2)(value)─┐
│ 2017-09-15                           │    10001 │ [17,12]              │
│ 2017-09-12                           │    10002 │ [15,10]              │
└──────────────────────────────────────┴──────────┴──────────────────────┘
SELECT 
toDate(max(time)) AS date, 
facility, 
argMax(value, time) AS current_value, 
argMin(value, time) AS previous_value,
toString(current_value) || '/' || toString(previous_value) AS values
FROM (
SELECT *
FROM (
/* emulate the original table */
SELECT toDateTime(test_set.1) as time, test_set.2 facility, test_set.3 value
FROM ( 
SELECT arrayJoin([
('2017-09-09 10:11:12', 10002, 10),
('2017-09-10 11:11:12', 10001, 12),
('2017-09-12 12:11:12', 10002, 15),
('2017-09-15 13:11:12', 10001, 17),
('2017-09-02 14:11:12', 10002,  5),
('2017-09-02 15:11:12', 10001,  7),
('2017-09-01 16:11:12', 10002,  7),
('2017-09-01 17:11:12', 10001,  9)]) test_set))
/*WHERE time BETWEEN .. AND ..*/
ORDER BY facility, time DESC
LIMIT 2 BY facility)
GROUP BY facility
ORDER BY date;
/* result
┌───────date─┬─facility─┬─current_value─┬─previous_value─┬─values─┐
│ 2017-09-12 │    10002 │            15 │             10 │ 15/10  │
│ 2017-09-15 │    10001 │            17 │             12 │ 17/12  │
└────────────┴──────────┴───────────────┴────────────────┴────────┘
*/

查询以查找每行的上一个值(将使用相邻窗口函数(:

SELECT time, facility, value, 
neighbor(facility, -1, -1) previous_facility,    
(facility != previous_facility ? -1 : neighbor(value, -1, -1)) previous_value, 
toString(value) || '/' || (previous_value = -1 ? 'none' : toString(previous_value)) AS values
FROM (
SELECT *
FROM (
/* emulate the original table */
SELECT toDateTime(test_set.1) as time, test_set.2 facility, test_set.3 value
FROM ( 
SELECT arrayJoin([
('2017-09-01 01:02:03', 10002, 3 ),  
('2017-09-02 01:02:03', 10001, 4 ),  
('2017-09-04 01:02:03', 10002, 1 ),  
('2017-09-05 01:02:03', 10001, 2 ),  
('2017-09-09 01:02:03', 10002, 10),  
('2017-09-10 01:02:03', 10001, 12),  
('2017-09-12 01:02:03', 10002, 15),  
('2017-09-15 01:02:03', 10001, 17)]) test_set))
/*WHERE time BETWEEN .. AND ..*/
ORDER BY facility, time)
ORDER BY time;
/* result:
┌────────────────time─┬─facility─┬─value─┬─previous_facility─┬─previous_value─┬─values─┐
│ 2017-09-01 01:02:03 │    10002 │     3 │             10001 │             -1 │ 3/none │
│ 2017-09-02 01:02:03 │    10001 │     4 │                -1 │             -1 │ 4/none │
│ 2017-09-04 01:02:03 │    10002 │     1 │             10002 │              3 │ 1/3    │
│ 2017-09-05 01:02:03 │    10001 │     2 │             10001 │              4 │ 2/4    │
│ 2017-09-09 01:02:03 │    10002 │    10 │             10002 │              1 │ 10/1   │
│ 2017-09-10 01:02:03 │    10001 │    12 │             10001 │              2 │ 12/2   │
│ 2017-09-12 01:02:03 │    10002 │    15 │             10002 │             10 │ 15/10  │
│ 2017-09-15 01:02:03 │    10001 │    17 │             10001 │             12 │ 17/12  │
└─────────────────────┴──────────┴───────┴───────────────────┴────────────────┴────────┘
*/

类似的查询,但使用运行差异:

SELECT time, facility, value, 
neighbor(facility, -1, -1) previous_facility,    
runningDifference(value) delta,
(facility != previous_facility ? -1 : value - delta) AS previous_value,
toString(value) || '/' || (previous_value = -1 ? 'none' : toString(previous_value)) AS values
FROM (
SELECT *
FROM (
/* emulate the original table */
SELECT toDateTime(test_set.1) as time, test_set.2 facility, test_set.3 value
FROM ( 
SELECT arrayJoin([
('2017-09-01 01:02:03', 10002, 3 ),  
('2017-09-02 01:02:03', 10001, 4 ),  
('2017-09-04 01:02:03', 10002, 1 ),  
('2017-09-05 01:02:03', 10001, 2 ),  
('2017-09-09 01:02:03', 10002, 10),  
('2017-09-10 01:02:03', 10001, 12),  
('2017-09-12 01:02:03', 10002, 15),  
('2017-09-15 01:02:03', 10001, 17)]) test_set))
/*WHERE time BETWEEN .. AND ..*/
ORDER BY facility, time)
ORDER BY time;
/*
┌────────────────time─┬─facility─┬─value─┬─previous_facility─┬─delta─┬─previous_value─┬─values─┐
│ 2017-09-01 01:02:03 │    10002 │     3 │             10001 │   -14 │             -1 │ 3/none │
│ 2017-09-02 01:02:03 │    10001 │     4 │                -1 │     0 │             -1 │ 4/none │
│ 2017-09-04 01:02:03 │    10002 │     1 │             10002 │    -2 │              3 │ 1/3    │
│ 2017-09-05 01:02:03 │    10001 │     2 │             10001 │    -2 │              4 │ 2/4    │
│ 2017-09-09 01:02:03 │    10002 │    10 │             10002 │     9 │              1 │ 10/1   │
│ 2017-09-10 01:02:03 │    10001 │    12 │             10001 │    10 │              2 │ 12/2   │
│ 2017-09-12 01:02:03 │    10002 │    15 │             10002 │     5 │             10 │ 15/10  │
│ 2017-09-15 01:02:03 │    10001 │    17 │             10001 │     5 │             12 │ 17/12  │
└─────────────────────┴──────────┴───────┴───────────────────┴───────┴────────────────┴────────┘
*/

(时间、设施(不太适合如上所述的计算。

如果它是主要用例,并且在所需的日期范围内运行得很慢,请考虑将密钥更改为(设施、时间(或使用数据跳过索引。

最新更新