发现连续值的范围



使用带有时间戳的MySQL表

|Timestamps             |
|-----------------------|
|2021-08-01 14:00:00.000|
|2021-08-01 14:00:00.100|
|2021-08-01 14:00:00.200|
|2021-08-01 14:00:00.300|
|2021-08-01 14:00:00.600|
|2021-08-01 14:00:00.700|
|2021-08-01 14:00:00.800|
|2021-08-01 14:00:01.000|

我想得到连续数据的时间间隔。连续是由频率来定义的,在这个例子中,它可以是10Hz期望的结果将是

|Start                   | End                    |
|------------------------|------------------------|
|2021-08-01 14:00:00.000 | 2021-08-01 14:00:00.300|
|2021-08-01 14:00:00.600 | 2021-08-01 14:00:00.800|

我使用MySQL版本5.7.35不能使用WITH等功能。这可以快速完成吗?目前每个表大约有100000个元素。

使用说明:

CREATE TABLE test_tbl (
my_data timestamp(3)  
);
INSERT INTO test_tbl VALUES
('2021-08-01 14:00:00.000'),
('2021-08-01 14:00:00.100'),
('2021-08-01 14:00:00.200'),
('2021-08-01 14:00:00.300'),
('2021-08-01 14:00:00.600'),
('2021-08-01 14:00:00.700'),
('2021-08-01 14:00:00.800'),
('2021-08-01 14:00:01.000');

SELECT 
MIN(my_data) start,
MAX(my_data) end 
FROM 
( SELECT *
, CASE WHEN right(my_data,3) = @prev+100 THEN @i:=@i ELSE @i:=@i+100 END row_num
, @prev:=right(my_data,3) prev
FROM test_tbl
, (SELECT @prev:= null,@i:=0) vars 
ORDER BY row_num
) x 
GROUP  BY row_num;

演示:https://www.db-fiddle.com/f/7yUJcuMJPncBBnrExKbzYz/98

得到第一个答案后,我开始修改它,昨晚我得到了一个工作样例。

SELECT 
MIN(my_data) start,
MAX(my_data) end 
FROM 
( SELECT *
, CASE WHEN UNIX_TIMESTAMP(my_data)*1000 >= @prev+90 and 
UNIX_TIMESTAMP(my_data)*1000 <= @prev+110
THEN @i:=@i ELSE @i:=@i+100 END row_num
, @prev:=UNIX_TIMESTAMP(my_data)*1000 prev
FROM test_tbl
, (SELECT @prev:= null,@i:=0) vars 
ORDER BY row_num
) x 
GROUP  BY row_num;


我将标记第一个答案,因为它给了灵感。谢谢你的建议。

插入速度

经验法则——"如果你每秒写的次数少于100次,不要担心性能"。(你问的是10秒,对吗?)(有一些技巧可以超过100秒。)

<<p>Non-windowing解决方案/strong>(8.0或10.2版本)

—样本数据

CREATE TABLE gaps (
ts timestamp(3),
PRIMARY KEY(ts)    -- Some index on ts is important here
);
INSERT INTO gaps VALUES
('2021-08-01 14:00:00.000'),
('2021-08-01 14:00:00.100'),
('2021-08-01 14:00:00.200'),
('2021-08-01 14:00:00.300'),
('2021-08-01 14:00:00.600'),
('2021-08-01 14:00:00.700'),
('2021-08-01 14:00:00.800'),
('2021-08-01 14:00:01.000'),  -- By itself
('2021-08-01 14:00:01.300');

——添加一个序列号(通过AUTO_INCREMENT);注意:这取决于auto_increment_increment = 1

DROP TEMPORARY TABLE IF EXISTS gaps2;
CREATE TEMPORARY TABLE gaps2 (
id INT AUTO_INCREMENT,
PRIMARY KEY(id) )
SELECT  val
FROM (
( SELECT a.ts as val
FROM gaps AS a
LEFT JOIN gaps AS b  ON a.ts = b.ts + interval 0.1 second
WHERE b.ts IS NULL
)
UNION ALL
( SELECT  b.ts as val
FROM gaps AS b
LEFT JOIN gaps AS a  ON b.ts = a.ts - interval 0.1 second
WHERE a.ts IS NULL
)
ORDER BY val
) AS cd;

select * from gaps2;
+----+-------------------------+
| id | val                     |
+----+-------------------------+
|  1 | 2021-08-01 14:00:00.000 |
|  2 | 2021-08-01 14:00:00.300 |
|  3 | 2021-08-01 14:00:00.600 |
|  4 | 2021-08-01 14:00:00.800 |
|  5 | 2021-08-01 14:00:01.000 |
|  6 | 2021-08-01 14:00:01.000 |
|  7 | 2021-08-01 14:00:01.300 |
|  8 | 2021-08-01 14:00:01.300 |
+----+-------------------------+

——主:

SELECT  CEIL(id/2) AS pair,
MIN(val) AS 'run_start',
MAX(val) AS 'run_end'
FROM gaps2
GROUP BY pair
ORDER BY pair;
+------+-------------------------+-------------------------+
| pair | run_start               | run_end                 |
+------+-------------------------+-------------------------+
|    1 | 2021-08-01 14:00:00.000 | 2021-08-01 14:00:00.300 |
|    2 | 2021-08-01 14:00:00.600 | 2021-08-01 14:00:00.800 |
|    3 | 2021-08-01 14:00:01.000 | 2021-08-01 14:00:01.000 |
|    4 | 2021-08-01 14:00:01.300 | 2021-08-01 14:00:01.300 |
+------+-------------------------+-------------------------+

相关内容

  • 没有找到相关文章

最新更新