SQL: joining two sets of sensor data on timestamp



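I have two tables that capture sensor data at different times, and I need help writing a SQL query that joins the sensor data by time.

sensor1 data (for simplicity I recorded a value every 5 minutes, but the interval is not necessarily regular):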

sensor1_time      |  sensor1_value
09:55am           |  A
10:00am           |  B
10:05am           |  C
10:10am           |  D
10:15am           |  E
10:20am           |  F

sensor2 data:

sensor2_time      |  sensor2_value
09:50am           |  X
10:05am           |  Y
10:19am           |  Z

I want to join the sensor1 and sensor2 tables, and the output should look like this:

sensor1_time      |  sensor2_time   |  sensor1_value  |  sensor2_value
09:55am           |  09:50am        |  A              |  X
10:00am           |  09:50am        |  B              |  X
10:05am           |  10:05am        |  C              |  Y
10:10am           |  10:05am        |  D              |  Y
10:15am           |  10:05am        |  E              |  Y
10:20am           |  10:19am        |  F              |  Z

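You can JOIN the two tables and use a NOT EXISTS condition to pick the relevant record in sensor2, that is, the latest sensor2 reading taken at or before each sensor1 reading: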

SELECT
    s1.sensor1_time,
    s2.sensor2_time,
    s1.sensor1_value,
    s2.sensor2_value
FROM sensor1 s1
INNER JOIN sensor2 s2
    -- candidate sensor2 rows: taken at or before the sensor1 reading
    ON  s1.sensor1_time >= s2.sensor2_time
    -- keep only the latest candidate: no other candidate may be later
    AND NOT EXISTS (
        SELECT 1
        FROM sensor2 s20
        WHERE
            s1.sensor1_time >= s20.sensor2_time
            AND s20.sensor2_time > s2.sensor2_time
    )

This demo on DB Fiddle, run against your sample data, returns:

sensor1_time | sensor2_time | sensor1_value | sensor2_value
:----------- | :----------- | :------------ | :------------
09:55        | 09:50        | A             | X            
10:00        | 09:50        | B             | X            
10:05        | 10:05        | C             | Y            
10:10        | 10:05        | D             | Y            
10:15        | 10:05        | E             | Y            
10:20        | 10:19        | F             | Z     
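
For readers without access to the fiddle, the NOT EXISTS query can be reproduced locally. The following is only a minimal sketch using Python's built-in sqlite3 module; it assumes the times are stored as zero-padded 24-hour 'HH:MM' text (so that text order matches time order), and the ORDER BY is added here just to make the output order deterministic.

```python
import sqlite3

# In-memory SQLite database. Assumption of this sketch: times are stored as
# zero-padded 24-hour 'HH:MM' text, so comparing the text compares the times.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sensor1 (sensor1_time TEXT, sensor1_value TEXT);
    CREATE TABLE sensor2 (sensor2_time TEXT, sensor2_value TEXT);
    INSERT INTO sensor1 VALUES
        ('09:55', 'A'), ('10:00', 'B'), ('10:05', 'C'),
        ('10:10', 'D'), ('10:15', 'E'), ('10:20', 'F');
    INSERT INTO sensor2 VALUES
        ('09:50', 'X'), ('10:05', 'Y'), ('10:19', 'Z');
""")

# Same join as above; ORDER BY is added only for a stable row order.
query = """
    SELECT s1.sensor1_time, s2.sensor2_time, s1.sensor1_value, s2.sensor2_value
    FROM sensor1 s1
    INNER JOIN sensor2 s2
        ON  s1.sensor1_time >= s2.sensor2_time
        AND NOT EXISTS (
            SELECT 1
            FROM sensor2 s20
            WHERE s1.sensor1_time >= s20.sensor2_time
              AND s20.sensor2_time > s2.sensor2_time
        )
    ORDER BY s1.sensor1_time
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Each sensor1 row is paired with exactly one sensor2 row as long as sensor2 times are unique; a sensor1 reading earlier than every sensor2 reading would be dropped by the INNER JOIN.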

Below is a version for BigQuery Standard SQL:

#standardSQL
SELECT
  s1.sensor1_time,
  s2.sensor2_time,
  s1.sensor1_value,
  s2.sensor2_value
FROM sensor1 s1
JOIN (
  SELECT
    sensor2_time,
    -- time of the next sensor2 reading; NULL for the last reading
    LEAD(sensor2_time) OVER (ORDER BY sensor2_time) AS sensor2_time2,
    sensor2_value
  FROM sensor2
) s2
ON s1.sensor1_time >= s2.sensor2_time
  AND IFNULL(s1.sensor1_time < s2.sensor2_time2, TRUE)

The above assumes that your sensor1_time and sensor2_time fields are of the TIMESTAMP data type, as stated in the question title.

Note: if the values are compared as strings, 09:55am is considered greater than 09:50pm, which means you need to cast/parse them to TIME.
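
The pitfall in the note can be checked outside the database. This is a small illustrative sketch in Python rather than BigQuery; the helper name parse_clock is hypothetical, and the '%I:%M%p' format string is the same pattern used with PARSE_TIME in this answer.

```python
from datetime import datetime

def parse_clock(text):
    """Parse a 12-hour clock string such as '09:55am' into a datetime.time."""
    return datetime.strptime(text, "%I:%M%p").time()

morning, evening = "09:55am", "09:50pm"

# Plain string comparison works character by character: the strings first
# differ at the fifth character ('5' vs '0'), so the morning reading
# wrongly compares as greater than the evening one.
string_says_later = morning > evening

# After parsing, 09:55 is compared against 21:50 and is correctly earlier.
time_says_later = parse_clock(morning) > parse_clock(evening)

print(string_says_later, time_says_later)
```

The am/pm suffix is only reached after the digits have already decided the string comparison, which is why parsing to a real time type is needed.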

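So if these fields are in fact strings formatted as in the sample you provided, you first need to cast/parse them to the proper data type, as in the simplified example below: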

#standardSQL
SELECT
  s1.sensor1_time,
  s2.sensor2_time,
  s1.sensor1_value,
  s2.sensor2_value
FROM sensor1 s1
JOIN (
  SELECT
    sensor2_time,
    LEAD(sensor2_time) OVER (ORDER BY PARSE_TIME('%I:%M%p', sensor2_time)) AS sensor2_time2,
    sensor2_value
  FROM sensor2
) s2
ON PARSE_TIME('%I:%M%p', s1.sensor1_time) >= PARSE_TIME('%I:%M%p', s2.sensor2_time)
  AND IFNULL(PARSE_TIME('%I:%M%p', s1.sensor1_time) < PARSE_TIME('%I:%M%p', s2.sensor2_time2), TRUE)

If the latter query is applied to the sample data from your question:

WITH `sensor1` AS (
  SELECT '09:55am' sensor1_time, 'A' sensor1_value UNION ALL
  SELECT '10:00am', 'B' UNION ALL
  SELECT '10:05am', 'C' UNION ALL
  SELECT '10:10am', 'D' UNION ALL
  SELECT '10:15am', 'E' UNION ALL
  SELECT '10:20am', 'F'
), `sensor2` AS (
  SELECT '09:50am' sensor2_time, 'X' sensor2_value UNION ALL
  SELECT '10:05am', 'Y' UNION ALL
  SELECT '10:19am', 'Z'
)

the result is:

Row sensor1_time    sensor2_time    sensor1_value   sensor2_value    
1   09:55am         09:50am         A               X    
2   10:00am         09:50am         B               X    
3   10:05am         10:05am         C               Y    
4   10:10am         10:05am         D               Y    
5   10:15am         10:05am         E               Y    
6   10:20am         10:19am         F               Z      
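
The interval logic behind the LEAD-based queries can also be traced by hand. The sketch below is a plain-Python illustration, not BigQuery code: it pairs each sensor2 reading with the time of the next reading (as LEAD does) and applies the same "start <= t < next" predicate as the ON clause. The names parse_clock, intervals and joined are hypothetical and exist only for this example.

```python
from datetime import datetime

def parse_clock(text):
    """Parse a 12-hour clock string such as '10:05am' into a datetime.time."""
    return datetime.strptime(text, "%I:%M%p").time()

sensor1 = [("09:55am", "A"), ("10:00am", "B"), ("10:05am", "C"),
           ("10:10am", "D"), ("10:15am", "E"), ("10:20am", "F")]
sensor2 = [("09:50am", "X"), ("10:05am", "Y"), ("10:19am", "Z")]

# Sort sensor2 by parsed time, then pair each row with the time of the next
# row (None for the last row), mimicking LEAD(...) OVER (ORDER BY ...).
ordered = sorted(sensor2, key=lambda row: parse_clock(row[0]))
next_times = [row[0] for row in ordered[1:]] + [None]
intervals = [(start, nxt, value)
             for (start, value), nxt in zip(ordered, next_times)]

joined = []
for s1_time, s1_value in sensor1:
    t = parse_clock(s1_time)
    for start, nxt, s2_value in intervals:
        # Same predicate as the ON clause: start <= t < next, with an open
        # upper bound when there is no next reading (the IFNULL(..., TRUE)).
        if parse_clock(start) <= t and (nxt is None or t < parse_clock(nxt)):
            joined.append((s1_time, start, s1_value, s2_value))

for row in joined:
    print(row)
```

Because the intervals do not overlap, each sensor1 reading falls into at most one of them, which is what keeps the join from producing duplicate rows.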
