在BigQuery中,我试图编写一个查询,计算每个ID的开始和结束日期之间的距离。我对这些日期范围之间的点不感兴趣。我只需要获取开始和结束日期的纬度/经度,并计算欧几里得距离。
表1如下所示:
ID startdate enddate
A 2016-9-16 2016-10-9
A 2017-3-18 2017-4-9
表2的坐标如下:
ID Date Latitude Longitude
A 2016-9-16 40.76 -109.33
A 2016-9-17 40.72 -109.33
A 2016-10-9 40.75 -109.33
A 2017-3-18 40.81 -109.33
A 2017-4-8 40.83 -109.33
A 2017-4-9 40.96 -109.32
我想要的结果看起来像:
ID t1.startdate t2.Latitude t2.Longitude t1.enddate t2.Latitude t2.Longitude distance
A 2017-3-18 40.81 -109.33 2017-4-9 40.96 -109.32 150
A 2016-9-16 40.76 -109.33 2016-10-9 40.75 -109.33 200
这是我得到的最接近的,但由于可能有多个ID,它与每个ID的正确开始和结束日期不匹配:
SELECT
t1,t2,
ST_DISTANCE(Point1, Point2 ) as distance
from (
SELECT
ID,
st_geogpoint(Longitude,Latitude) as Point1
from `t2` AS t2 INNER JOIN
`t1` AS t1 ON t1.ID = t2.ID
WHERE t1.ID = t2.ID AND
t2.Date = t1.startdate ) t1,
(
SELECT
ID,
st_geogpoint(Longitude, Latitude) as Point2
from `t2` AS t2 INNER JOIN
`t1` AS t1 ON t1.ID = t2.ID
WHERE t1.ID = t2.ID AND
t2.Date = t1.enddate
) t2
WHERE t1.ID = t2.ID
当前结果如下:
ID t1.startdate t2.Latitude t2.Longitude t1.enddate t2.Latitude t2.Longitude distance
A 2016-9-16 40.76 -109.33 2016-10-9 40.75 -109.33 150
A 2016-9-16 40.76 -109.33 2017-4-9 40.96 -109.32 250
A 2017-3-18 40.81 -109.33 2017-4-9 40.96 -109.32 200
A 2017-3-18 40.81 -109.33 2016-10-9 40.75 -109.33 250
因此,这可能会解决您的问题:
我使用t1作为源,在日期上使用JOIN
t2两次(开始和结束(,然后引用ST_DISTANCE
中的那些。也许有一种更快/更高效的方法来解决这个问题。
WITH t1 as (
SELECT 'A' AS ID, '2016-9-16'AS startdate, '2016-10-9' AS enddate
UNION ALL SELECT 'A','2017-3-18', '2017-4-9'
), t2 as (
SELECT 'A' AS ID,'2016-9-16' AS date, 40.76 AS Latitude, -109.33 AS Longitude
UNION ALL SELECT 'A','2016-9-17',40.72,-109.33
UNION ALL SELECT 'A','2016-10-9',40.75,-109.33
UNION ALL SELECT 'A','2017-3-18',40.81,-109.33
UNION ALL SELECT 'A','2017-4-8',40.83,-109.33
UNION ALL SELECT 'A','2017-4-9',40.96,-109.32
)
SELECT
t1.ID,
start_point.date AS start_date,
start_point.Longitude AS start_long,
start_point.Latitude AS start_lat,
end_point.date AS end_date,
end_point.Longitude AS end_long,
end_point.Latitude AS end_lat,
ST_DISTANCE(
ST_GEOGPOINT(start_point.Longitude,start_point.Latitude),
ST_GEOGPOINT(end_point.Longitude,end_point.Latitude)) AS distance
FROM t1
JOIN t2 AS start_point
ON t1.id = start_point.id AND t1.startdate = start_point.date
JOIN t2 AS end_point
ON t1.id = end_point.id AND t1.enddate = end_point.date
结果:
Row | ID | start_date | start_longend_date | end_longend_lat | 距离
---|---|---|---|---|
1 | A | 2016-9-16 | -109.3340.76 | 2016-10-9<109.33>>40.75 | 1111.9510117740244
2 | A | 2017-3-18<109.33>40.81 | 2017-4-9 | -109.32>40.9616700.437093959285