我在SQL Server中有一个时间表,其中包含公共交通工具的[SERV_ID]
(服务ID),[STATION]
(车站),[ARR]
(到达时间),[DEP]
(出发时间)。每个服务都可以每天出现[SERV_DAY]
。
目标是汇总服务日、服务行、第一站、最后一站和相应的时间戳。 --> 每天每个服务一行。
对于[SERV_ID]
N170,这将是:
SERV_DAY SERV_ID FIRST_STATION MIN_DEP LAST_STATION MAX_ARR
2019-08-14 00:00:00 N170 Downtown 2019-08-14 06:06:00 CentralStation 2019-08-14 07:11:00
我试图通过([SERV_DAY], [SERV_ID])
然后为每个分区获取MAX([ARR])
和MIN([DEP])
来做到这一点。这工作了很长时间,但现在我想为每个最小值和最大值获取相应的站。
SELECT
[SERV_DAY],[SERV_ID],
MAX([ARR]) OVER(PARTITION BY [SERV_DAY],[SERV_ID]) AS MAX_ARR,
MIN([DEP]) OVER(PARTITION BY [SERV_DAY],[SERV_ID]) AS MIN_DEP
FROM #demo
稍后我需要在最后一个站点添加延迟,这在数据集的扩展版本中可用为[ARR_EFFECTIVE]
和[DEP_EFFECTIVE]
。希望一旦我知道如何总结上述每日线条,我就能添加这些属性。
这个话题很接近,但我不知道如何适应"差距和岛屿问题" Min() 和 Max() 基于 sql 服务器中的分区
我已经在dbfiddle中设置了一个演示数据集 https://dbfiddle.uk/?rdbms=sqlserver_2016&fiddle=52e53d43a49ddb8f67454e576bfa7d74
谁能帮我完成查询?
SELECT
[SERV_DAY]
,[SERV_ID],
FIRST_VALUE(STATION) over (Partition by [SERV_DAY],[SERV_ID] Order by ARR DESC) Station1
, FIRST_VALUE(STATION) over (Partition by [SERV_DAY],[SERV_ID] Order by DEP ASC) Station2
FROM #demo
如果您有大量数据,我想我会使用临时表而不是 CTE,但这里有一个关于它应该如何工作的快速想法:
WITH CTE AS
(
SELECT *
, ROW_NUMBER() OVER(PARTITION BY [SERV_DAY],[SERV_ID] ORDER BY ARR ) RN
, ROW_NUMBER() OVER(PARTITION BY [SERV_DAY],[SERV_ID] ORDER BY DEP ) RN2
from #demo
)
SELECT t1.[SERV_DAY],t1.[SERV_ID],t1.[STATION] FIRST_STATION, t1.[DEP] MIN_DEP, t2.STATION LAST_STATION
FROM CTE t1
INNER JOIN CTE t2 on t1.SERV_DAY = t2.SERV_DAY and t1.SERV_ID = t2.SERV_ID and t2.RN2 = 1
WHERE t1.RN = 1
您可以通过两个步骤执行此操作: 首先添加一个按ARR
降序排序的row_number,并添加另一个按dep
排序的row_number。然后,您可以过滤row_number
= 1 的行,以选择其他列。 下面是如何检索max_arr和min_dep站的示例:
WITH T AS (
SELECT
[SERV_DAY], [SERV_ID],
MAX([ARR]) OVER(PARTITION BY [SERV_DAY],[SERV_ID]) AS MAX_ARR,
MIN([DEP]) OVER(PARTITION BY [SERV_DAY],[SERV_ID]) AS MIN_DEP,
ROW_NUMBER() OVER(PARTITION BY [SERV_DAY],[SERV_ID] ORDER BY [ARR] DESC) AS RN_ARR,
ROW_NUMBER() OVER(PARTITION BY [SERV_DAY],[SERV_ID] ORDER BY [DEP]) AS RN_DEP,
*
FROM #demo
)
SELECT MAX(CASE WHEN RN_ARR = 1 THEN [STATION] END) MAX_ARR_STATION,
MAX(CASE WHEN RN_DEP = 1 THEN [STATION] END) MIN_DEP_STATION,
*
FROM T
作为对@casenonsensitive的回复,它使用他的代码和一些修改就可以工作!
WITH T AS (
SELECT
[SERV_DAY], [SERV_ID], [STATION],
MAX([ARR]) OVER(PARTITION BY [SERV_DAY],[SERV_ID]) AS MAX_ARR,
MIN([DEP]) OVER(PARTITION BY [SERV_DAY],[SERV_ID]) AS MIN_DEP,
ROW_NUMBER() OVER(PARTITION BY [SERV_DAY],[SERV_ID] ORDER BY [ARR] ) AS RN_ARR,
ROW_NUMBER() OVER(PARTITION BY [SERV_DAY],[SERV_ID] ORDER BY [DEP] ) AS RN_DEP
FROM #demo
)
SELECT MAX(CASE WHEN RN_ARR = 1 THEN [STATION] END) MIN_DEP_STATION,
MAX(CASE WHEN RN_DEP = 1 THEN [STATION] END) MAX_ARR_STATION, [SERV_DAY], [SERV_ID], MAX_ARR, MIN_DEP from T
group by [SERV_DAY], [SERV_ID], MIN_DEP, MAX_ARR