如何按值分组以识别最小和最大日期,但按特定顺序?



我试图在这里找到一些东西,但我没有找到我的用例。我希望你能帮助我。首先我的表是可用的:

tbody> <<tr>
STATION_NUMBER PART_NO BOOK_DATE
111112021-08-01 6:00:00
111112021-08-01 6:05:00
111112021-08-01 6:07:00
111112021-08-01 6:08:00
11111B2021-08-01 7:10:00
11111B2021-08-01 7:13:00
11111B2021-08-01 7:15:00
11111B2021-08-01 7:25:00
111112021-08-01 8:10:00
111112021-08-01 8:12:00
111112021-08-01 8:16:00
111112021-08-01 8:19:00
222222021-08-01 6:00:00
222222021-08-01 6:05:00
222222021-08-01 6:07:00
222222021-08-01 6:08:00
22222B2021-08-01 7:10:00
22222B2021-08-01 7:13:00
22222B2021-08-01 7:15:00
22222B2021-08-01 7:25:00
222222021-08-01 8:10:00
222222021-08-01 8:12:00
222222021-08-01 8:16:00
222222021-08-01 8:19:00

你的问题属于"缺口和孤岛"问题。(如果你想进一步研究,谷歌搜索那个短语)。

在Oracle 11和更早的版本中,您可以使用解析函数来获得所需的结果。这种方法被称为"tabibitosan法"。或"固定差异法">

首先是关键步骤(在下面的with子句中的子查询中):仅按工位号分组(划分)计算序列号,并分别按工位和零件号划分。在零件号相同的连续行序列中,差值是恒定的,然后当新的这样的序列开始时,差值跳转到不同的值。然后在外部查询中使用它进行分组。

with 
prep as (
select pd.*,
row_number() over (partition by station_number order by book_date)
- row_number() over (partition by station_number, part_no
order by book_date) as grp
from   prod_date pd
)
select station_number, part_no, min(book_date) as start_book_date,
max(book_date) as end_book_date
from   prep
group  by station_number, part_no, grp
order  by station_number, start_book_date
;
STATION_NUMBER PART_NO START_BOOK_DATE     END_BOOK_DATE      
-------------- ------- ------------------- -------------------
11111 A       2021-08-01 06:00:00 2021-08-01 06:08:00
11111 B       2021-08-01 07:10:00 2021-08-01 07:25:00
11111 A       2021-08-01 08:10:00 2021-08-01 08:19:00
22222 A       2021-08-01 06:00:00 2021-08-01 06:08:00
22222 B       2021-08-01 07:10:00 2021-08-01 07:25:00
22222 A       2021-08-01 08:10:00 2021-08-01 08:19:00

从Oracle 12,这是MATCH_RECOGNIZE的作用:

SELECT *
FROM   prod_date
MATCH_RECOGNIZE(
PARTITION BY station_number
ORDER     BY book_date
MEASURES
FIRST(part_no) AS part_no,
FIRST(book_date) AS start_book_date,
LAST(book_date) AS end_book_date
ONE ROW PER MATCH
PATTERN (same_part+)
DEFINE
same_part AS FIRST(part_no) = part_no
)

对于样本数据:

CREATE TABLE prod_date (STATION_NUMBER, PART_NO, BOOK_DATE) AS
SELECT 11111,   'A',    DATE '2021-08-01' + INTERVAL '6:00:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'A',    DATE '2021-08-01' + INTERVAL '6:05:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'A',    DATE '2021-08-01' + INTERVAL '6:07:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'A',    DATE '2021-08-01' + INTERVAL '6:08:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'B',    DATE '2021-08-01' + INTERVAL '7:10:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'B',    DATE '2021-08-01' + INTERVAL '7:13:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'B',    DATE '2021-08-01' + INTERVAL '7:15:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'B',    DATE '2021-08-01' + INTERVAL '7:25:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'A',    DATE '2021-08-01' + INTERVAL '8:10:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'A',    DATE '2021-08-01' + INTERVAL '8:12:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'A',    DATE '2021-08-01' + INTERVAL '8:16:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 11111,   'A',    DATE '2021-08-01' + INTERVAL '8:19:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'A',    DATE '2021-08-01' + INTERVAL '6:00:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'A',    DATE '2021-08-01' + INTERVAL '6:05:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'A',    DATE '2021-08-01' + INTERVAL '6:07:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'A',    DATE '2021-08-01' + INTERVAL '6:08:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'B',    DATE '2021-08-01' + INTERVAL '7:10:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'B',    DATE '2021-08-01' + INTERVAL '7:13:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'B',    DATE '2021-08-01' + INTERVAL '7:15:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'B',    DATE '2021-08-01' + INTERVAL '7:25:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'A',    DATE '2021-08-01' + INTERVAL '8:10:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'A',    DATE '2021-08-01' + INTERVAL '8:12:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'A',    DATE '2021-08-01' + INTERVAL '8:16:00' HOUR TO SECOND FROM DUAL UNION ALL
SELECT 22222,   'A',    DATE '2021-08-01' + INTERVAL '8:19:00' HOUR TO SECOND FROM DUAL;

输出:

<表类>STATION_NUMBERPART_NOSTART_BOOK_DATEEND_BOOK_DATEtbody><<tr>111112021-08-01 06:00:002021-08-01 06:08:0011111B2021-08-01 07:10:002021-08-01 07:25:00111112021-08-01 08:10:002021-08-01 08:19:00222222021-08-01 06:00:002021-08-01 06:08:0022222B2021-08-01 07:10:002021-08-01 07:25:00222222021-08-01 08:10:002021-08-01 08:19:00
  • 在内联视图t中,每个子组的结束行被标记在gaps列中。
  • 在内联视图tt中,使用first_value分析函数填充所有来自内联视图t的gap列中空值的行。
  • 最后,我将内联视图tt中的行按STATION_NUMBER, PART_NO, GRP列分组,然后使用min和max聚合函数来获得所需的输出。
select STATION_NUMBER, PART_NO, min(BOOK_DATE) START_BOOK_DATE, max(BOOK_DATE) END_BOOK_DATE
from (
select STATION_NUMBER, PART_NO, BOOK_DATE, GAPS
, FIRST_VALUE(GAPS ignore nulls) 
over( partition by STATION_NUMBER order by BOOK_DATE 
ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING ) grp
from (
select 
STATION_NUMBER, PART_NO, BOOK_DATE
, case 
when PART_NO != lead(PART_NO, 1, '-'||PART_NO)over(partition by STATION_NUMBER order by BOOK_DATE)
then row_number()over(partition by STATION_NUMBER order by BOOK_DATE)
else null
end gaps
from PROD_DATA
)t
)tt
group by STATION_NUMBER, PART_NO, GRP
order by STATION_NUMBER, GRP
;

演示

最新更新