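I have a table with billions of records containing item information, as shown below. ID is an auto-incrementing bigint. Article, EANUPC, and MRP are the actual data. DataDate holds the date from which that MRP applies.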
ID        Article      EANUPC   MRP  DataDate
8546417   20171554001  1220636  599  20/11/2015
18589213  20171554001  1220636  599  15/12/2017
18655485  20171554001  1220636  390  26/12/2017
18784953  20171554001  1220636  390  11/1/2018
18833697  20171554001  1220636  290  16/1/2018
18954190  20171554001  1220636  290  24/1/2018
19060047  20171554001  1220636  190  30/1/2018
19116702  20171554001  1220636  90   6/2/2018
20107113  20171554001  1220636  90   13/6/2018
20143100  20171554001  1220636  599  13/6/2018
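I want to merge records so that, when the MRP stays the same across consecutive dates, a single row shows the valid-from and valid-to dates for that MRP.
I need the following output.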
Article      EANUPC   MRP  FromDate    ToDate
20171554001  1220636  599  20/11/2015  25/12/2017
20171554001  1220636  390  26/12/2017  15/01/2018
20171554001  1220636  290  16/1/2018   29/01/2018
20171554001  1220636  190  30/1/2018   05/02/2018
20171554001  1220636  90   6/2/2018    12/06/2018
20171554001  1220636  599  13/6/2018   24/08/2018
Please help me achieve this.
You can treat this as a gaps-and-islands problem. Use:
SELECT Article, EANUPC, MRP, DataDate,
       -- ID breaks ties when two rows share a DataDate (as on 2018-06-13)
       ROW_NUMBER() OVER (PARTITION BY Article ORDER BY DataDate, ID) -
       ROW_NUMBER() OVER (PARTITION BY Article, MRP ORDER BY DataDate, ID) AS grp
FROM mytable
ORDER BY DataDate, ID
This gives:
Article      EANUPC   MRP  DataDate    grp
------------------------------------------
20171554001  1220636  599  2015-11-20  0
20171554001  1220636  599  2017-12-15  0
20171554001  1220636  390  2017-12-26  2
20171554001  1220636  390  2018-01-11  2
20171554001  1220636  290  2018-01-16  4
20171554001  1220636  290  2018-01-24  4
20171554001  1220636  190  2018-01-30  6
20171554001  1220636  90   2018-02-06  7
20171554001  1220636  90   2018-06-13  7
20171554001  1220636  599  2018-06-13  7
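To see why the difference of the two row numbers identifies each run, here is a minimal Python sketch of the same arithmetic on the sample rows. It is only an illustration, not part of the SQL solution; the function name `island_keys` is hypothetical, and breaking DataDate ties by ID is an assumption.

```python
from collections import defaultdict
from datetime import date

# (ID, MRP, DataDate) for the single article in the question's sample data.
rows = [
    (8546417, 599, date(2015, 11, 20)),
    (18589213, 599, date(2017, 12, 15)),
    (18655485, 390, date(2017, 12, 26)),
    (18784953, 390, date(2018, 1, 11)),
    (18833697, 290, date(2018, 1, 16)),
    (18954190, 290, date(2018, 1, 24)),
    (19060047, 190, date(2018, 1, 30)),
    (19116702, 90, date(2018, 2, 6)),
    (20107113, 90, date(2018, 6, 13)),
    (20143100, 599, date(2018, 6, 13)),
]

def island_keys(rows):
    """Return (MRP, grp) per row, where grp = overall row number - per-MRP row number."""
    # Order by DataDate, then ID as a tie-breaker (assumption, see lead-in).
    ordered = sorted(rows, key=lambda r: (r[2], r[0]))
    per_mrp = defaultdict(int)  # running row number within each MRP value
    keys = []
    for overall, (_id, mrp, _data_date) in enumerate(ordered, start=1):
        per_mrp[mrp] += 1
        keys.append((mrp, overall - per_mrp[mrp]))
    return keys

keys = island_keys(rows)
for key in keys:
    print(key)
```

Within a run of equal MRP values both counters advance together, so the difference stays constant; it changes only when rows with another MRP come in between. Note that grp alone is not unique (the last 90 rows and the final 599 row all get 7), which is why the grouping has to be on MRP and grp together.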
Now you can wrap the above query in a CTE and group by the grp field:
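;WITH CTE AS (
    SELECT Article, EANUPC, MRP, DataDate,
           -- ID breaks ties when two rows share a DataDate
           ROW_NUMBER() OVER (PARTITION BY Article ORDER BY DataDate, ID) -
           ROW_NUMBER() OVER (PARTITION BY Article, MRP ORDER BY DataDate, ID) AS grp
    FROM mytable
)
-- One row per island: the first date of each run of equal MRP values
SELECT Article, EANUPC, MRP, MIN(DataDate) AS FromDate
FROM CTE
GROUP BY Article, EANUPC, MRP, grp
ORDER BY Article, FromDate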
Output:
Article      EANUPC   MRP  FromDate
-------------------------------------
20171554001  1220636  599  2015-11-20
20171554001  1220636  390  2017-12-26
20171554001  1220636  290  2018-01-16
20171554001  1220636  190  2018-01-30
20171554001  1220636  90   2018-02-06
20171554001  1220636  599  2018-06-13
You can use LEAD to get the final result:
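;WITH CTE AS (
    SELECT Article, EANUPC, MRP, DataDate,
           -- ID breaks ties when two rows share a DataDate
           ROW_NUMBER() OVER (PARTITION BY Article ORDER BY DataDate, ID) -
           ROW_NUMBER() OVER (PARTITION BY Article, MRP ORDER BY DataDate, ID) AS grp
    FROM mytable
), CTE2 AS (
    SELECT Article, EANUPC, MRP, MIN(DataDate) AS FromDate
    FROM CTE
    GROUP BY Article, EANUPC, MRP, grp
)
SELECT Article, EANUPC, MRP, FromDate,
       -- ToDate is the day before the next period starts. PARTITION BY keeps
       -- LEAD from reading another article's rows. The last period has no
       -- successor, so it falls back to its own FromDate.
       COALESCE(DATEADD(day, -1, LEAD(FromDate) OVER (PARTITION BY Article ORDER BY FromDate)), FromDate) AS ToDate
FROM CTE2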
Output:
Article      EANUPC   MRP  FromDate    ToDate
-------------------------------------------------
20171554001  1220636  599  2015-11-20  2017-12-25
20171554001  1220636  390  2017-12-26  2018-01-15
20171554001  1220636  290  2018-01-16  2018-01-29
20171554001  1220636  190  2018-01-30  2018-02-05
20171554001  1220636  90   2018-02-06  2018-06-12
20171554001  1220636  599  2018-06-13  2018-06-13
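For readers who want to check the result outside the database, the whole pipeline can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the SQL method itself: it finds the runs with `itertools.groupby` rather than the ROW_NUMBER difference, it handles a single article, it expects rows already sorted by DataDate, and the name `price_periods` is hypothetical.

```python
from datetime import date, timedelta
from itertools import groupby

# (MRP, DataDate) for one article, already sorted by DataDate (ties by ID).
rows = [
    (599, date(2015, 11, 20)),
    (599, date(2017, 12, 15)),
    (390, date(2017, 12, 26)),
    (390, date(2018, 1, 11)),
    (290, date(2018, 1, 16)),
    (290, date(2018, 1, 24)),
    (190, date(2018, 1, 30)),
    (90, date(2018, 2, 6)),
    (90, date(2018, 6, 13)),
    (599, date(2018, 6, 13)),
]

def price_periods(rows):
    """Collapse consecutive rows with the same MRP into (MRP, FromDate, ToDate)."""
    # Each run of equal MRP values is one island; its first date is FromDate.
    starts = [(mrp, next(run)[1]) for mrp, run in groupby(rows, key=lambda r: r[0])]
    periods = []
    for i, (mrp, from_date) in enumerate(starts):
        if i + 1 < len(starts):
            # Same idea as DATEADD(day, -1, LEAD(FromDate)).
            to_date = starts[i + 1][1] - timedelta(days=1)
        else:
            # No later period: fall back to FromDate, like the COALESCE.
            to_date = from_date
        periods.append((mrp, from_date, to_date))
    return periods

for mrp, from_date, to_date in price_periods(rows):
    print(mrp, from_date, to_date)
```

On the sample data this yields six periods, with the second 599 period kept separate from the first. The last period ends on its own FromDate; if an open-ended or current-date ToDate is wanted instead, the fallback branch is the place to change.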
You are looking for GROUP BY:
SELECT
    Article
    , EANUPC
    , MRP
    , MIN(DataDate) AS FromDate
    , MAX(DataDate) AS ToDate
FROM
    [Your_Table]
GROUP BY
    Article, EANUPC, MRP
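You will need to create a new table and insert these records into it, but it should work. Also, with billions of records, you may want to process the data in chunks so that it does not bog down your server.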
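One thing worth checking before relying on a plain GROUP BY: when a price comes back after a gap, as 599 does in the sample data, all rows with that MRP fall into one group. The short Python sketch below mimics MIN/MAX per MRP for a single article to show the effect; the name `group_by_mrp` is hypothetical and the code is only an illustration.

```python
from datetime import date

# (MRP, DataDate) for the single article in the question's sample data.
rows = [
    (599, date(2015, 11, 20)),
    (599, date(2017, 12, 15)),
    (390, date(2017, 12, 26)),
    (390, date(2018, 1, 11)),
    (290, date(2018, 1, 16)),
    (290, date(2018, 1, 24)),
    (190, date(2018, 1, 30)),
    (90, date(2018, 2, 6)),
    (90, date(2018, 6, 13)),
    (599, date(2018, 6, 13)),
]

def group_by_mrp(rows):
    """Mimic GROUP BY MRP with MIN(DataDate) and MAX(DataDate)."""
    ranges = {}
    for mrp, data_date in rows:
        low, high = ranges.get(mrp, (data_date, data_date))
        ranges[mrp] = (min(low, data_date), max(high, data_date))
    return ranges

for mrp, (from_date, to_date) in group_by_mrp(rows).items():
    print(mrp, from_date, to_date)
```

Here 599 comes out as one range from 2015-11-20 to 2018-06-13, which spans every other price period, and there are five rows rather than the six in the requested output. If returning prices can occur in the data, the grouping needs a run identifier in addition to MRP.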