将同一公司的同一行客户确定为连续日期,并在SQL中计算持续时间和总支出



我试图识别连续日期(两个连续记录之间只有1天的差距)由客户在同一家公司作为一个组,并计算持续时间和总支出。

下面是我可以用SQL查询生成的原始数据输出:

create table tbl
(
Company   char,
Line      char(2),
Customer  varchar(5),
StartDate date,
Spending  decimal(10,2)
);
insert into tbl values 
('A',       's1',      'Tom',         '20210202',    10.00),
('A',       's1',      'Tom',         '20210201',    10.00),
('A',       's1',      'Tom',         '20210203',    10.00),
('A',       's2',      'Tom',         '20210204',    10.00),
('A',       's2',      'Tom',         '20210206',    10.00),
('B',       's1',      'Tom',         '20210201',    15.00),
('A',       's3',      'Tom',         '20210207',    10.00),
('A',       's3',      'Ken',         '20210207',    10.00),
('C',       's1',      'Tom',         '20210201',    20.00);
我使用的SQL代码:
with
cte as
(
select *, 
g = case when Line = lag(Line) over (partition by Company, Customer 
order by StartDate)
then 0
else 1
end
from   tbl
),
cte2 as
(
select *, grp = sum(g) over (partition by Company, Customer 
order by StartDate)
from   cte
)
select Company, Line, Customer,
StartDate = min(StartDate),
EndDate = max(StartDate),
Duration = datediff(day, min(StartDate), max(StartDate)) + 1,
TotalSpending = sum(spending)
from   cte2
group by Company, Customer, Line, grp
order by Company, Customer, grp

下面是我期望的输出:

Company Line    Customer    StartDate   EndDate    Duration TotalSpending
A       s1      Tom         2021-02-01  2021-02-03  3   30.00
A       s2      Tom         2021-02-04  2021-02-04  1   10.00
A       s2      Tom         2021-02-06  2021-02-06  1   10.00
A       s3      Tom         2021-02-07  2021-02-07  1   10.00
B       s1      Tom         2021-02-01  2021-02-01  1   15.00
C       s1      Tom         2021-02-01  2021-02-01  1   20.00
A       s3      Ken         2021-02-07  2021-02-07  1   10.00

标准是:

  1. "EndDate"是最后连续startDate可以(1天差距/间隔两个连续记录)在同一行,同一家公司给客户。

  2. 时间= (Enddate - startdate可以)+ 1

  3. 总支出是具有相同行和相同公司的组的总和,给定客户和连续的开始日期。

  4. 如果同一客户在同一日期和同一公司有多个条目,这是由于重复应该被删除。

提前感谢!

基本上,您想要按客户进行分组,并使用同一行作为分组,然后计算持续时间和总支出。

第一个cte使用铅()找到Line当有变化。

第二个cte计算累积的sum()形成group

最后的查询只是做一个GROUP BY和计算所需的Duration&Total Spending

with
cte as
(
select *, 
g = case when Line = lead(Line) over (partition by Company, Customer 
order by StartDate)
then 1
else 0
end
from   tbl
),
cte2 as
(
select *, grp = sum(g) over (partition by Company, Customer 
order by StartDate)
from   cte
)
select Company, Line, Customer,
StartDate = min(StartDate),
EndDate   = max(StartDate),
Duration  = datediff(day, min(StartDate), max(StartDate)) + 1,
TotalSpending = sum(Spending)
from   cte2
group by Company, Line, Customer, grp
order by Company, Line, Customer

,db&lt的在小提琴演示

最新更新