我试图识别连续日期(两个连续记录之间只有1天的差距)由客户在同一家公司作为一个组,并计算持续时间和总支出。
下面是我可以用SQL查询生成的原始数据输出:
create table tbl
(
Company char,
Line char(2),
Customer varchar(5),
StartDate date,
Spending decimal(10,2)
);
insert into tbl values
('A', 's1', 'Tom', '20210202', 10.00),
('A', 's1', 'Tom', '20210201', 10.00),
('A', 's1', 'Tom', '20210203', 10.00),
('A', 's2', 'Tom', '20210204', 10.00),
('A', 's2', 'Tom', '20210206', 10.00),
('B', 's1', 'Tom', '20210201', 15.00),
('A', 's3', 'Tom', '20210207', 10.00),
('A', 's3', 'Ken', '20210207', 10.00),
('C', 's1', 'Tom', '20210201', 20.00);
我使用的SQL代码:
with
cte as
(
select *,
g = case when Line = lag(Line) over (partition by Company, Customer
order by StartDate)
then 0
else 1
end
from tbl
),
cte2 as
(
select *, grp = sum(g) over (partition by Company, Customer
order by StartDate)
from cte
)
select Company, Line, Customer,
StartDate = min(StartDate),
EndDate = max(StartDate),
Duration = datediff(day, min(StartDate), max(StartDate)) + 1,
TotalSpending = sum(spending)
from cte2
group by Company, Customer, Line, grp
order by Company, Customer, grp
下面是我期望的输出:
Company Line Customer StartDate EndDate Duration TotalSpending
A s1 Tom 2021-02-01 2021-02-03 3 30.00
A s2 Tom 2021-02-04 2021-02-04 1 10.00
A s2 Tom 2021-02-06 2021-02-06 1 10.00
A s3 Tom 2021-02-07 2021-02-07 1 10.00
B s1 Tom 2021-02-01 2021-02-01 1 15.00
C s1 Tom 2021-02-01 2021-02-01 1 20.00
A s3 Ken 2021-02-07 2021-02-07 1 10.00
标准是:
"EndDate"是最后连续startDate可以(1天差距/间隔两个连续记录)在同一行,同一家公司给客户。
时间= (Enddate - startdate可以)+ 1
总支出是具有相同行和相同公司的组的总和,给定客户和连续的开始日期。
如果同一客户在同一日期和同一公司有多个条目,这是由于重复应该被删除。
提前感谢!
基本上,您想要按客户进行分组,并使用同一行作为分组,然后计算持续时间和总支出。
第一个cte使用铅()找到Line
当有变化。
第二个cte计算累积的sum()
形成group
最后的查询只是做一个GROUP BY
和计算所需的Duration
&Total Spending
with
cte as
(
select *,
g = case when Line = lead(Line) over (partition by Company, Customer
order by StartDate)
then 1
else 0
end
from tbl
),
cte2 as
(
select *, grp = sum(g) over (partition by Company, Customer
order by StartDate)
from cte
)
select Company, Line, Customer,
StartDate = min(StartDate),
EndDate = max(StartDate),
Duration = datediff(day, min(StartDate), max(StartDate)) + 1,
TotalSpending = sum(Spending)
from cte2
group by Company, Line, Customer, grp
order by Company, Line, Customer
,db<的在小提琴演示