我过去有以下代码,以便对所有客户执行所有"-"操作:
with
T1 as
(
select
[Contract] = 'Contract1',
[Customer] = 'Customer4',
[Date] = '2017-01-01',
[Action] = '+'
union all
select
[Contract] = 'Contract1',
[Customer] = 'Customer6',
[Date] = '2017-01-02',
[Action] = '+'
union all
select
[Contract] = 'Contract1',
[Customer] = 'Customer4',
[Date] = '2017-01-03',
[Action] = '-'
union all
select
[Contract] = 'Contract1',
[Customer] = 'Customer4',
[Date] = '2017-01-04',
[Action] = '+'
union all
select
[Contract] = 'Contract1',
[Customer] = 'Customer4',
[Date] = '2017-01-05',
[Action] = '-'
union all
select
[Contract] = 'Contract1',
[Customer] = 'Customer6',
[Date] = '2017-01-06',
[Action] = '-'
union all
select
[Contract] = 'Contract1',
[Customer] = 'Customer8',
[Date] = '2017-01-07',
[Action] = '+'
union all
select
[Contract] = 'Contract1',
[Customer] = 'Customer8',
[Date] = '2017-01-08',
[Action] = '-'
union all
select
[Contract] = 'Contract1',
[Customer] = 'Customer4',
[Date] = '2017-01-09',
[Action] = '+'
)
select
[Customer],
[Date]
from T1
where [Action] = '-'
现在我需要在合同字段上做这件事。这意味着当最后一次操作为"-"时,我必须返回Contract和Date值,而不是在该日期之前执行"+"操作的所有客户。非常期望的输出应该是:
Date | Contract
------------ | ------
2017-01-06 | Contract1
2017-01-08 | Contract1
预期的算法应该如下所示:
[PlusDC] = count(distinct iif([Action] = '+',Customer,NULL)) over (partition by [Contract] order by [Date])
[MinusDC] = count(distinct iif([Action] = '-',Customer,NULL)) over (partition by [Contract] order by [Date])
但是:
- 无论如何都不起作用
- 即使它有效,即使[PlusDC]=[MinusDC],它也会返回值2017-01-09,这是不正确的
粗略地说,我必须针对所有客户检查以下代码:
[Action]='-'用于当前行。
每个客户的lag([Action],1)='-'(如果客户记录出现在当天晚些时候,则为Null)。
更新:为了让事情变得更加清楚,我对我的数据进行了一个以列为导向的视图:
-----------------------------------------------------------------------
| Date | Contract | Customer4 | Customer6 | Customer8 | All |
| ------------ | --------- | --------- | --------- | --------- | --- |
| 2017-01-01 | Contract1 | + | | | |
| 2017-01-02 | Contract1 | | + | | |
| 2017-01-03 | Contract1 | | | | |
| 2017-01-04 | Contract1 | - | | | | <-- Customer6 still has a '+'
| 2017-01-05 | Contract1 | + | | | |
| 2017-01-06 | Contract1 | - | | | | <-- Customer6 still has a '+'
| 2017-01-07 | Contract1 | | - | | - | <-- All customers has '-' or null as a last action
| 2017-01-08 | Contract1 | | | + | |
| 2017-01-09 | Contract1 | | | - | - | <-- All customers has '-' or null as a last action
-----------------------------------------------------------------------
All列表示所有客户(我需要的行)的实际状态。正如您可能注意到的,2017-01-04和2017-01-06在合同字段中不是真实的"-"。Contract1未关闭,它仍有一个Customer6处于打开状态。当每个合同有一定数量的客户时,这很容易。那么无数呢?
有什么实用的建议吗?
好的,我将像您一样首先填写一个表来解决这个问题。我要做的是在每个日期重复客户和合同的每一个组合。
我把这个CTE附加到你的示例代码中:
,
FullTable as
(
select
a.[Contract]
,a.[Customer]
,b.[Date]
,c.[Action]
,count(c.[Action]) over (partition by a.[Contract],a.[Customer] order by b.[Date]) c
from
(select distinct
[Contract],
[Customer]
from T1) a
inner join
(select distinct
[Contract],
[Date]
from T1) b
on a.[Contract]=b.[Contract]
left join t1 c
on c.[Contract]=a.[Contract] and a.[Customer]=c.[Customer] and b.[Date]=c.[Date]
)
现在Fulltable做两件事,它确保每个客户每天都有一行。如果在源数据中没有针对该客户的操作,则action为NULL。我要做的第二件事是使用窗口计数来计数以前的操作次数
count(c.[Action]) over (partition by a.[Contract],a.[Customer] order by b.[Date]) c
计数不计算NULL值,因此这实际上是对数据进行分组,每个客户一组,每个日期都有一个值,并且任何直接在NULL操作之后的行都会获得相同的组
这是客户4 的数据
Contract Customer Date c Action
Contract1 Customer4 2017-01-01 1 +
Contract1 Customer4 2017-01-02 1 NULL
Contract1 Customer4 2017-01-03 2 -
Contract1 Customer4 2017-01-04 3 +
Contract1 Customer4 2017-01-05 4 -
Contract1 Customer4 2017-01-06 4 NULL
Contract1 Customer4 2017-01-07 4 NULL
Contract1 Customer4 2017-01-08 4 NULL
Contract1 Customer4 2017-01-09 5 +
现在我制作了一个新的CTE,名为DaillyStatus。这个CTE填充NULLS,这样每天都会保存该合同和客户的最新状态,而不是NULL。这意味着,对于表中的每一天,都可以找到每个客户合同组合的状态。要做到这一点,我只需为我刚刚找到的的每个组获取MAX
,DailyStatus as
(
select
[Contract]
,[Customer]
,[Date]
,[Action]
,c
,max([Action]) over (partition by [Contract],[Customer],c) FilledAction
from
FullTable
)
Contract Customer Date c FilledAction Action
Contract1 Customer6 2017-01-01 0 NULL NULL
Contract1 Customer6 2017-01-02 1 + +
Contract1 Customer6 2017-01-03 1 + NULL
Contract1 Customer6 2017-01-04 1 + NULL
Contract1 Customer6 2017-01-05 1 + NULL
Contract1 Customer6 2017-01-06 2 - -
Contract1 Customer6 2017-01-07 2 - NULL
Contract1 Customer6 2017-01-08 2 - NULL
Contract1 Customer6 2017-01-09 2 - NULL
使用此表,我们可以获得表中每个日期的每个客户的状态。由于'+'>'-'>NULL,我们可以找到所有客户都有'-'作为最新操作或在该日期或之前没有操作的日期(NULL)
select
[Contract]
,[Date]
,max(FilledAction)
from DailyStatus
group by [Contract],[Date]
having max(FilledAction) ='-'
完整的解决方案在这里:
,FullTable as
(
select
a.[Contract]
,a.[Customer]
,b.[Date]
,c.[Action]
,count(c.[Action]) over (partition by a.[Contract],a.[Customer] order by b.[Date]) c
from
(select distinct
[Contract],
[Customer]
from T1) a
inner join
(select distinct
[Contract],
[Date]
from T1) b
on a.[Contract]=b.[Contract]
left join t1 c
on c.[Contract]=a.[Contract] and a.[Customer]=c.[Customer] and b.[Date]=c.[Date]
)
,DailyStatus as
(
select
[Contract]
,[Customer]
,[Date]
,[Action]
,c
,max([Action]) over (partition by [Contract],[Customer],c) FilledAction
from
FullTable
)
select
[Contract]
,[Date]
,max(FilledAction)
from DailyStatus
group by [Contract],[Date]
having max(FilledAction) ='-'
我认为您可以这样使用ROW_NUMBER()
:
;with tt as (
select T1.[Contract], T1.[Date], T1.[Action], t.[Customer], t.[Action] lAction, t.[Date] lDate
-- this `rn` will give me the last action for each other customer older that each Date
, row_number() over (partition by T1.[Contract], T1.[Date], t.[Customer] order by t.[Date] desc) rn
from T1
-- I use this self left join to gather data with:
left join T1 t
on T1.[Contract] = t.[Contract] -- same Contract
and T1.[Date] > t.[Date] -- older than current date
and T1.[Customer] != t.[Customer] -- for other customers
-- So I will have actions of other customer older than each date
)
select [Contract], [Date]
from T1
-- I just check if there is not any data in `tt` with:
where not exists(
select 1
from tt
where tt.[Contract] = T1.[Contract] -- same contract
and tt.[Date] = T1.[Date] -- same date
and rn = 1 -- only last action
and (T1.[Action] = '+' -- current customer's action is '+'
or isnull(lAction, '+') = '+') -- or others last actions is '+'
)
group by [Contract], [Date];
多亏了这些答案,我想出了自己的解决方案,它和Søren的一样正确,也和shA.t的一样快。谢谢大家!
,[SelfJoinSet] as (
select
T1.[Contract], -- Contract from the base
T1.[Date], -- Date from the base
T2.[Customer], -- Customers from the current contract
T2.[Action], -- ActionType of the customer
-- RowNumber in order to get the last record for each customer within the current contract:
[RN] = row_number() over (partition by T2.[Customer],T1.[Date] order by T2.[Date] desc)
from T1
left join T1 as T2 -- SelfJoin
on T1.[Contract] = T2.[Contract] -- Within the current contract
and T1.[Date] >= T2.[Date] -- Get all records till the current date
where T1.[Action] = '-' --Filter out all '+' actions, we'll get only '-' records anyway (for performance optimization reasons).
)
select
[Contract],
[Date]
from [SelfJoinSet]
where [RN] = 1 --Show only last record per each customer
group by [Contract],[Date] -- Collapse all to the base table records
having count(distinct iif([Action] = '-',[Customer],NULL)) = count(distinct [Customer]) -- Leave only records where the last action is '-' for all customers