Counting records that fall within recurring 30-day periods



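I need to count the calls that fall within 30-day periods, based on a starting point @StartDate. For example, suppose I have a table with a phone #, a datetime, and RN, which is a ROW_NUMBER function partitioned by number and ordered by date. Then suppose I have @StartDate = '1/1/2018'.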

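So the 30-day count would begin at the first row, 2019-11-21 13:15:27.759. The first 3 rows fall within 30 days of each other, so they are counted together. Next, rows 4 through 6 would be grouped together. Then 7 through 11. Moving down to row 12, rows 12 through 15 would be grouped together because they are within 30 days. Looking only at these rows, up to row 15, I would expect to see a count of 4 for this phone number.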

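I tried combining ROW_NUMBER with a condition specifying DATEDIFF(DAY, a.InitiatedDateTimeUTC, u.InitiatedDateTimeUTC) BETWEEN 0 AND 30 in the context of a self-join, but my numbers just aren't coming out right. What is the best way to get this data? This is an odd request I received, and I have never aggregated data this way before, so it has me a bit stumped.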

create table #data (phone varchar(20), InitiatedDateTimeUTC datetime, rn tinyint)
insert into #data values
('555-555-5555',    '2019-11-21 13:15:27.759',  '1'),
('555-555-5555',    '2019-11-21 13:13:48.121',  '2'),
('555-555-5555',    '2019-11-11 19:46:38.511',  '3'),
('555-555-5555',    '2018-08-29 13:48:26.796',  '4'),
('555-555-5555',    '2018-08-10 21:42:21.037',  '5'),
('555-555-5555',    '2018-08-10 21:25:52.948',  '6'),
('555-555-5555',    '2018-04-25 19:55:04.158',  '7'),
('555-555-5555',    '2018-04-25 19:54:01.818',  '8'),
('555-555-5555',    '2018-04-25 19:49:55.074',  '9'),
('555-555-5555',    '2018-04-24 20:57:27.768',  '10'),
('555-555-5555',    '2018-04-24 20:44:42.660',  '11'),
('555-555-5555',    '2018-03-19 15:56:12.395',  '12'),
('555-555-5555',    '2018-02-23 17:49:14.203',  '13'),
('555-555-5555',    '2018-02-23 17:44:16.258',  '14'),
('555-555-5555',    '2018-02-19 17:31:34.654',  '15')
Attempt:
DROP TABLE #AllCalls
SELECT RemoteID, InitiatedDateTimeUTC,
       row_id = ROW_NUMBER() OVER (PARTITION BY RemoteID
                                   ORDER BY InitiatedDateTimeUTC ASC)
INTO #AllCalls
FROM InteractionSummary
WHERE ConnectionType = 1 AND Direction = 1
  AND InitiatedDateTimeUTC > '1/1/2017' AND RemoteID IS NOT NULL

SELECT a.*, DATEDIFF(DAY, a.InitiatedDateTimeUTC, u.InitiatedDateTimeUTC)
FROM #AllCalls a
LEFT JOIN #AllCalls u ON (a.RemoteID = u.RemoteID
    AND a.row_id + 1 = u.row_id
    AND DATEDIFF(DAY, a.InitiatedDateTimeUTC, u.InitiatedDateTimeUTC) BETWEEN 0 AND 30)
WHERE u.row_id IS NULL --Only unique
ORDER BY 1,2

Desired result:

RemoteID        Count
555-555-5555    4

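If I understand correctly, you want to count the number of gaps of 30 days or more between calls for a phone number. You can simply use lag() and conditional aggregation: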

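select d.phone,
       -- a call starts a new group when it is the first one (prev is NULL)
       -- or when the previous call is 30 or more days earlier
       sum(case when prev_InitiatedDateTimeUTC > dateadd(day, -30, InitiatedDateTimeUTC) then 0 else 1 end) as RemoteIDCount
from (select d.*,
             lag(InitiatedDateTimeUTC) over (partition by phone order by InitiatedDateTimeUTC) as prev_InitiatedDateTimeUTC
      from data d  -- the question's temp table is #data
     ) d
group by d.phone;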

Here is a db<>fiddle.
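One caveat: lag() opens a new group only when two consecutive calls are 30 or more days apart, so a phone that calls every 20 days for a year would count as a single group. If each period should instead be a fixed 30-day window anchored at its first call, a recursive CTE is one way to walk the calls in order. The following is only a minimal sketch under assumptions: it reads the #data table from the question, treats windows as running forward from the first call on or after @StartDate, and uses illustrative names (calls, walk, seq, WindowStart) that do not come from the question.

```sql
-- Sketch only: assumes the #data table from the question and forward-looking
-- windows, each anchored at the first call the previous window does not cover.
DECLARE @StartDate datetime = '20180101';

WITH calls AS (
    -- Number each phone's calls oldest-first, ignoring anything before @StartDate.
    SELECT phone,
           InitiatedDateTimeUTC,
           ROW_NUMBER() OVER (PARTITION BY phone
                              ORDER BY InitiatedDateTimeUTC) AS seq
    FROM #data
    WHERE InitiatedDateTimeUTC >= @StartDate
),
walk AS (
    -- Anchor member: the first call opens the first window.
    SELECT phone, seq, InitiatedDateTimeUTC AS WindowStart
    FROM calls
    WHERE seq = 1

    UNION ALL

    -- Recursive member: stay in the current window while the next call is
    -- less than 30 days after its start; otherwise that call opens a new one.
    SELECT c.phone,
           c.seq,
           CASE WHEN c.InitiatedDateTimeUTC < DATEADD(DAY, 30, w.WindowStart)
                THEN w.WindowStart
                ELSE c.InitiatedDateTimeUTC
           END
    FROM walk AS w
    JOIN calls AS c
      ON c.phone = w.phone
     AND c.seq = w.seq + 1
)
SELECT phone,
       COUNT(DISTINCT WindowStart) AS RemoteIDCount
FROM walk
GROUP BY phone
OPTION (MAXRECURSION 0);  -- lift the default limit of 100 recursion levels
```

Traced by hand against the sample rows, the windows open on 2018-02-19, 2018-04-24, 2018-08-10 and 2019-11-11, so the count for 555-555-5555 is 4, the same as the lag() query. The recursion steps through one call at a time, so on a large table it will likely be much slower than lag(); use it only if the fixed-window semantics are what you need.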

This may or may not meet your needs, but as far as I can tell it hits all the points you raised.

SELECT  t.Grouping
       ,t.phone
       ,COUNT(1) AS CallCount
FROM    (
            SELECT  DISTINCT
                    CAST(t.MonthGroup AS VARCHAR)+'-'+CAST(t.YearGroup AS VARCHAR) AS Grouping
                   ,d.phone
                   ,d.InitiatedDateTimeUTC
                   ,d.rn
            FROM    #data AS d
            INNER JOIN (
                        SELECT   d.InitiatedDateTimeUTC AS FirstDate
                                ,d2.InitiatedDateTimeUTC AS LastDate
                                ,MONTH(d.InitiatedDateTimeUTC) AS MonthGroup
                                ,YEAR(d.InitiatedDateTimeUTC) AS YearGroup
                                ,DATEDIFF(DAY, d.InitiatedDateTimeUTC, GETDATE()) AS Age
                        FROM     #data AS d
                        CROSS JOIN #data AS d2
                        WHERE    DATEDIFF(DAY, d.InitiatedDateTimeUTC, d2.InitiatedDateTimeUTC) <= 30
                                 AND DATEDIFF(DAY, d.InitiatedDateTimeUTC, d2.InitiatedDateTimeUTC) > 0
                       ) AS t ON (
                                  d.InitiatedDateTimeUTC = t.FirstDate
                                  OR d.InitiatedDateTimeUTC = t.LastDate
                                 )
            WHERE   1 = 1
                    AND d.InitiatedDateTimeUTC > @StartDate
        ) AS t
GROUP BY t.Grouping
        ,t.phone

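Here is a possible solution you can try.

It makes use of a numbers/tally table. I am generating one on the fly for this example, but you would ideally have a permanent table available.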

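-- numbers: 1..365 generated on the fly (covers only 365 days past the earliest call)
with numbers as (select top(365) Row_Number() over(order by (select null)) as n from sys.messages),
-- dates: one boundary per day, counted from the earliest call
dates as (select n, DateAdd(day, n, (select Min(InitiatedDateTimeUTC) from #data)) as date from numbers),
-- datelist: pair each boundary with the next to form a one-day interval
datelist as (select *, Lead(date) over(order by n) as ndate from dates)
-- count the distinct 30-day buckets that contain at least one call
select d.phone, Count(distinct Floor(n/30.0)) as [Count]
from datelist dl
join #data d on d.InitiatedDateTimeUTC >= dl.date and d.InitiatedDateTimeUTC < dl.ndate
group by d.phone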

See the working fiddle.
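For the permanent numbers table mentioned above, something along these lines could be created once and reused. This is a sketch under assumptions: the name dbo.Numbers and the size of 10,000 rows are placeholders chosen for illustration, not anything from the question.

```sql
-- Sketch only: dbo.Numbers is a hypothetical name, not an object from the question.
CREATE TABLE dbo.Numbers (n int NOT NULL PRIMARY KEY);

-- Fill it with 1..10000; cross-joining a system view with itself is just a
-- convenient way to get enough rows to number.
INSERT INTO dbo.Numbers (n)
SELECT TOP (10000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM sys.all_objects AS a
CROSS JOIN sys.all_objects AS b;
```

With that in place, the numbers CTE in the query could be replaced by a plain select n from dbo.Numbers where n <= 365, raising the upper bound if the calls span more than a year.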
