返回每个患者每月过去7次就诊中就诊次数最多的提供者



目前有一个"访问"数据集,包含日期、患者、提供者、预约级别(该患者的预约序号)和预约/提供者级别(该患者/提供者的预约序号)

需要使用它在每个月的第一天(2020年1月1日开始)进行回顾,对于每个患者,查看之前的7次就诊(没有距离限制),并获取该月与该患者关联的就诊次数最多的提供者。

如果患者就诊少于7次,仍然适用此逻辑。如果患者的就诊次数为偶数,并且导致这些提供者的数量相等,那么获胜者将是最近的一个。患者在该月期间可能没有任何预约。

例如病人A的基本数据是这样的。

ApptDate | Patient | Provider | ApptRank | ApptProvRank |
1/15/2020|    A    |    BOB   |    1     |      1       |
3/01/2020|    A    |    BOB   |    2     |      2       |
3/08/2020|    A    |    BOB   |    3     |      3       |
3/20/2020|    A    |    BOB   |    4     |      4       |
4/01/2020|    A    |    BOB   |    5     |      5       |
4/15/2020|    A    |    SMITH |    6     |      1       |
6/07/2020|    A    |    SMITH |    7     |      2       |
6/21/2020|    A    |    BOB   |    8     |      6       |
7/01/2020|    A    |    JANE  |    9     |      1       |
7/15/2020|    A    |    JANE  |    10    |      2       |
7/21/2020|    A    |    JANE  |    11    |      3       |
8/01/2020|    A    |    JANE  |    12    |      4       |
8/15/2020|    A    |    JANE  |    13    |      5       |
9/20/2020|    A    |    JANE  |    14    |      6       |

然后,我有一个流程来填充缺失的月份,并计算当月每个提供商的应用程序数量,看起来像这样…

Month    | Patient | Provider |   Appts    |Cumulative Appt Rank | Cumulative Appt Provider Rank
1/1/2020 |     A   |    BOB   |     1      |   1                 |   1
2/1/2020 |     A   |    NULL  |    NULL    |  NULL               |   NULL
3/1/2020 |     A   |    BOB   |     3      |   4                 |   4
4/1/2020 |     A   |    BOB   |     1      |   5                 |   5
4/1/2020 |     A   |    SMITH |     1      |   6                 |   1
5/1/2020 |     A   |    NULL  |    NULL    |  NULL               |   NULL
6/1/2020 |     A   |    SMITH |     1      |   7                 |   2
6/1/2020 |     A   |    BOB   |     1      |   8                 |   6
7/1/2020 |     A   |    JANE  |     3      |   11                |   3
8/1/2020 |     A   |    JANE  |     2      |   13                |   5
9/1/2020 |     A   |    JANE  |     2      |   14                |   6

我想要的是看起来像这样的东西:(一个月一个供应商和应用程序的份额在之前的7 -不应该高于7)

Month    | Patient | Provider | Appt Count (of Past 7)
2/1/2020 |     A   |    BOB   |     1      |            ----> starting 2/1/2020 for the 'lookback'
3/1/2020 |     A   |    BOB   |     1      |
4/1/2020 |     A   |    BOB   |     4      |
5/1/2020 |     A   |    BOB   |     5      |  
6/1/2020 |     A   |    BOB   |     5      |  
7/1/2020 |     A   |    BOB   |     5      |  
8/1/2020 |     A   |    JANE  |     3      |  
9/1/2020 |     A   |    JANE  |     5      |  
10/1/2020|     A   |    JANE  |     6      |

这将涉及多个患者,但为了简化本例,它只有一个。非常感谢对这个问题的任何帮助

将示例数据转换为DDL/DML后:

DECLARE @Visits TABLE (ID INT IDENTITY, ApptDate DATETIME, Patient NVARCHAR(10), Provider NVARCHAR(10), ApptRank INT, ApptProvRank INT);
INSERT INTO @Visits (ApptDate, Patient, Provider, ApptRank, ApptProvRank) VALUES
('1/15/2020',    'A',    'BOB   ',    1 ,      1),
('3/01/2020',    'A',    'BOB   ',    2 ,      2),
('3/08/2020',    'A',    'BOB   ',    3 ,      3),
('3/20/2020',    'A',    'BOB   ',    4 ,      4),
('4/01/2020',    'A',    'BOB   ',    5 ,      5),
('4/15/2020',    'A',    'SMITH ',    6 ,      1),
('6/07/2020',    'A',    'SMITH ',    7 ,      2),
('6/21/2020',    'A',    'BOB   ',    8 ,      6),
('7/01/2020',    'A',    'JANE  ',    9 ,      1),
('7/15/2020',    'A',    'JANE  ',    10,      2),
('7/21/2020',    'A',    'JANE  ',    11,      3),
('8/01/2020',    'A',    'JANE  ',    12,      4),
('8/15/2020',    'A',    'JANE  ',    13,      5),
('9/20/2020',    'A',    'JANE  ',    14,      6),
('1/15/2021',    'B',    'BOB   ',    1 ,      1),
('3/01/2021',    'B',    'BOB   ',    2 ,      2),
('3/08/2021',    'B',    'BOB   ',    3 ,      3),
('3/20/2021',    'B',    'BOB   ',    4 ,      4),
('4/01/2021',    'B',    'BOB   ',    5 ,      5),
('4/15/2021',    'B',    'SMITH ',    6 ,      1),
('6/07/2021',    'B',    'SMITH ',    7 ,      2),
('6/21/2021',    'B',    'JANE  ',    8 ,      6),
('7/01/2021',    'B',    'JANE  ',    9 ,      1),
('7/15/2021',    'B',    'JANE  ',    10,      2),
('7/21/2021',    'B',    'JANE  ',    11,      3),
('8/01/2021',    'B',    'JANE  ',    12,      4),
('8/15/2021',    'B',    'JANE  ',    13,      5),
('9/20/2021',    'B',    'JANE  ',    14,      6);

我们可以通过窗口函数查看这些数据,以确定患者在过去7次就诊(无论持续时间如何)中最受欢迎的提供者,然后连接回它以查看每次预约的7次就诊窗口中最受欢迎的提供者。

;WITH top7VisitProvider AS (
SELECT patient, a.ApptRank, a.v7Provider, a.ProviderLast7Cnt, ROW_NUMBER() OVER (PARTITION BY a.Patient, a.ApptRank ORDER BY a.ProviderLast7Cnt DESC) AS rn
FROM (
SELECT v.ID, v.ApptDate, v.Patient, v.Provider, v.ApptRank, v.ApptProvRank, vr.Provider AS v7Provider, COUNT(*) AS ProviderLast7Cnt
FROM @Visits v
LEFT OUTER JOIN @Visits vr
ON v.Patient = vr.Patient
AND vr.ApptRank BETWEEN v.ApptRank-7 AND v.ApptRank
GROUP BY v.ID, v.ApptDate, v.Patient, v.Provider, v.ApptRank, v.ApptProvRank, vr.Provider
) a
)
SELECT v.ID, v.ApptDate, v.Patient, v.Provider, v.ApptRank, v.ApptProvRank, t7vp.v7Provider
FROM @Visits v
INNER JOIN top7VisitProvider t7vp
ON v.Patient = t7vp.Patient
AND v.ApptRank = t7vp.ApptRank
AND t7vp.rn = 1
ID  ApptDate                Patient Provider    ApptRank    ApptProvRank    v7Provider
--------------------------------------------------------------------------------------
1   2020-01-15 00:00:00.000 A       BOB         1           1               BOB   
2   2020-03-01 00:00:00.000 A       BOB         2           2               BOB   
3   2020-03-08 00:00:00.000 A       BOB         3           3               BOB   
4   2020-03-20 00:00:00.000 A       BOB         4           4               BOB   
5   2020-04-01 00:00:00.000 A       BOB         5           5               BOB   
6   2020-04-15 00:00:00.000 A       SMITH       6           1               BOB   
7   2020-06-07 00:00:00.000 A       SMITH       7           2               BOB   
8   2020-06-21 00:00:00.000 A       BOB         8           6               BOB   
9   2020-07-01 00:00:00.000 A       JANE        9           1               BOB   
10  2020-07-15 00:00:00.000 A       JANE        10          2               BOB   
11  2020-07-21 00:00:00.000 A       JANE        11          3               BOB   
12  2020-08-01 00:00:00.000 A       JANE        12          4               JANE  
13  2020-08-15 00:00:00.000 A       JANE        13          5               JANE  
14  2020-09-20 00:00:00.000 A       JANE        14          6               JANE  
15  2021-01-15 00:00:00.000 B       BOB         1           1               BOB   
16  2021-03-01 00:00:00.000 B       BOB         2           2               BOB   
17  2021-03-08 00:00:00.000 B       BOB         3           3               BOB   
18  2021-03-20 00:00:00.000 B       BOB         4           4               BOB   
19  2021-04-01 00:00:00.000 B       BOB         5           5               BOB   
20  2021-04-15 00:00:00.000 B       SMITH       6           1               BOB   
21  2021-06-07 00:00:00.000 B       SMITH       7           2               BOB   
22  2021-06-21 00:00:00.000 B       JANE        8           6               BOB   
23  2021-07-01 00:00:00.000 B       JANE        9           1               BOB   
24  2021-07-15 00:00:00.000 B       JANE        10          2               BOB   
25  2021-07-21 00:00:00.000 B       JANE        11          3               JANE  
26  2021-08-01 00:00:00.000 B       JANE        12          4               JANE  
27  2021-08-15 00:00:00.000 B       JANE        13          5               JANE  
28  2021-09-20 00:00:00.000 B       JANE        14          6               JANE   

让我们从创建一些示例表开始。我选择创建一个包含"引用年月"的表,另外还有一列表示最古老的历史日期。这个历史列不是必需的,但是我把它放在这里是为了简化查询。

DROP TABLE IF EXISTS bu_datalake.ReferenceDate;
CREATE table bu_datalake.ReferenceDate
(
MonthlyReferenceDate           Date,
MaxMonthlyReferenceDateHistory Date
);   
INSERT 
INTO bu_datalake.ReferenceDate
VALUES ('2022-01-01',ADD_MONTHS('2022-01-01',-7)), 
('2022-02-01',ADD_MONTHS('2022-02-01',-7)),
('2022-03-01',ADD_MONTHS('2022-03-01',-7)),
('2022-04-01',ADD_MONTHS('2022-04-01',-7)),
('2022-05-01',ADD_MONTHS('2022-05-01',-7)),
('2022-06-01',ADD_MONTHS('2022-06-01',-7)),
('2022-07-01',ADD_MONTHS('2022-07-01',-7)),
('2022-08-01',ADD_MONTHS('2022-08-01',-7)),
('2022-09-01',ADD_MONTHS('2022-09-01',-7)),
('2022-10-01',ADD_MONTHS('2022-10-01',-7)),
('2022-11-01',ADD_MONTHS('2022-11-01',-7)),
('2022-12-01',ADD_MONTHS('2022-12-01',-7));
DROP TABLE IF EXISTS bu_datalake.PatientProvider;
CREATE table bu_datalake.PatientProvider
(
ApptDate Date,
Patient  String,
Provider String
);

INSERT 
INTO bu_datalake.PatientProvider
VALUES
('2022-01-15','A','BOB'),   
('2022-03-01','A','BOB'),   
('2022-03-08','A','BOB'),   
('2022-03-20','A','BOB'),   
('2022-04-01','A','BOB'),   
('2022-04-15','A','SMITH'), 
('2022-06-07','A','SMITH'), 
('2022-06-21','A','BOB'),   
('2022-07-01','A','JANE'),  
('2022-07-15','A','JANE'),  
('2022-07-21','A','JANE'),  
('2022-08-01','A','JANE'),  
('2022-08-15','A','JANE'),  
('2022-09-20','A','JANE')     

下面是创建的表的内容:

SELECT * FROM bu_datalake.ReferenceDate;
MonthlyReferenceDate    MaxMonthlyReferenceDateHistory
----------------------------------------------------
2022-01-01            2021-06-01
2022-02-01            2021-07-01
2022-03-01            2021-08-01
2022-04-01            2021-09-01
2022-05-01            2021-10-01
2022-06-01            2021-11-01
2022-07-01            2021-12-01
2022-08-01            2022-01-01
2022-09-01            2022-02-01
2022-10-01            2022-03-01
2022-11-01            2022-04-01
2022-12-01            2022-05-01
SELECT * FROM bu_datalake.PatientProvider;
ApptDate        Patient   Provider
-----------------------------------------------------
2022-01-15    A         BOB         
2022-03-01    A         BOB         
2022-03-08    A         BOB         
2022-03-20    A         BOB         
2022-04-01    A         BOB         
2022-04-15    A         SMITH       
2022-06-07    A         SMITH       
2022-06-21    A         BOB         
2022-07-01    A         JANE        
2022-07-15    A         JANE        
2022-07-21    A         JANE        
2022-08-01    A         JANE        
2022-08-15    A         JANE        
2022-09-20    A         JANE        

让我们按参考月合并考勤,按患者和提供者分组。我们正在创建一个列name MostRecentApptDate因为我们可能需要它有相同数量的患者-提供者出席月:

SELECT TRUNC(ApptDate,'MM') As ApptMonthlyDate,        
Patient,                                
Provider,                               
MAX(ApptDate) MostRecentApptDate,       
COUNT(*) PatientProviderOccurences      
FROM bu_datalake.PatientProvider             
GROUP BY TRUNC(ApptDate,'MM'),Patient,Provider
ORDER BY TRUNC(ApptDate,'MM'),Patient,Provider

ApptMonthlyDate Patient Provider    MostRecentApptDate  PatientProviderOccurences 
2022-01-01      A         BOB         2022-01-15            1
2022-03-01      A         BOB         2022-03-20            3
2022-04-01      A         BOB         2022-04-01            1
2022-04-01      A         SMITH     2022-04-15          1
2022-06-01      A         BOB         2022-06-21            1
2022-06-01      A         SMITH     2022-06-07          1
2022-07-01      A         JANE      2022-07-21          3

现在,我要在这个查询和引用月份表之间建立一个连接。这样做的目的是-对于"每个参考月"-我们可以确定哪些记录我必须考虑在我的"排名分类";(下一步)。因为这个查询可以显示重复的记录(因为一个考勤可以在不同的每月排名分类")我将只显示这些结果的一部分(查询应该返回大约61条记录):

SELECT REF.MonthlyReferenceDate,
PAT.ApptMonthlyDate, 
REF.MaxMonthlyReferenceDateHistory,
PAT.MostRecentApptDate,       
PAT.Patient,
PAT.Provider,
PAT.PatientProviderOccurences        
FROM bu_datalake.ReferenceDate REF
JOIN (SELECT TRUNC(ApptDate,'MM') As ApptMonthlyDate,        
Patient,                                
Provider,                               
MAX(ApptDate) MostRecentApptDate,       
COUNT(*) PatientProviderOccurences      
FROM bu_datalake.PatientProvider             
GROUP BY TRUNC(ApptDate,'MM'),Patient,Provider) PAT
WHERE PAT.ApptMonthlyDate BETWEEN REF.MaxMonthlyReferenceDateHistory AND REF.MonthlyReferenceDate
ORDER BY REF.MonthlyReferenceDate,PAT.Patient,PAT.Provider 

MonthlyReferenceDate    ApptMonthlyDate MaxMonthlyReferenceDateHistory  MostRecentApptDate  Patient Provider    PatientProviderOccurences
2022-01-01            2022-01-01        2021-06-01                      2022-01-15          A         BOB         1
2022-02-01            2022-01-01        2021-07-01                      2022-01-15          A         BOB         1
2022-03-01            2022-03-01        2021-08-01                      2022-03-20          A         BOB         3

然后,我将在这个查询中添加一个额外的列,用于对结果进行分类/排序,这样我就可以根据每月参考日期确定哪些是出席人数最多的。这里我们将使用一个名为"ROW_NUMBER">的窗口窗口函数。还有许多其他函数,如RANK,DENSE_RANK我建议你学习这个主题,因为它对解决这样的问题非常有用。同样,我将只提供部分查询结果:

SELECT REP.MonthlyReferenceDate,
REP.Patient,
REP.Provider,
REP.PatientProviderOccurences,
ROW_NUMBER() OVER(PARTITION BY REP.MonthlyReferenceDate ORDER BY REP.PatientProviderOccurences DESC,REP.MostRecentApptDate DESC) AS Rank
FROM (SELECT REF.MonthlyReferenceDate,
PAT.ApptMonthlyDate, 
REF.MaxMonthlyReferenceDateHistory,
PAT.MostRecentApptDate,       
PAT.Patient,
PAT.Provider,
PAT.PatientProviderOccurences        
FROM bu_datalake.ReferenceDate REF
JOIN (SELECT TRUNC(ApptDate,'MM') As ApptMonthlyDate,        
Patient,                                
Provider,                               
MAX(ApptDate) MostRecentApptDate,       
COUNT(*) PatientProviderOccurences      
FROM bu_datalake.PatientProvider             
GROUP BY TRUNC(ApptDate,'MM'),Patient,Provider) PAT
WHERE PAT.ApptMonthlyDate BETWEEN REF.MaxMonthlyReferenceDateHistory AND REF.MonthlyReferenceDate
ORDER BY REF.MonthlyReferenceDate,PAT.Patient,PAT.Provider) REP;

MonthlyReferenceDate    Patient Provider    PatientProviderOccurences   Rank
2022-01-01            A       BOB         1                         1
2022-02-01            A       BOB         1                         1
2022-03-01            A       BOB         3                         1

最后,我将应用FILTER (Rank=1)只显示我需要的结果:

SELECT MonthlyReferenceDate,
Patient,
Provider,
PatientProviderOccurences
FROM        
(
SELECT REP.MonthlyReferenceDate,
REP.Patient,
REP.Provider,
REP.PatientProviderOccurences,
ROW_NUMBER() OVER(PARTITION BY REP.MonthlyReferenceDate ORDER BY REP.PatientProviderOccurences DESC,REP.MostRecentApptDate DESC) AS Rank
FROM (SELECT REF.MonthlyReferenceDate,
PAT.ApptMonthlyDate, 
REF.MaxMonthlyReferenceDateHistory,
PAT.MostRecentApptDate,       
PAT.Patient,
PAT.Provider,
PAT.PatientProviderOccurences        
FROM bu_datalake.ReferenceDate REF
JOIN (SELECT TRUNC(ApptDate,'MM') As ApptMonthlyDate,        
Patient,                                
Provider,                               
MAX(ApptDate) MostRecentApptDate,       
COUNT(*) PatientProviderOccurences      
FROM bu_datalake.PatientProvider             
GROUP BY TRUNC(ApptDate,'MM'),Patient,Provider) PAT
WHERE PAT.ApptMonthlyDate BETWEEN REF.MaxMonthlyReferenceDateHistory AND REF.MonthlyReferenceDate
ORDER BY REF.MonthlyReferenceDate,PAT.Patient,PAT.Provider) REP
)
WHERE RANK =1
ORDER BY MonthlyReferenceDate
MonthlyReferenceDate    Patient Provider    PatientProviderOccurences
2022-01-01            A       BOB         1
2022-02-01            A       BOB         1
2022-03-01            A       BOB         3
2022-04-01            A       BOB         3
2022-05-01            A       BOB         3
2022-06-01            A       BOB         3
2022-07-01            A       JANE      3
2022-08-01            A       JANE      3
2022-09-01            A       JANE      3
2022-10-01            A       JANE      3
2022-11-01            A       JANE      3
2022-12-01            A       JANE      3
我希望这些查询可以帮助你解决你的问题。根据您的数据库,可能需要应用不同的函数。

相关内容

  • 没有找到相关文章

最新更新