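I need some help writing a query that returns, for each month, the year-to-date (YTD) distinct count of active memberids. For example:
202201 - unique memberids active in 202201
202202 - unique memberids active between 202201 and 202202
202203 - unique memberids active between 202201 and 202203
The data is structured like the sample below.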
memberid | yearmonth | activestatus
---|---|---
1 | 202201 | Y
1 | 202202 | Y
1 | 202203 | N
2 | 202201 | N
2 | 202202 | N
2 | 202203 | Y
3 | 202201 | N
3 | 202202 | Y
3 | 202203 | Y
We can express the logic neatly with two levels of aggregation. This may be preferable to using row_number():
select yearmonth, sum(count(*)) over(order by yearmonth) running_cnt
from (
select min(yearmonth) yearmonth from mytable where activestatus = 'Y' group by memberid
) t
group by yearmonth
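The inner query gets the earliest active yearmonth for each member; the outer query then counts those first activations per month and computes a running sum.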
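
As a quick sanity check of that logic outside the database, here is a minimal Python sketch that applies the same two steps to the sample rows from the question. The `rows` list and the helper name `running_distinct_active` are made up for illustration; it mirrors the query rather than replacing it.

```python
from collections import Counter
from itertools import accumulate

# Sample rows from the question: (memberid, yearmonth, activestatus)
rows = [
    (1, 202201, "Y"), (1, 202202, "Y"), (1, 202203, "N"),
    (2, 202201, "N"), (2, 202202, "N"), (2, 202203, "Y"),
    (3, 202201, "N"), (3, 202202, "Y"), (3, 202203, "Y"),
]

def running_distinct_active(rows):
    # Step 1 (inner query): earliest active yearmonth per member
    first_active = {}
    for memberid, yearmonth, status in rows:
        if status == "Y":
            first_active[memberid] = min(yearmonth, first_active.get(memberid, yearmonth))
    # Step 2 (outer query): count first activations per month, then take a running sum
    per_month = Counter(first_active.values())
    months = sorted(per_month)
    totals = accumulate(per_month[m] for m in months)
    return list(zip(months, totals))

print(running_distinct_active(rows))  # [(202201, 1), (202202, 2), (202203, 3)]
```

Because each member contributes exactly one row after step 1, the running sum never counts a member twice, which is what makes the result a distinct count.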
Normally you would also have a calendar table that holds all possible yearmonth values, which you can use like this:
select c.yearmonth, sum(count(t.yearmonth)) over(order by c.yearmonth) running_cnt
from calendar c
left join (
select min(yearmonth) yearmonth from mytable where activestatus = 'Y' group by memberid
) t on t.yearmonth = c.yearmonth
group by c.yearmonth
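
The point of the calendar join is that months in which no member becomes active for the first time still appear in the result, carrying the previous total forward. A small Python sketch of that effect follows; the data here is hypothetical (not the sample from the question), and `running_count_with_calendar` is an illustrative name.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical data: nobody becomes active for the first time in 202202
first_active = {1: 202201, 2: 202203}   # memberid -> earliest active yearmonth
calendar = [202201, 202202, 202203]     # stand-in for the calendar table

def running_count_with_calendar(calendar, first_active):
    new_per_month = Counter(first_active.values())
    months = sorted(calendar)
    # A Counter yields 0 for a missing month, much like count(t.yearmonth)
    # yields 0 for a calendar row with no match in the left join
    totals = accumulate(new_per_month[m] for m in months)
    return list(zip(months, totals))

print(running_count_with_calendar(calendar, first_active))
# [(202201, 1), (202202, 1), (202203, 2)]
```

Without the calendar, 202202 would simply be missing from the output.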
We are looking for a distinct running count. We can get it with row_number() and count().
select distinct yearmonth
,count(rn) over(order by yearmonth) as distinct_running_count
from
(
select *
,case when row_number() over(partition by memberid order by yearmonth) = 1 then 1 end as rn
from t
where activestatus = 'Y'
) t
yearmonth | distinct_running_count
---|---
202201 | 1
202202 | 2
202203 | 3
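
To try the row_number() query on the sample data without a database server, one option is SQLite through Python's standard `sqlite3` module. This is only a sketch under some assumptions: the bundled SQLite must be version 3.25 or later (for window function support), the table is named `t` as in the query, and an `order by` has been added so the row order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (memberid int, yearmonth int, activestatus text)")
conn.executemany(
    "insert into t values (?, ?, ?)",
    [
        (1, 202201, "Y"), (1, 202202, "Y"), (1, 202203, "N"),
        (2, 202201, "N"), (2, 202202, "N"), (2, 202203, "Y"),
        (3, 202201, "N"), (3, 202202, "Y"), (3, 202203, "Y"),
    ],
)

query = """
select distinct yearmonth
      ,count(rn) over(order by yearmonth) as distinct_running_count
from (
    select *
          -- flag only each member's first active month; later months get NULL
          ,case when row_number() over(partition by memberid order by yearmonth) = 1 then 1 end as rn
    from t
    where activestatus = 'Y'
) t
order by yearmonth  -- added here for a deterministic row order
"""

result = conn.execute(query).fetchall()
print(result)  # [(202201, 1), (202202, 2), (202203, 3)]
```

count(rn) ignores the NULLs, so only first activations add to the running total; the distinct then collapses the duplicate rows produced for months with several active members.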