如何使用SQL计算列中非连续值的数量?



在这里跟进我的问题。假设我在 Oracle 数据库中有一个表,如下所示 (table_1(,用于跟踪特定个人的服务参与情况:

name  day  srvc_ inv
bill  1  1
bill  2  1
bill  3  0
bill  4  0
bill  5  1
bill  6  0
susy  1  1
susy  2  0
susy  3  1
susy  4  0
susy  5  1

我的目标是获得一个汇总表,其中列出了所有唯一个人是否参与服务以及不同服务集的数量(在本例中为 2 个用于账单,3 个用于 susy(,其中不同的服务集由几天内的活动中断来标识。

要获得任何服务参与,我将使用以下查询

SELECT table_1."Name", MAX(table_1."Name") AS "any_invl"
FROM table_1
GROUP BY table_1."Name"

但是,我无法获得服务参与的数量(2(。在 R 中使用静态数据帧,您将使用运行长度编码(请参阅我最初的问题(,但我不知道如何在 SQL 中完成此操作。此操作将对大量记录运行,因此将整个数据帧存储为对象,然后在 R 中运行它是不切实际的。

编辑:我期望的输出如下:

name  any_invl  n_srvc_inv
bill  1  2
susy  1  3

感谢您的任何帮助!

像这样的东西?

SQL> with test (name, day, srvc_inv) as
2    (select 'bill', 1, 1 from dual union all
3     select 'bill', 2, 1 from dual union all
4     select 'bill', 3, 0 from dual union all
5     select 'bill', 4, 0 from dual union all
6     select 'bill', 5, 1 from dual union all
7     select 'bill', 6, 0 from dual union all
8     select 'susy', 1, 1 from dual union all
9     select 'susy', 2, 0 from dual union all
10     select 'susy', 3, 1 from dual union all
11     select 'susy', 4, 0 from dual union all
12     select 'susy', 5, 1 from dual
13    ),
14  inter as
15    (select name, day, srvc_inv,
16       nvl(lead(srvc_inv) over (partition by name order by day), 0) lsrvc
17     from test
18    )
19  select name,
20    sum(case when srvc_inv <> lsrvc and lsrvc = 0 then 1
21             else 0
22        end) grp
23  from inter
24  group by name;
NAME        GRP
---- ----------
bill          2
susy          3
SQL>

我建议使用lag(). 这个想法是计算"1",但仅当前面的值为零或null时:

select name, count(*)
from (select t.*,
lag(srvc_inv) over (partition by name order by day) as prev_srvc_inv
from t
) t
where (prev_srvc_inv is null or prev_srvc_inv = 0) and
srvc_inv = 1
group by name;

您可以通过使用lag()的默认值来简化这一点:

select name, count(*)
from (select t.*,
lag(srvc_inv, 1, 0) over (partition by name order by day) as prev_srvc_inv
from t
) t
where prev_srvc_inv = 0 and srvc_inv = 1
group by name;

您可以尝试以下查询,具有LAG函数来处理srvc_invl的变化

select name, 1 any_invl, count(case when diff = 1 then 1 end) n_srvc_inv
from (select name, day, srvc_inv - LAG(srvc_inv, 1, 0) OVER(ORDER BY name, day) diff
from tab
order by name, day) temp
group by name

这是小提琴供您参考。

相关内容

  • 没有找到相关文章

最新更新