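I have two tables, person and violations. The person table has the columns id, name, and DOB, while the violations table has violation_id and person_id.
(violation_id is not a key, because a single incident can involve several people.)
I want to find the average number of violations for each age group: 18-40, 41-65, and 66+. This is my first attempt, but it returns the same result for every group. Can anyone guide me toward a solution? Thanks in advance.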
WITH ages AS
(
    select EXTRACT(YEAR FROM age(cast(DOB as date))) as age
    from person
)
SELECT
    avg(violation_count) AS avg_violation_count,
    case
        when age between 18 and 40 then 1
        when age between 41 and 65 then 2
        when age >= 66 then 3
    end as age_category
FROM (Select count(violation_id) as violation_count
      from violations
      group by violation_id) V, ages
natural join violations
group by age_category
;
You need a proper join between the two tables, and I would simply advise you never to use natural join. Forget that it exists.
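As for why the attempt in the question returns the same number for every group: in PostgreSQL, NATURAL JOIN matches on every column name the two inputs share, and when they share none it behaves like a cross join. The ages CTE exposes only age, which violations does not have, so every age is paired with every violation. A minimal sketch with made-up tables (ages_demo and violations_demo are hypothetical names, used only for illustration):

```sql
-- Hypothetical toy tables, only to show the NATURAL JOIN behaviour.
create temp table ages_demo (age int);
create temp table violations_demo (violation_id int, person_id int);

insert into ages_demo values (25), (50);
insert into violations_demo values (1, 10), (1, 11), (2, 10);

-- No column name is shared, so this pairs every age with every violation:
-- 2 rows * 3 rows = 6 rows.
select count(*) from ages_demo natural join violations_demo;
```

An explicit join condition such as on p.id = v.person_id cannot silently degrade into a cross product like this.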
If you want to count the violations in each age group, then something like this:
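select (case when age < 18 then '< 18'
             when age <= 40 then 'between 18 and 40'
             when age <= 65 then 'between 41 and 65'
             else '> 65'
        end) as age_category,
       count(*)
from violations v join
     -- derive age from DOB: the question's person table has id and DOB, not person_id and age
     (select id, extract(year from age(cast(DOB as date))) as age
      from person
     ) p
     on p.id = v.person_id
group by age_category ;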
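Of course, you can just use numbers. However, this version uses strings so that the values are self-explanatory, and it includes a bucket for people under 18 in case they have any violations.
If you want each group's proportion of all violations (which is not an "average"), then you can just use window functions: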
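select (case when age < 18 then '< 18'
             when age <= 40 then 'between 18 and 40'
             when age <= 65 then 'between 41 and 65'
             else '> 65'
        end) as age_category,
       count(*),
       -- share of all violations: group count divided by the total over all groups
       count(*) * 1.0 / sum(count(*)) over ()
from violations v join
     -- derive age from DOB: the question's person table has id and DOB, not person_id and age
     (select id, extract(year from age(cast(DOB as date))) as age
      from person
     ) p
     on p.id = v.person_id
group by age_category ;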
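Neither query above is a per-person average. If the goal really is the average number of violations per person in each age group, one possible sketch is to count violations per person first and then average those counts within each group. This assumes PostgreSQL, the person(id, DOB) and violations(violation_id, person_id) tables from the question, that each row in violations counts as one violation for that person, and that people with no violations should count as zero (hence the left join). The per_person name is made up for illustration.

```sql
-- Sketch only: assumes PostgreSQL and the question's person / violations tables.
with per_person as (
    -- One row per person: age in whole years and number of violation rows.
    -- The left join keeps people with no violations (their count is 0).
    select p.id,
           extract(year from age(cast(p.DOB as date))) as age,
           count(v.person_id) as violation_count
    from person p
    left join violations v on v.person_id = p.id
    group by p.id, p.DOB
)
select (case when age < 18 then '< 18'
             when age <= 40 then 'between 18 and 40'
             when age <= 65 then 'between 41 and 65'
             else '> 65'
        end) as age_category,
       avg(violation_count) as avg_violation_count
from per_person
group by age_category;
```

If only people with at least one violation should be averaged, switching the left join to an inner join would do that. Note that a person with a NULL DOB falls through to the else branch here, so you may want to filter those rows out.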
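Here are the steps:
- First, count the incidents in each group
- Then use a subquery to compute the percentage
You need to write the query like this: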
select
    age_category, count_, (count_*100)/sum(count_) over() "avg_violation_count"
from (
    select
        case
            when EXTRACT(YEAR FROM age(cast(DOB as date))) between 18 and 40 then 1
            when EXTRACT(YEAR FROM age(cast(DOB as date))) between 41 and 65 then 2
            when EXTRACT(YEAR FROM age(cast(DOB as date))) >= 66 then 3
        end as age_category, count(*) "count_"
    from
        person t1 inner join violations t2 on t1.id=t2.person_id
    group by 1
) tab