PostgreSQL avg(count) per group

Question

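I have two tables, person and violations. The person table has the columns id, name, DOB; the violations table has the columns violation_id, person_id.

(violation_id is not a key, because one incident can involve several people.)

I want to find the average number of violations for each age group: 18-40, 41-65, and 66+. Below is my first attempt, which returns the same result for every group. Can anyone guide me on how to fix it? Thanks in advance.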

WITH ages AS (
    select EXTRACT(YEAR FROM age(cast(DOB as date))) as age
    from person
)
SELECT
    avg(violation_count) AS avg_violation_count,
    case
        when age between 18 and 40 then 1
        when age between 41 and 65 then 2
        when age >= 66 then 3
    end as age_category
FROM (Select count(violation_id) as violation_count
      from violations
      group by violation_id) V, ages
natural join violations
group by age_category
;

Tags: sql, postgresql

Answer


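You need a join between the tables, and I would advise you never to use natural join. Just forget that it exists. In your query, ages has no column name in common with violations, so the natural join degenerates into a cross join (as does the comma after V). Every age is paired with every violation count, which is why each group gets the same average.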
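The cross-join behaviour can be checked with a small sketch. The rows below are made up purely for illustration; only the table and column names follow the question.

```sql
-- Hypothetical sample data standing in for the question's ages CTE
-- and violations table.
with ages(age) as (
    values (25), (50), (70)
),
violations(violation_id, person_id) as (
    values (1, 10), (1, 11), (2, 12)
)
-- The two inputs share no column name, so PostgreSQL treats the
-- natural join as a cross join: 3 ages x 3 violation rows.
select count(*) as row_count
from ages natural join violations;
-- returns 9
```

Because nothing ties an age to a particular person's violations, grouping the result by age category cannot produce different averages per group.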

If you want to count the violations in each age group, it looks like this:

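select (case when age < 18 then '< 18'
             when age <= 40 then 'between 18 and 40'
             when age <= 65 then 'between 41 and 65'
             else '> 65'
        end) as age_category,
       count(*) as violation_count
from violations v join
     -- person has no age column, so derive it from DOB as in the question
     (select id, extract(year from age(cast(DOB as date))) as age
      from person
     ) p
     on p.id = v.person_id
group by age_category ;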

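Of course, you could just use numbers. However, this version uses strings so that the values are self-explanatory, and it includes people under 18 if they have any violations.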
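A numeric variant might look like the sketch below. The codes 0 to 3 are an arbitrary choice for illustration, and the query assumes the question's schema, person(id, name, DOB) and violations(violation_id, person_id). One side benefit is that order by then sorts the groups by age rather than alphabetically.

```sql
-- Assumed schema: person(id, name, DOB), violations(violation_id, person_id).
-- Category codes (illustrative): 0 = under 18, 1 = 18-40, 2 = 41-65, 3 = 66+.
select (case when p.age < 18 then 0
             when p.age <= 40 then 1
             when p.age <= 65 then 2
             else 3
        end) as age_category,
       count(*) as violation_count
from violations v join
     (select id, extract(year from age(cast(DOB as date))) as age
      from person
     ) p
     on p.id = v.person_id
group by age_category
order by age_category;
```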

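If you want the proportion of all violations that falls in each group (which is not an "average"), then you can just use window functions: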

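select (case when age < 18 then '< 18'
             when age <= 40 then 'between 18 and 40'
             when age <= 65 then 'between 41 and 65'
             else '> 65'
        end) as age_category,
       count(*) as violation_count,
       -- share of all violations; the window sum runs over the grouped rows
       count(*) * 1.0 / sum(count(*)) over () as proportion
from violations v join
     -- person has no age column, so derive it from DOB as in the question
     (select id, extract(year from age(cast(DOB as date))) as age
      from person
     ) p
     on p.id = v.person_id
group by age_category ;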
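If "average" is instead meant as the mean number of violations per person within each age group, one possible approach is to count per person first and then average those counts. This is only a sketch under that assumption. It also assumes the question's schema, person(id, name, DOB) and violations(violation_id, person_id), and that people with no violations should count as zero. The alias pv is introduced here for illustration.

```sql
-- Step 1 (inner query): one row per person with their age and violation count.
-- Step 2 (outer query): average those counts within each age category.
select (case when pv.age < 18 then '< 18'
             when pv.age <= 40 then 'between 18 and 40'
             when pv.age <= 65 then 'between 41 and 65'
             else '> 65'
        end) as age_category,
       avg(pv.violation_count) as avg_violation_count
from (select p.id,
             extract(year from age(cast(p.DOB as date))) as age,
             -- count() skips the NULLs produced by the left join,
             -- so people without violations get 0
             count(v.violation_id) as violation_count
      from person p left join
           violations v
           on v.person_id = p.id
      group by p.id, p.DOB
     ) pv
group by age_category ;
```

Replacing the left join with an inner join would average only over people who have at least one violation, which gives a different (higher) number. Which of the two is wanted depends on how the average is defined.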
