How to calculate a cumulative 90th percentile by period in Oracle SQL

Question

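Task: I want to calculate the cumulative 90th percentile wait time (in days) to determine how long patients waited for an ultrasound scan, by facility, for each period and fiscal year (including a total across all facilities).

How: I am using an over(partition by) expression to get the cumulative 90th percentile in Oracle SQL.

ISSUE: I receive error ORA-00979 ("not a GROUP BY expression") for wait_time_days, pointing at the expression for the cumulative 90th percentile.

Here is a simplified version of the query: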

    define period_start = to_date ('20170401', 'yyyy-mm-dd')
    define period_end = to_date ('20180331', 'yyyy-mm-dd')   

            Select 
            to_char(add_months(c.PROCEDURE_PERFORMED_DT, -3),'YYYY') ||'/'|| to_char(add_months(c.PROCEDURE_PERFORMED_DT, 9),'YY') as Fiscal_year

            , nvl (CASE WHEN c.FACILITY_ID = 1 then 'OKANAGAN CLINIC'
                    WHEN c.FACILITY_ID = 2 THEN 'MELBOURNE CLINIC'
                    WHEN c.FACILITY_ID = 3 THEN 'VANCOUVER CLINIC'
                    WHEN c.FACILITY_ID = 4 THEN 'SHWARZ CLINIC'
                    WHEN c.FACILITY_ID = 5 THEN 'LADYSMITH CLINIC'
                    END, 'TOTAL') as FACILITY,
            , c.PROCEDURE_PERFORMED_DT
            , c.BOOKING_DT
            , round((c.PROCEDURE_PERFORMED_DT - c.BOOKING_FORM_RECEIVED_DT),2) as wait_time
            , p.period

        , round(percentile_cont(0.9) within group (order by wait_time asc),2) as Percentile_90th

        , percentile_cont(0.9) within group (order by wait_time asc)
     over(partition by CASE WHEN c.FACILITY_ID = 1 then 'OKANAGAN CLINIC'
                    WHEN c.FACILITY_ID = 2 THEN 'MELBOURNE CLINIC'
                    WHEN c.FACILITY_ID = 3 THEN 'VANCOUVER CLINIC'
                    WHEN c.FACILITY_ID = 4 THEN 'SHWARZ CLINIC'
                    WHEN c.FACILITY_ID = 5 THEN 'LADYSMITH CLINIC'
                    END) as Cumulative_Perc90th

    FROM COMPLETED_PROCEDURE c, PERIOD p   

     WHERE c.PROCEDURE_PERFORMED_DT between &period_start and &period_end
                   and c.PROCEDURE_PERFORMED_DT = p.END_DATE

                Group by 
                Fiscal_year
                , Rollup(CASE WHEN c.FACILITY_ID = 1 then 'OKANAGAN CLINIC'
                    WHEN c.FACILITY_ID = 2 THEN 'MELBOURNE CLINIC'
                    WHEN c.FACILITY_ID = 3 THEN 'VANCOUVER CLINIC'
                    WHEN c.FACILITY_ID = 4 THEN 'SHWARZ CLINIC'
                    WHEN c.FACILITY_ID = 5 THEN 'LADYSMITH CLINIC'
                    END)
                , period

Order By Facility, period
;

The PERIOD table (p) contains a list of period dates, as follows:

    PERIOD  START_DATE  END_DATE
    01      2017-04-01  2017-04-20
    02      2017-04-21  2017-05-18
    03      2017-05-19  2017-06-15
    04      2017-06-16  2017-07-13

Thank you for your help!

P.S. I am using the same approach to calculate a cumulative volume (by summing the counts of all scans), and it works fine.

Tags: oracle, percentile

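Answer

I'm not quite sure what result you want from this query, but in general ORA-00979 appears here because you are trying to use the analytic version of PERCENTILE_CONT, the one with an OVER(...) clause, inside an aggregate GROUP BY query.

I'll try to show the problem with a simplified example. I hope it helps you understand what the error is and how this function should be used.

Suppose we have the following table, containing 3 clinics with 10 records each: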

CREATE TABLE example(
  clinic varchar2(100),
  wait_days_period int
);

INSERT ALL
INTO example VALUES('MELBOURNE CLINIC', x )
INTO example VALUES('VANCOUVER CLINIC', 10 + x )
INTO example VALUES('SHWARZ CLINIC', 10 * x )
SELECT level as X FROM dual
CONNECT BY LEVEL <= 10;

The aggregate version of the query looks like this. It uses the PERCENTILE_CONT function without an OVER clause, in a GROUP BY query:

SELECT clinic,
       PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY wait_days_period) as percen
FROM example
GROUP BY clinic;

|           CLINIC | PERCEN |
|------------------|--------|
| MELBOURNE CLINIC |    9.1 |
|    SHWARZ CLINIC |     91 |
| VANCOUVER CLINIC |   19.1 |
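
For reference, the 9.1 for MELBOURNE CLINIC follows from the linear interpolation that Oracle documents for PERCENTILE_CONT. With N = 10 rows and P = 0.9, the target row number is RN = 1 + P * (N - 1) = 9.1, which falls between the 9th value (9) and the 10th value (10). A small sketch that reproduces the arithmetic by hand (the alias manual_p90 is made up for illustration):

```sql
-- Manual interpolation for MELBOURNE CLINIC (values 1..10, P = 0.9):
--   RN  = 1 + 0.9 * (10 - 1) = 9.1
--   FRN = floor(RN) = 9, CRN = ceil(RN) = 10
--   result = (CRN - RN) * value_at_FRN + (RN - FRN) * value_at_CRN
SELECT (10 - 9.1) * 9 + (9.1 - 9) * 10 AS manual_p90  -- returns 9.1
FROM dual;
```

The same formula gives 91 for SHWARZ CLINIC (between 90 and 100) and 19.1 for VANCOUVER CLINIC (between 19 and 20).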

The analytic version of the query looks like this. It uses the PERCENTILE_CONT function with an OVER clause, in a simple (non-GROUP BY) query:

SELECT clinic, wait_days_period,
       PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY wait_days_period) 
       OVER (PARTITION BY clinic) as percen
FROM example  ORDER BY 1,2;

|           CLINIC | WAIT_DAYS_PERIOD | PERCEN |
|------------------|------------------|--------|
| MELBOURNE CLINIC |                1 |    9.1 |
| MELBOURNE CLINIC |                2 |    9.1 |
| MELBOURNE CLINIC |                3 |    9.1 |
| MELBOURNE CLINIC |                4 |    9.1 |
| MELBOURNE CLINIC |                5 |    9.1 |
| MELBOURNE CLINIC |                6 |    9.1 |
| MELBOURNE CLINIC |                7 |    9.1 |
| MELBOURNE CLINIC |                8 |    9.1 |
| MELBOURNE CLINIC |                9 |    9.1 |
| MELBOURNE CLINIC |               10 |    9.1 |
|    SHWARZ CLINIC |               10 |     91 |
|    SHWARZ CLINIC |               20 |     91 |
|    SHWARZ CLINIC |               30 |     91 |
|    SHWARZ CLINIC |               40 |     91 |
|    SHWARZ CLINIC |               50 |     91 |
|    SHWARZ CLINIC |               60 |     91 |
|    SHWARZ CLINIC |               70 |     91 |
|    SHWARZ CLINIC |               80 |     91 |
|    SHWARZ CLINIC |               90 |     91 |
|    SHWARZ CLINIC |              100 |     91 |
| VANCOUVER CLINIC |               11 |   19.1 |
| VANCOUVER CLINIC |               12 |   19.1 |
| VANCOUVER CLINIC |               13 |   19.1 |
| VANCOUVER CLINIC |               14 |   19.1 |
| VANCOUVER CLINIC |               15 |   19.1 |
| VANCOUVER CLINIC |               16 |   19.1 |
| VANCOUVER CLINIC |               17 |   19.1 |
| VANCOUVER CLINIC |               18 |   19.1 |
| VANCOUVER CLINIC |               19 |   19.1 |
| VANCOUVER CLINIC |               20 |   19.1 |

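In short, the analytic version computes exactly the same values as the aggregate version (in a GROUP BY query), but makes them available on every record of a plain query, without the need for an extra GROUP BY subquery.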
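
To make that equivalence concrete, here is a sketch of what the analytic query would otherwise have to be written as: the aggregate query becomes a subquery that is joined back to the detail rows. The table aliases e and g are illustrative only.

```sql
-- Same output as the analytic query, built from the aggregate form:
-- compute one percentile per clinic, then attach it to every detail row.
SELECT e.clinic,
       e.wait_days_period,
       g.percen
FROM example e
JOIN (
       SELECT clinic,
              PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY wait_days_period) AS percen
       FROM example
       GROUP BY clinic
     ) g
  ON g.clinic = e.clinic
ORDER BY 1, 2;
```

The OVER (PARTITION BY clinic) form avoids this join and the second pass over the table in the query text.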


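If you try to use the analytic version of the function (with the OVER ... clause) in a GROUP BY query, you get ORA-00979: not a GROUP BY expression. Analytic functions are evaluated after grouping, so they may reference only GROUP BY expressions and aggregates, and wait_days_period is neither: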

SELECT clinic,
       PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY wait_days_period) 
       OVER (PARTITION BY clinic) as percen
FROM example
GROUP BY clinic;

ORA-00979: not a GROUP BY expression
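
Two possible ways around the error, sketched against the same example table. The first keeps the GROUP BY and uses the aggregate form, with ROLLUP adding an all-clinics row. The 'TOTAL' label and the aliases are illustrative choices, and the NVL assumes clinic itself is never NULL. The second shows that an analytic function is accepted in a GROUP BY query when it operates only on aggregates or grouped columns, which is presumably why your cumulative count of scans works.

```sql
-- Option 1: aggregate PERCENTILE_CONT with ROLLUP.
-- The rollup row has clinic = NULL and is relabelled as 'TOTAL'.
SELECT NVL(clinic, 'TOTAL') AS clinic,
       PERCENTILE_CONT(0.9) WITHIN GROUP (ORDER BY wait_days_period) AS percen
FROM example
GROUP BY ROLLUP(clinic);

-- Option 2: an analytic SUM over the aggregate COUNT(*) is valid here,
-- because COUNT(*) is already computed per group before the OVER clause runs.
SELECT clinic,
       COUNT(*) AS scans,
       SUM(COUNT(*)) OVER (ORDER BY clinic) AS cumulative_scans
FROM example
GROUP BY clinic;
```

Neither option gives a percentile accumulated across periods. As far as I know, the analytic PERCENTILE_CONT accepts only PARTITION BY in its OVER clause (no ORDER BY or window frame), so a running percentile would need a different approach, such as joining each period to all earlier periods before aggregating.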
