BigQuery approx_quantiles WHERE: using a different WHERE condition for each function

Problem description

We have a table and are trying to compute quantiles for several of its columns:

with t1 as (
    select 'a' as category, 0.25 as stat1, 2 as stat1ct, 0.82 as stat2, 3 as stat2ct union all
    select 'a' as category, 0.35 as stat1, 4 as stat1ct, 0.68 as stat2, 5 as stat2ct union all
    select 'a' as category, 0.45 as stat1, 3 as stat1ct, 0.74 as stat2, 4 as stat2ct union all
    select 'a' as category, 0.28 as stat1, 0 as stat1ct, 0.72 as stat2, 0 as stat2ct union all
    select 'a' as category, 0.36 as stat1, 0 as stat1ct, 0.65 as stat2, 4 as stat2ct union all
    select 'a' as category, 0.63 as stat1, 1 as stat1ct, 0.53 as stat2, 3 as stat2ct union all
    select 'a' as category, 0.18 as stat1, 5 as stat1ct, 0.52 as stat2, 1 as stat2ct union all
    select 'a' as category, 0.43 as stat1, 3 as stat1ct, 0.57 as stat2, 2 as stat2ct
)

select
    approx_quantiles(stat1, 100) as atr2FgPct
    ,approx_quantiles(stat2, 100) as paint2FgPct
from t1

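This works well. However, we would like to change it so that each column is filtered by its own WHERE-style condition on another column. We are looking for something like this: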

select
    approx_quantiles(stat1 where stat1ct > 2, 100) as atr2FgPct
    ,approx_quantiles(stat2 where stat2ct > 2, 100) as paint2FgPct
from t1

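...where the stat1 quantiles are based only on the stat1 values whose stat1ct is greater than 2. If stat1ct is 2 or less, the stat1 value on that row should not count toward the quantiles. Can this be done in BigQuery?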

Tags: google-bigquery

Solution


Consider the following approach:

select
   approx_quantiles(if(stat1ct > 2, stat1, null), 100) as atr2FgPct
  ,approx_quantiles(if(stat2ct > 2, stat2, null), 100) as paint2FgPct
from t1        
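
As a sanity check, the approach can be run against the sample rows from the question with a COUNTIF next to each aggregate, to confirm how many rows satisfy each condition. This is only a minimal sketch: the output column names (stat1_quartiles, stat1_rows_used, and so on) and the choice of 4 buckets instead of 100 are assumptions made here for readability, not part of the original answer.

```sql
-- Sample rows copied from the question.
with t1 as (
    select 'a' as category, 0.25 as stat1, 2 as stat1ct, 0.82 as stat2, 3 as stat2ct union all
    select 'a' as category, 0.35 as stat1, 4 as stat1ct, 0.68 as stat2, 5 as stat2ct union all
    select 'a' as category, 0.45 as stat1, 3 as stat1ct, 0.74 as stat2, 4 as stat2ct union all
    select 'a' as category, 0.28 as stat1, 0 as stat1ct, 0.72 as stat2, 0 as stat2ct union all
    select 'a' as category, 0.36 as stat1, 0 as stat1ct, 0.65 as stat2, 4 as stat2ct union all
    select 'a' as category, 0.63 as stat1, 1 as stat1ct, 0.53 as stat2, 3 as stat2ct union all
    select 'a' as category, 0.18 as stat1, 5 as stat1ct, 0.52 as stat2, 1 as stat2ct union all
    select 'a' as category, 0.43 as stat1, 3 as stat1ct, 0.57 as stat2, 2 as stat2ct
)

select
    -- if(...) turns every row that fails the condition into NULL,
    -- so each aggregate sees its own filtered set of values.
    approx_quantiles(if(stat1ct > 2, stat1, null), 4) as stat1_quartiles
    -- Hypothetical helper column: number of rows that pass the stat1 condition.
    ,countif(stat1ct > 2) as stat1_rows_used
    ,approx_quantiles(if(stat2ct > 2, stat2, null), 4) as stat2_quartiles
    -- Hypothetical helper column: number of rows that pass the stat2 condition.
    ,countif(stat2ct > 2) as stat2_rows_used
from t1
```

With this sample data, stat1_rows_used is 4 and stat2_rows_used is 5, which shows that the two aggregates are filtered independently within a single pass over t1.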

Note: APPROX_QUANTILES supports IGNORE NULLS and RESPECT NULLS.

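If IGNORE NULLS is specified, NULL values are excluded from the result. If RESPECT NULLS is specified, NULL values are included in the result. If neither is specified, NULL values are excluded from the result. An error is raised if an array in the final query result contains a NULL element.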
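
Because the filter relies on NULLs being dropped, it may be worth spelling out IGNORE NULLS so the intent is visible in the query itself. The sketch below assumes a four-row subset of the question's data, and the column names stat1_percentiles and stat1_median_approx are made up for illustration. It also shows one way to pick a single quantile out of the returned array with OFFSET.

```sql
-- A four-row subset of the question's sample data (stat1 columns only).
with t1 as (
    select 0.25 as stat1, 2 as stat1ct union all
    select 0.35 as stat1, 4 as stat1ct union all
    select 0.45 as stat1, 3 as stat1ct union all
    select 0.18 as stat1, 5 as stat1ct
)

select
    -- Same result as the default behaviour, but the NULL handling is explicit.
    approx_quantiles(if(stat1ct > 2, stat1, null), 100 ignore nulls) as stat1_percentiles
    -- With 100 buckets the array has 101 elements; offset 50 is the approximate median.
    ,approx_quantiles(if(stat1ct > 2, stat1, null), 100 ignore nulls)[offset(50)] as stat1_median_approx
from t1
```

Switching to RESPECT NULLS here would let the NULLs produced by if(...) into the aggregate, which defeats the purpose of the filter and can trigger the NULL-element error described above.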

