首页 > 解决方案 > 基于条件具有返回值

问题描述

我正在尝试根据每个 uid 至少出现一次并带有 2 个不同的类别变量从 uid 列返回值:

+--------+--------+---------+
|  uid   |  type  | period  |
+--------+--------+---------+
| abc123 | event1 | control |
| abc123 | event1 | test    |
| def456 | event1 | control |
| def456 | event1 | control |
+--------+--------+---------+

在这种情况下,事件abc1231 的计数将返回 2,因为 uid 出现在测试期间和控制期间,def456不会返回计数,因为它只出现在一个期间,给出中间表:

+--------+-----------+
|  uid   | typecount |
+--------+-----------+
| abc123 |         2 |
+--------+-----------+

到目前为止,这是我的代码:

with cb as(
  select uid, count(type) as cbuffercount, period
    from `AJG.ABV_buff_wperiods`
    where type="bufferStart" and seq>12 and not uid="null" and not uid="" and period="control"
    group by uid, period
    having count(uid)>1),
tb as(
  select uid, count(type) as tbuffercount, period
    from `AJG.ABV_buff_wperiods`
    where type="bufferStart" and seq>12 and not uid="null" and not uid="" and period="test"
    group by uid, period
    having count(uid)>1),
ci as(
  select uid, count(instance) as cinstancecount, period
    from `AJG.ABV_buff_wperiods`
    where seq>12 and not uid="null" and not uid="" and period="control"
    group by uid, period
    having count(uid)>1),
ti as(
    select uid, count(instance) as tinstancecount, period
    from `AJG.ABV_buff_wperiods`
    where seq>12 and not uid="null" and not uid="" and period="test"
    group by uid, period
    having count(uid)>1)
select uid, cb.cbuffercount, tb.tbuffercount, ci.cinstancecount, ti.tinstancecount,
cb.cbuffercount-tb.tbuffercount as absbufferddx, (cb.cbuffercount/ci.cinstancecount)-(tb.tbuffercount/tb.tinstancecount) as proportionalbufferddx
from
  cb join tb
  using(uid)
where
  cb.uid=tb.uid
order by absbufferddx desc

我还有一个额外的问题,with当我尝试从中选择变量时,Bigquery 无法识别我在子句中定义的最后两个表(例如ci.cinstancecount)。我运行了一个查询,包括cb并且tb很好。不知道为什么添加 2 个额外的表会破坏它?

标签: sqlgoogle-bigquery

解决方案


这是你想要的吗?

select uid, count(distinct period)
from t
group by uid
having count(distinct period) >= 2;

如果您想同时计算eventperiod,那么我将推荐字符串处理。BigQuery 不支持count(distinct)数组或结构,因此您不妨这样做:

select uid, count(distinct concat(event, '|', period))
from t
group by uid
having count(distinct concat(event, '|', period)) >= 2;

推荐阅读