Using Count Distinct and Pivot in Snowflake

Problem description

I am trying to pivot on the Join_Mon column and get an aggregated count for each ID, as in the following query:

select *
from CTE3
pivot(COUNT(DISTINCT platform_payer_name) for Join_Mon in (
    '2021-03-01',
    '2021-02-01',
    '2021-01-01',
    '2020-12-01'
))
as p
order by ID
)

As you can see, I am trying to get the distinct count of the platform_payer_name column for each month. But it gives the following error:

SQL compilation error: syntax error line 48 at position 16 unexpected 'DISTINCT'

I am pretty sure that COUNT(DISTINCT ...) works in Snowflake. Could I get some help with why it fails here? Any help is appreciated.

Tags: pivot, snowflake-cloud-data-platform

Solution


So, making some fake data that maps to your pivot (though I dropped the excess closing paren) and using a plain COUNT:

with cte3(id, platform_payer_name, Join_Mon) as (
    select * from values
        (1,'aa', '2021-03-01'),
        (1,'aa', '2021-03-01'),
        (1,'aa', '2021-03-01'),
        (1,'aa', '2021-02-01'),
        (2,'bb', '2012-03-01'),
        (2,'cc', '2020-12-01')
)
select *
from CTE3 AS c
pivot(COUNT(c.platform_payer_name) for c.Join_Mon in (
         '2021-03-01',
         '2021-02-01',
         '2021-01-01',
         '2020-12-01' )
) as p
order by id;

gives:

ID  '2021-03-01'    '2021-02-01'    '2021-01-01'    '2020-12-01'
1   3               1               0               0
2   0               0               0               1

So it makes sense that you want DISTINCT in there, but PIVOT does not appear to support it.

So, while it is somewhat prone to cut-and-paste errors, this does "work":

with cte3(id, platform_payer_name, Join_Mon) as (
    select * from values
        (1,'aa', '2021-03-01'),
        (1,'aa', '2021-03-01'),
        (1,'aa', '2021-03-01'),
        (1,'aa', '2021-02-01'),
        (2,'bb', '2012-03-01'),
        (2,'cc', '2020-12-01')
)
select id
    ,count(distinct(iff(Join_Mon='2021-03-01',platform_payer_name,null))) as "2021-03-01"
    ,count(distinct(iff(Join_Mon='2021-02-01',platform_payer_name,null))) as "2021-02-01"
    ,count(distinct(iff(Join_Mon='2021-01-01',platform_payer_name,null))) as "2021-01-01"
    ,count(distinct(iff(Join_Mon='2020-12-01',platform_payer_name,null))) as "2020-12-01"
from CTE3 AS c
group by 1 order by 1;

gives:

ID  2021-03-01  2021-02-01  2021-01-01  2020-12-01
1   1           1           0           0
2   0           0           0           1

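This works because PIVOT performs two tasks. The first is to move the value into a column when the row matches one of the listed inputs, which is the same as: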

with cte3(id, platform_payer_name, Join_Mon) as (
    select * from values
        (1,'aa', '2021-03-01'),
        (1,'aa', '2021-03-01'),
        (1,'aa', '2021-03-01'),
        (1,'aa', '2021-02-01'),
        (2,'bb', '2012-03-01'),
        (2,'cc', '2020-12-01')
)
select id
    ,iff(Join_Mon='2021-03-01',platform_payer_name,null) as "2021-03-01"
    ,iff(Join_Mon='2021-02-01',platform_payer_name,null) as "2021-02-01"
    ,iff(Join_Mon='2021-01-01',platform_payer_name,null) as "2021-01-01"
    ,iff(Join_Mon='2020-12-01',platform_payer_name,null) as "2020-12-01"
from CTE3 AS c
order by 1;

This gives:

ID, 2021-03-01, 2021-02-01, 2021-01-01, 2020-12-01
1,  aa,         NULL,       NULL,       NULL
1,  aa,         NULL,       NULL,       NULL
1,  aa,         NULL,       NULL,       NULL
1,  NULL,       aa,         NULL,       NULL
2,  NULL,       NULL,       NULL,       NULL
2,  NULL,       NULL,       NULL,       cc

The second task is the aggregation, so count(distinct x) can then be applied to each column, grouped by id:

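with cte3(id, platform_payer_name, Join_Mon) as (
    select * from values
        (1,'aa', '2021-03-01'),
        (1,'aa', '2021-03-01'),
        (1,'aa', '2021-03-01'),
        (1,'aa', '2021-02-01'),
        (2,'bb', '2012-03-01'),
        (2,'cc', '2020-12-01')
)
select id
    -- step 2: aggregate each spread-out column per id
    ,count(distinct("2021-03-01")) as "2021-03-01"
    ,count(distinct("2021-02-01")) as "2021-02-01"
    ,count(distinct("2021-01-01")) as "2021-01-01"
    ,count(distinct("2020-12-01")) as "2020-12-01"
from (
    -- step 1: move each value into the column for its month
    select id
        ,iff(Join_Mon='2021-03-01',platform_payer_name,null) as "2021-03-01"
        ,iff(Join_Mon='2021-02-01',platform_payer_name,null) as "2021-02-01"
        ,iff(Join_Mon='2021-01-01',platform_payer_name,null) as "2021-01-01"
        ,iff(Join_Mon='2020-12-01',platform_payer_name,null) as "2020-12-01"
    from CTE3 AS c
)
group by id
order by id;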

Or it can be done inline in a single step, as in the count(distinct(iff(...))) query shown earlier.
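
Another way around the restriction, offered here only as a sketch and not as part of the original answer, is to de-duplicate the rows before pivoting. A plain COUNT inside PIVOT then sees each (id, platform_payer_name, Join_Mon) combination only once. The CTE name deduped is made up for illustration, and the data is the same fake data as above. This assumes the pivoted input holds only these three columns, since PIVOT groups by every column that is not aggregated or pivoted.

```sql
with cte3(id, platform_payer_name, Join_Mon) as (
    select * from values
        (1,'aa', '2021-03-01'),
        (1,'aa', '2021-03-01'),
        (1,'aa', '2021-03-01'),
        (1,'aa', '2021-02-01'),
        (2,'bb', '2012-03-01'),
        (2,'cc', '2020-12-01')
), deduped as (
    -- hypothetical helper CTE: keep one row per (id, payer, month)
    select distinct id, platform_payer_name, Join_Mon
    from cte3
)
select *
from deduped
-- plain COUNT is enough here because duplicates are already removed
pivot(COUNT(platform_payer_name) for Join_Mon in (
         '2021-03-01',
         '2021-02-01',
         '2021-01-01',
         '2020-12-01' )
) as p
order by id;
```

With the sample data this should produce the same counts as the count(distinct(iff(...))) query. The month list is written only once, but there is an extra de-duplication pass over the data.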

