Multiple ARRAY_AGG in BigQuery

Problem description

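I have a table like the one defined by the sample CTE below.

I want to group the information into arrays based on the first two columns (id and valuation). This is what I did: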

WITH sample as (
SELECT 1023 as id, 10 as valuation, 'tlv' as origin, DATE '2021-01-01' as date_lead, 'chop' as type,
'c1023' as id_cus, 'julian' as name, '12345' as phone, 'julian@gmail.com' as mail
UNION ALL
SELECT 1023 as id, 10 as valuation, 'tlv' as origin, DATE '2021-01-01' as date_lead, 'ext' as type,
'c1023' as id_cus, 'julian' as name, '12345' as phone, 'julian@gmail.com' as mail
UNION ALL
SELECT 1021 as id, 10 as valuation, 'inegi' as origin, DATE '2021-01-01' as date_lead, 'ext' as type,
'in-2020' as id_cus, 'lucian' as name, '12345' as phone, 'lucian@gmail.com' as mail
UNION ALL
SELECT 1021 as id, 10 as valuation, 'inegi' as origin, DATE '2021-01-01' as date_lead, 'ext' as type,
'in-2020' as id_cus, 'lucian' as name, '12345' as phone, 'lucian@gmail.com' as mail
UNION ALL
SELECT 1021 as id, 10 as valuation, 'tlv' as origin, DATE '2021-01-01' as date_lead, 'int' as type,
'c1021' as id_cus, 'lucian' as name, '12345' as phone, 'lucian@gmail.com' as mail
UNION ALL
SELECT 1021 as id, 10 as valuation, 'tlv' as origin, DATE '2021-01-01' as date_lead, 'int' as type,
'c1021' as id_cus, 'lucas' as name, '202342' as phone, 'lucas@gmail.com' as mail
UNION ALL
SELECT 1040 as id, 10 as valuation, 'tlv' as origin, DATE '2021-01-01' as date_lead, 'type' as type,
'c1040' as id_cus, 'julieta' as name, '202112' as phone, 'julieta@gmail.com' as mail
UNION ALL
SELECT 1040 as id, 10 as valuation, 'tlv' as origin, DATE '2021-01-01' as date_lead, 'chop' as type,
'c1040' as id_cus, 'julieta' as name, '202112' as phone, 'julieta@gmail.com' as mail
UNION ALL
SELECT 1040 as id, 10 as valuation, 'tlv' as origin, DATE '2021-01-01' as date_lead, 'rad' as type,
'c1040' as id_cus, 'julieta' as name, '202112' as phone, 'julieta@gmail.com' as mail
UNION ALL
SELECT 1040 as id, 10 as valuation, 'tlv' as origin, DATE '2021-01-01' as date_lead, 'uls' as type,
'c1040' as id_cus, 'julieta' as name, '123123' as phone, 'julieta@gmail.com' as mail
)
SELECT id, valuation, ARRAY_AGG(STRUCT(origin, date_lead, type, id_cus, name, phone, mail)) as lead
FROM sample
GROUP BY id, valuation

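The problem is that the last three columns (name, phone, and mail) contain many repeated values. I would like to group those as well, so that each distinct combination of name, phone, and mail appears once with the remaining fields grouped under it, but I don't know how. I noticed that I cannot put an additional ARRAY_AGG inside the first one.

Is there a way to do something like this? How would you do it?

Thank you.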

Tags: google-bigquery

Solution


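I would wrap another SELECT ... GROUP BY around the original query. The inner query builds one lead array per (id, valuation, name, phone, mail), and the outer query then collects those rows into an array per (id, valuation). For example: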

WITH sample as (
SELECT 1023 as id, 10 as valuation, 'tlv' as origin, DATE '2021-01-01' as date_lead, 'chop' as type,
'c1023' as id_cus, 'julian' as name, '12345' as phone, 'julian@gmail.com' as mail
UNION ALL
SELECT 1023 as id, 10 as valuation, 'tlv' as origin, DATE '2021-01-01' as date_lead, 'ext' as type,
'c1023' as id_cus, 'julian' as name, '12345' as phone, 'julian@gmail.com' as mail
UNION ALL
SELECT 1021 as id, 10 as valuation, 'inegi' as origin, DATE '2021-01-01' as date_lead, 'ext' as type,
'in-2020' as id_cus, 'lucian' as name, '12345' as phone, 'lucian@gmail.com' as mail
UNION ALL
SELECT 1021 as id, 10 as valuation, 'inegi' as origin, DATE '2021-01-01' as date_lead, 'ext' as type,
'in-2020' as id_cus, 'lucian' as name, '12345' as phone, 'lucian@gmail.com' as mail
UNION ALL
SELECT 1021 as id, 10 as valuation, 'tlv' as origin, DATE '2021-01-01' as date_lead, 'int' as type,
'c1021' as id_cus, 'lucian' as name, '12345' as phone, 'lucian@gmail.com' as mail
UNION ALL
SELECT 1021 as id, 10 as valuation, 'tlv' as origin, DATE '2021-01-01' as date_lead, 'int' as type,
'c1021' as id_cus, 'lucas' as name, '202342' as phone, 'lucas@gmail.com' as mail
UNION ALL
SELECT 1040 as id, 10 as valuation, 'tlv' as origin, DATE '2021-01-01' as date_lead, 'type' as type,
'c1040' as id_cus, 'julieta' as name, '202112' as phone, 'julieta@gmail.com' as mail
UNION ALL
SELECT 1040 as id, 10 as valuation, 'tlv' as origin, DATE '2021-01-01' as date_lead, 'chop' as type,
'c1040' as id_cus, 'julieta' as name, '202112' as phone, 'julieta@gmail.com' as mail
UNION ALL
SELECT 1040 as id, 10 as valuation, 'tlv' as origin, DATE '2021-01-01' as date_lead, 'rad' as type,
'c1040' as id_cus, 'julieta' as name, '202112' as phone, 'julieta@gmail.com' as mail
UNION ALL
SELECT 1040 as id, 10 as valuation, 'tlv' as origin, DATE '2021-01-01' as date_lead, 'uls' as type,
'c1040' as id_cus, 'julieta' as name, '123123' as phone, 'julieta@gmail.com' as mail
)
-- Outer level: one row per (id, valuation), holding an array of contacts.
SELECT id, valuation, ARRAY_AGG(STRUCT(lead, name, phone, mail)) FROM (
    -- Inner level: one row per distinct contact, with its leads nested in an array.
    SELECT id, valuation, name, phone, mail, ARRAY_AGG(STRUCT(origin, date_lead, type, id_cus)) as lead
    FROM sample
    GROUP BY id, valuation, name, phone, mail
)
GROUP BY id, valuation
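
One thing to watch for: the sample data contains two identical rows (id 1021, origin 'inegi'), and the nested query above keeps both of them inside the lead array. If that is not wanted, one possible variation is to de-duplicate the rows before aggregating. The sketch below assumes exact duplicate rows carry no meaning. It uses a trimmed-down, hypothetical three-row sample to stay short, and the contacts alias is only an illustrative name, not something from the original query.

```sql
-- Hypothetical, trimmed-down sample: two identical rows plus one different contact.
WITH sample AS (
  SELECT 1021 AS id, 10 AS valuation, 'inegi' AS origin, DATE '2021-01-01' AS date_lead, 'ext' AS type,
         'in-2020' AS id_cus, 'lucian' AS name, '12345' AS phone, 'lucian@gmail.com' AS mail
  UNION ALL
  SELECT 1021, 10, 'inegi', DATE '2021-01-01', 'ext', 'in-2020', 'lucian', '12345', 'lucian@gmail.com'
  UNION ALL
  SELECT 1021, 10, 'tlv', DATE '2021-01-01', 'int', 'c1021', 'lucas', '202342', 'lucas@gmail.com'
)
-- Outer level: one row per (id, valuation); contacts is an assumed alias.
SELECT id, valuation, ARRAY_AGG(STRUCT(name, phone, mail, lead)) AS contacts
FROM (
  -- Inner level: one row per distinct contact, with its leads nested in an array.
  SELECT id, valuation, name, phone, mail,
         ARRAY_AGG(STRUCT(origin, date_lead, type, id_cus)) AS lead
  FROM (SELECT DISTINCT * FROM sample)  -- drop exact duplicate rows first
  GROUP BY id, valuation, name, phone, mail
)
GROUP BY id, valuation
```

The de-duplication sits in a subquery because ARRAY_AGG(DISTINCT ...) is not available for STRUCT values. If repeated rows are meaningful in your data (for example, two separate leads that happen to look the same), leave the DISTINCT step out.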
