Grouping concatenated strings (rows) in BigQuery

Problem description

I am using Google BigQuery and I have a query that looks like this:

    SELECT
        prod.abc,
        uniqueid,
        variable2,
        cust.variable1,
        purch.variable2
    FROM mydata.order
    LEFT JOIN
        UNNEST(purchases) AS purch,
        UNNEST(codes_abs) AS cod, UNNEST(cod.try_products) AS prod

When I run it, it produces a table that looks like this:

    |prod.abc| uniqueid | variable2 | ...|
    |APP123  | customer1| value     | ...|
    |BLU155  | customer1| value     | ...|
    |TRI134  | customer1| value     | ...|
    |LO123   | customer2| value     | ...|
    |ZU9274  | customer2| value     | ...|
    |TO134   | customer3| value     | ...|

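What I want to do is concatenate the values in the column "prod.abc", grouped by "uniqueid" and separated by ",". I have found many solutions online, but because my query also unnests other variables, none of them seems to apply to my case. The values do not need to be sorted in any way. Basically, what I want to end up with is: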

    |prod.abc                  | uniqueid | variable2 | ...|
    |APP123, BLU155, TRI134    | customer1| value     | ...|
    |LO123, ZU9274             | customer2| value     | ...|
    |TO134                     | customer3| value     | ...|

A table like the following, which keeps the duplicate rows, would also be fine, since I can remove them afterwards:

    |prod.abc                  | uniqueid | variable2 | ...|
    |APP123, BLU155, TRI134    | customer1| value     | ...|
    |APP123, BLU155, TRI134    | customer1| value     | ...|
    |APP123, BLU155, TRI134    | customer1| value     | ...|
    |LO123, ZU9274             | customer2| value     | ...|
    |LO123, ZU9274             | customer2| value     | ...|
    |TO134                     | customer3| value     | ...|

Any help is much appreciated. Thanks!

Tags: sql, group-by, google-bigquery, concatenation, string-concatenation

Solution


Does aggregation work if you do each unnest as a separate join?

    SELECT STRING_AGG(prod.abc, ','),  -- one comma-separated list per group
           uniqueid, variable2, cust.variable1, purch.variable2
    FROM mydata.order LEFT JOIN
         UNNEST(purchases) AS purch
         ON true LEFT JOIN
         UNNEST(codes_abs) AS cod
         ON true LEFT JOIN
         UNNEST(cod.try_products) AS prod  -- field name as in the question's query
         ON true
    GROUP BY uniqueid, variable2, cust.variable1, purch.variable2;
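
To try the idea without access to the real table, here is a minimal self-contained sketch. The `orders` CTE and its nested layout are assumptions made up for illustration (the actual schema of `mydata.order` is not shown in the question), and the alias `abc_list` is likewise hypothetical. Only `uniqueid`, `codes_abs`, `try_products` and `abc` are taken from the question.

```sql
-- Hypothetical sample data: one row per customer, with products nested
-- two levels deep (codes_abs -> try_products -> abc). This layout is an
-- assumption; the real mydata.order schema may differ.
WITH orders AS (
  SELECT 'customer1' AS uniqueid,
         [STRUCT([STRUCT('APP123' AS abc), STRUCT('BLU155' AS abc)] AS try_products),
          STRUCT([STRUCT('TRI134' AS abc)] AS try_products)] AS codes_abs
  UNION ALL
  SELECT 'customer2',
         [STRUCT([STRUCT('LO123' AS abc), STRUCT('ZU9274' AS abc)] AS try_products)]
  UNION ALL
  SELECT 'customer3',
         [STRUCT([STRUCT('TO134' AS abc)] AS try_products)]
)
SELECT
  o.uniqueid,
  -- Collapse all product codes of a customer into one string.
  STRING_AGG(prod.abc, ', ') AS abc_list
FROM orders AS o
LEFT JOIN UNNEST(o.codes_abs) AS cod ON TRUE
LEFT JOIN UNNEST(cod.try_products) AS prod ON TRUE
GROUP BY o.uniqueid
ORDER BY o.uniqueid;
```

This should return one row per `uniqueid`. The order of the codes inside each list is not guaranteed unless an `ORDER BY` is added inside `STRING_AGG`. If unnesting a second array such as `purchases` multiplies the rows, `STRING_AGG(DISTINCT prod.abc, ', ')` may help avoid repeated codes. To keep one row per product, as in the second table in the question, `STRING_AGG` can instead be used as a window function, for example `STRING_AGG(prod.abc, ', ') OVER (PARTITION BY o.uniqueid)`, with the `GROUP BY` removed.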
