sql - Group by array values in Presto
Question
I have a movies table like this:
genre                  revenue
array<string>          int
------------------------------
[Drama]                10000
[Drama, Fiction]       2000
[Fiction]              300
[Comedy]               5000
[Comedy, Fiction]      500
How can I group the movies by each genre they contain and run aggregations, for example:
genre      count_movies   sum_revenue
-------------------------------------
Drama      2              12000
Fiction    3              2800
Comedy     2              5500
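I looked through the documentation on array functions and operators, but could not find anything along those lines.
The query producing this output might look something like this: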
SELECT
    COUNT(*) AS count_movies,
    SUM(revenue) AS sum_revenue
FROM movies
GROUP BY ARRAY_EXPLODE(genre) -- this is fake, just to illustrate the point
Answer
You can use UNNEST, which expands each array into one row per element:
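-- Expand each movie into one row per genre, then aggregate per genre.
SELECT t.genre,
       COUNT(*) AS count_movies,
       SUM(m.revenue) AS sum_revenue
FROM movies AS m
CROSS JOIN UNNEST(m.genre) AS t (genre) -- t.genre is a single array element
GROUP BY t.genre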
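
To try this without an existing table, here is a minimal self-contained sketch that builds the sample rows from the question inline with VALUES. The CTE name movies and the aliases m, v, and genre_name are only illustrative choices, not something from the original post.

```sql
-- Inline copy of the sample data from the question (illustrative only).
WITH movies AS (
    SELECT *
    FROM (
        VALUES
            (ARRAY['Drama'], 10000),
            (ARRAY['Drama', 'Fiction'], 2000),
            (ARRAY['Fiction'], 300),
            (ARRAY['Comedy'], 5000),
            (ARRAY['Comedy', 'Fiction'], 500)
    ) AS v (genre, revenue)
)
-- One output row per (movie, genre) pair, grouped by the individual genre.
SELECT t.genre_name AS genre,
       COUNT(*) AS count_movies,
       SUM(m.revenue) AS sum_revenue
FROM movies AS m
CROSS JOIN UNNEST(m.genre) AS t (genre_name)
GROUP BY t.genre_name
ORDER BY t.genre_name
```

A movie with several genres is counted once under each of them, so the per-genre revenues add up to more than the table's total revenue. Also note that CROSS JOIN UNNEST produces no rows for a movie whose genre array is empty or NULL, so such movies drop out of the result.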