Group by array values in Presto

Problem description

Given a movies table like this:

genre               revenue
array<string>           int
---------------------------
[Drama]               10000
[Drama, Fiction]       2000
[Fiction]               300
[Comedy]               5000
[Comedy, Fiction]       500

How can I group the movies by each genre they contain and run aggregations over each group, for example:

genre       count_movies    sum_revenue
---------------------------------------
Drama                  2          12000
Fiction                3           2800
Comedy                 2           5500

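I looked through the documentation for array functions and operators, but could not find anything along those lines.

The query producing this output might look something like this: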

SELECT
  COUNT(*) AS count_movies,
  SUM(revenue) AS sum_revenue
FROM movies
GROUP BY ARRAY_EXPLODE(genre) -- this is fake, just to illustrate the point

Tags: sql, group-by, presto

Solution


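You can use UNNEST. A CROSS JOIN UNNEST expands each array into one row per element, so a movie with two genres is counted once in each of its genre groups: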

-- Expand each genre array into one row per element, then aggregate per genre
SELECT t.genre,
  COUNT(*) AS count_movies,
  SUM(m.revenue) AS sum_revenue
FROM movies AS m
CROSS JOIN UNNEST(m.genre) AS t(genre)
GROUP BY t.genre
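
To try the approach without creating a table, the sample rows from the question can be inlined with VALUES. This is only a sketch: the CTE name movies and its column aliases mirror the question's table and are assumptions about the real schema, and the ORDER BY is added purely to make the output order stable.

```sql
-- Inline the sample data from the question (assumed to match the real schema)
WITH movies (genre, revenue) AS (
  VALUES
    (ARRAY['Drama'], 10000),
    (ARRAY['Drama', 'Fiction'], 2000),
    (ARRAY['Fiction'], 300),
    (ARRAY['Comedy'], 5000),
    (ARRAY['Comedy', 'Fiction'], 500)
)
-- One output row per (movie, genre) pair after UNNEST, then aggregate by genre
SELECT t.genre,
  COUNT(*) AS count_movies,
  SUM(m.revenue) AS sum_revenue
FROM movies AS m
CROSS JOIN UNNEST(m.genre) AS t(genre)
GROUP BY t.genre
ORDER BY t.genre
```

With this data the result should match the totals in the question: Comedy 2 / 5500, Drama 2 / 12000, Fiction 3 / 2800. Note that a movie whose genre array is empty or NULL produces no rows from the CROSS JOIN UNNEST and therefore does not appear in any group.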
