google-bigquery - 相对于 BigQuery 中的另一个数组列对一个数组列进行排序
问题描述
我在 Bigquery 中有下表 -
WITH results AS
(SELECT 1 as customerid, ["apples", "bananas", "grapes","orange"] as fruit_array, [0.1,0.4,0.3,0.2] as probability
UNION ALL
SELECT 2 as customerid, ["apples", "bananas", "grapes","orange"] as fruit_array, [0.2,0.1,0.6,0.1] as probability
UNION ALL
SELECT 3 as customerid, ["apples", "bananas", "grapes","orange"] as fruit_array, [0.5,0.05,0.35,0.1] as probability
)
select * from results
在这里,每个顾客都有一定的概率购买一个水果。我想top 2
为每个客户和他们相应probabilities
的购买买水果。
有类似这样的输出会很好 -
customerid, fruits, probability
1, bananas, 0.4
1, grapes, 0.3
..
在上面的最终结果中,因为customerid 1
我只拿起bananas
并且grapes
因为这两种水果的购买概率最高(来自[0.1,0.4,0.3,0.2]
)
有没有我可以在 BiqQuery 中使用的功能来实现这一点?
解决方案
以下是 BigQuery 标准 SQL
#standardSQL
WITH results AS (
SELECT 1 AS customerid, ["apples", "bananas", "grapes","orange"] AS fruit_array, [0.1,0.4,0.3,0.2] AS probability UNION ALL
SELECT 2 AS customerid, ["apples", "bananas", "grapes","orange"] AS fruit_array, [0.2,0.1,0.6,0.1] AS probability UNION ALL
SELECT 3 AS customerid, ["apples", "bananas", "grapes","orange"] AS fruit_array, [0.5,0.05,0.35,0.1] AS probability
)
SELECT customerid, fruit, probability
FROM (
SELECT customerid, ARRAY_AGG(STRUCT(fruit, probability) ORDER BY probability DESC LIMIT 2) top
FROM results,
UNNEST(probability) probability WITH OFFSET off1
JOIN UNNEST(fruit_array) fruit WITH OFFSET off2
ON off1 = off2
GROUP BY customerid
), UNNEST(top)
结果
Row customerid fruit probability
1 1 bananas 0.4
2 1 grapes 0.3
3 2 grapes 0.6
4 2 apples 0.2
5 3 apples 0.5
6 3 grapes 0.35
或者可能是更好的选择
#standardSQL
WITH results AS (
SELECT 1 AS customerid, ["apples", "bananas", "grapes","orange"] AS fruit_array, [0.1,0.4,0.3,0.2] AS probability UNION ALL
SELECT 2 AS customerid, ["apples", "bananas", "grapes","orange"] AS fruit_array, [0.2,0.1,0.6,0.1] AS probability UNION ALL
SELECT 3 AS customerid, ["apples", "bananas", "grapes","orange"] AS fruit_array, [0.5,0.05,0.35,0.1] AS probability
)
SELECT customerid, fruit, probability
FROM (
SELECT customerid,
(
SELECT ARRAY_AGG(STRUCT(fruit, probability) ORDER BY probability DESC LIMIT 2)
FROM UNNEST(probability) probability WITH OFFSET off1
JOIN UNNEST(fruit_array) fruit WITH OFFSET off2
ON off1 = off2
) top
FROM results
), UNNEST(top)
结果相同
推荐阅读
- python - “NoneType”对象没有“查找”属性。当试图用美丽的汤刮
- sql - 数据完整性错误查询修复以重写 sql
- c - 在可变参数函数中使用可变参数函数
- titanium - 如何在 appcelerator Titan 项目中禁用 Ti.UI.iOS.createDocumentViewer 中的上下文菜单?
- javascript - 如何在使用 JS 或 Jquery 悬停时向相应的 div 容器显示文本?
- java - 在我的 Macbook 上安装 PROM 时出现 Java 错误
- java - 将异常从匿名接口传递给方法签名
- python - 用 Python 计算
- android - 编译自定义 rom 时出错我尝试为我的设备编译 android 但没有任何效果
- python-3.x - python 2和python 3中的urlencode方法有区别吗