首页 > 解决方案 > 如何根据其他表中没有公共字段的信息进行计算?

问题描述

我对 GBQ 的世界还很陌生,我不确定如何最好地解释我的情况,但这里是我目前正在使用的 3 个表的示例:

在此处输入图像描述

我正在尝试在“product_type”表中添加一个新列,其中包含客户从“delivery_1”表中订购的所有产品的计数,并且不太确定如何执行此操作,因为没有任何公共字段。

这是我的结果的可视化:

在此处输入图像描述

以下是创建示例表的查询:

WITH customers_orders AS (
    SELECT '00001' customer_no, 'yes' product_a, 'no' product_b, 'yes' product_c UNION ALL
    SELECT '00002' customer_no, 'yes' product_a, 'yes' product_b, 'no' product_c UNION ALL
    SELECT '00003' customer_no, 'no' product_a, 'no' product_b, 'no' product_c UNION ALL
    SELECT '00004' customer_no, 'yes' product_a, 'yes' product_b, 'no' product_c UNION ALL
    SELECT '00005' customer_no, 'yes' product_a, 'yes' product_b, 'yes' product_c
    )

WITH product_type AS (
    SELECT 'product_a' product, 'export' type UNION ALL
    SELECT 'product_b' product, 'import' type UNION ALL
    SELECT 'product_c' product, 'import' type
    )

WITH delivery_1 AS (
    SELECT '00001' customer_no, 'delivery_1' delivery UNION ALL
    SELECT '00002' customer_no, 'delivery_1' delivery UNION ALL
    SELECT '00005' customer_no, 'delivery_1' delivery
    )

非常感谢任何提示或帮助!

标签: sqlgoogle-bigquery

解决方案


以下是 BigQuery 标准 SQL

#standardSQL
SELECT product, type, delivery_1_total_ordered
FROM `project.dataset.product_type`
LEFT JOIN (
  SELECT TRIM(SPLIT(kv, ':')[OFFSET(0)], '"') product,
    COUNT(1) delivery_1_total_ordered
  FROM `project.dataset.customers_orders`
  JOIN `project.dataset.delivery_1` 
  USING(customer_no)
  CROSS JOIN UNNEST(SPLIT(TRIM(TO_JSON_STRING(STRUCT(product_a, product_b, product_c)), '{}'))) kv

  WHERE SPLIT(kv, ':')[OFFSET(1)] = '"yes"'
  GROUP BY product
)
USING(product)   

如果适用于您的问题的样本数据 - 输出是

Row product         type        delivery_1_total_ordered     
1   product_a       export      3    
2   product_b       import      2    
3   product_c       import      2   

有没有办法选择所有产品类型而不是手动输入它们?

当然。见下文略有调整的查询

#standardSQL
SELECT product, type, delivery_1_total_ordered
FROM `project.dataset.product_type`
LEFT JOIN (
  SELECT TRIM(SPLIT(kv, ':')[OFFSET(0)], '"') product,
    COUNT(1) delivery_1_total_ordered
  FROM `project.dataset.customers_orders` t   /* added alias */
  JOIN `project.dataset.delivery_1` 
  USING(customer_no)
  CROSS JOIN UNNEST(SPLIT(TRIM(TO_JSON_STRING(t), '{}'))) kv   /* used alias instead of explicit list of products */
  WHERE SPLIT(kv, ':')[OFFSET(1)] = '"yes"'
  GROUP BY product
)
USING(product)

推荐阅读