How do I create multiple THEN clauses in a BigQuery Standard SQL CASE statement?

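Question

I'm using Standard SQL on BigQuery to create a new table based on certain conditions in an existing table. I have multiple WHEN clauses to support this, because I'm checking several different conditions. What I now want is multiple THEN clauses inside those WHEN clauses, because my goal is to add more than one column.

Specifically, I want to add the concatenation of two existing text fields as one field, and then an aggregated array of three existing fields as another field: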

CASE WHEN
    # all three match
    one_x1 = two_x1 = three_x1 THEN CONCAT( object1_name, ", ", object2_name, ", ", object3_name ) AND ARRAY_AGG(STRUCT(score_one, score_two, score_three))
    # one and two match
      WHEN one_x1 = two_x1 THEN CONCAT( object1_name, ", ", object2_name ) AND ARRAY_AGG(STRUCT(score_one, score_two))
    # one and three match
      WHEN one_x1 = three_x1 THEN CONCAT( object1_name, ", ", object3_name ) AND ARRAY_AGG(STRUCT(score_one, score_three))
    # two and three match
      WHEN two_x1 = three_x1 THEN CONCAT( object2_name, ", ", object3_name ) AND ARRAY_AGG(STRUCT(score_two, score_three))
   ELSE
    NULL
 END

The 'AND ARRAY_AGG(STRUCT(xxxxx))' part doesn't work, and I also tried separating the THEN clauses with commas.

Is repeating the same CASE statement, once per THEN clause, the only option?

Sample data: sample_data (first row of the sample data). Expected result: here.

Tags: sql, google-bigquery, conditional-statements, case-statement, bigquery-standard-sql

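Answer

The following is for BigQuery Standard SQL.

First, let's correct your initial query so that it actually produces the expected result. A CASE expression returns a single value, so each output column needs its own CASE, and the three-way comparison has to be written as two equalities joined with AND: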

#standardSQL
SELECT id, 
CASE 
    WHEN one_x1 = two_x1 AND one_x1 = three_x1 THEN CONCAT( object1_name, ", ", object2_name, ", ", object3_name )
    WHEN one_x1 = two_x1 THEN CONCAT( object1_name, ", ", object2_name )
    WHEN one_x1 = three_x1 THEN CONCAT( object1_name, ", ", object3_name )
    WHEN two_x1 = three_x1 THEN CONCAT( object2_name, ", ", object3_name )
    ELSE NULL
END AS field1,
CASE 
    WHEN one_x1 = two_x1 AND one_x1 = three_x1 THEN [score_one, score_two, score_three]
    WHEN one_x1 = two_x1 THEN [score_one, score_two]
    WHEN one_x1 = three_x1 THEN [score_one, score_three]
    WHEN two_x1 = three_x1 THEN [score_two, score_three]
    ELSE NULL
END AS field2
FROM `project.dataset.table`

When applied to the sample data from your question, the result is:

Row id  field1                  field2   
1   1   Dog, Animal             0.82     
                                0.72     
2   2   Horse, Animal, Bird     0.76     
                                0.73     
                                0.9  
3   3   Dog, Animal, Chicken    0.67     
                                0.75     
                                0.65     
4   4   Bird, Chicken           0.87     
                                0.86       

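Next, as I understand it, you want to avoid repeating the same conditions over and over in your CASE expressions. For that, you can use the following trick: return a STRUCT holding both values from a single CASE, then expand it into columns.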

#standardSQL

SELECT id, fields.* FROM (
  SELECT id, 
  CASE 
      WHEN one_x1 = two_x1 AND one_x1 = three_x1 THEN 
        STRUCT(CONCAT( object1_name, ", ", object2_name, ", ", object3_name) AS field1, [score_one, score_two, score_three] AS field2)
      WHEN one_x1 = two_x1 THEN 
        STRUCT(CONCAT( object1_name, ", ", object2_name ) AS field1, [score_one, score_two] AS field2)
      WHEN one_x1 = three_x1 THEN 
        STRUCT(CONCAT( object1_name, ", ", object3_name ) AS field1, [score_one, score_three] AS field2)
      WHEN two_x1 = three_x1 THEN 
        STRUCT(CONCAT( object2_name, ", ", object3_name ) AS field1, [score_two, score_three] AS field2)
      ELSE NULL
  END AS fields
  FROM `project.dataset.table`
)

Obviously, this gives the same output.

Finally, as yet another option for you, you can eliminate all of the CASE/WHEN/THEN with the approach below:
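
To try the STRUCT approach without access to the original table, a sketch along these lines may help. The rows in the WITH clause are made up for illustration (only the column names come from the question), and only two of the four branches are kept to keep it short.

```sql
#standardSQL
-- Hypothetical inline rows standing in for `project.dataset.table`.
WITH sample AS (
  SELECT 1 AS id, 'a' AS one_x1, 'a' AS two_x1, 'b' AS three_x1,
         'Dog' AS object1_name, 'Animal' AS object2_name, 'Bird' AS object3_name,
         0.82 AS score_one, 0.72 AS score_two, 0.55 AS score_three
  UNION ALL
  SELECT 2, 'c', 'c', 'c', 'Horse', 'Animal', 'Bird', 0.76, 0.73, 0.9
  UNION ALL
  SELECT 3, 'd', 'e', 'f', 'Cat', 'Plant', 'Rock', 0.1, 0.2, 0.3
)
SELECT id, fields.*   -- expands the STRUCT into columns field1 and field2
FROM (
  SELECT id,
    CASE
      -- all three match
      WHEN one_x1 = two_x1 AND one_x1 = three_x1 THEN
        STRUCT(CONCAT(object1_name, ', ', object2_name, ', ', object3_name) AS field1,
               [score_one, score_two, score_three] AS field2)
      -- one and two match
      WHEN one_x1 = two_x1 THEN
        STRUCT(CONCAT(object1_name, ', ', object2_name) AS field1,
               [score_one, score_two] AS field2)
      ELSE NULL
    END AS fields
  FROM sample
)
```

With these made-up rows, id 1 should fall into the second branch, id 2 into the first, and id 3 into ELSE, where both field1 and field2 come back as NULL.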

#standardSQL
SELECT id, 
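  -- keep only the objects whose offset is listed in pos; ", " matches the CONCAT output above
  (SELECT STRING_AGG(object, ", ") FROM UNNEST(objects) object WITH OFFSET
    JOIN UNNEST(pos) OFFSET USING(OFFSET)
  ) field1,
  -- same filter applied to the scores
  (SELECT ARRAY_AGG(score) FROM UNNEST(scores) score WITH OFFSET
    JOIN UNNEST(pos) OFFSET USING(OFFSET)
  ) field2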
FROM (
  SELECT id, 
    [object1_name, object2_name, object3_name] objects,
    [score_one, score_two, score_three] scores,
    (SELECT ARRAY_AGG(OFFSET) 
      FROM UNNEST([one_x1, two_x1, three_x1]) x WITH OFFSET 
      GROUP BY x HAVING COUNT(1) > 1
    ) pos
  FROM `project.dataset.table`
)

Again, with the same output.
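
The key step in this last query is the pos subquery: it collects the offsets of the values that occur more than once among the three x1 columns, and those offsets are then joined against the objects and scores arrays. A minimal sketch of that step on its own, using hypothetical literal values in place of the columns and an explicit alias (off) for the offset:

```sql
#standardSQL
-- Hypothetical literals in place of [one_x1, two_x1, three_x1].
SELECT
  (SELECT ARRAY_AGG(off ORDER BY off)      -- offsets of the repeated value
   FROM UNNEST(['a', 'b', 'a']) AS x WITH OFFSET AS off
   GROUP BY x
   HAVING COUNT(1) > 1                     -- keep only values seen more than once
  ) AS pos
-- pos is [0, 2]: the first and third inputs match
```

With three inputs at most one value can repeat, so the subquery yields at most one row and is safe to use as a scalar subquery. When nothing matches, it returns no rows and pos is NULL, which is what makes field1 and field2 come out NULL in that case.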

