BigQuery: How to extract only certain fields from a repeated record as another repeated field

Question

Here is a sample table in BigQuery:

WITH test AS (
  SELECT
    [ 
      STRUCT("Rudisha" as name, 123 as id),
      STRUCT("Murphy" as name, 124 as id),
      STRUCT("Bosse" as name, 125 as id),
      STRUCT("Rotich" as name,  126 as id)
    ] AS data

  UNION ALL

  SELECT
    [
      STRUCT("Lewandowski" as name, 127 as id),
      STRUCT("Kipketer" as name, 128 as id),
      STRUCT("Berian" as name, 129 as id)
    ] AS data
)

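Here, I want to extract the "id" field from the record field ("data") as a repeated field. The number of rows should stay the same, but each row should contain only an ids field of repeated type: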

ids: [123, 124, 125, 126]
ids: [127, 128, 129]

How can I achieve this?

Tags: google-bigquery

Solution

The following is for BigQuery Standard SQL:

#standardSQL
WITH test AS (
  SELECT
    [ 
      STRUCT("Rudisha" AS name, 123 AS id),
      STRUCT("Murphy" AS name, 124 AS id),
      STRUCT("Bosse" AS name, 125 AS id),
      STRUCT("Rotich" AS name,  126 AS id)
    ] AS data
    UNION ALL SELECT
    [
      STRUCT("Lewandowski" AS name, 127 AS id),
      STRUCT("Kipketer" AS name, 128 AS id),
      STRUCT("Berian" AS name, 129 AS id)
    ] AS data
)
SELECT ARRAY(SELECT id FROM UNNEST(data)) ids
FROM test
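
The ARRAY(...) subquery above is evaluated once per row of test, which is why the row count stays the same. As a hedged variation (not part of the original answer), the sketch below makes the element order explicit: UNNEST on its own does not promise to keep array order, so it numbers the elements with WITH OFFSET and sorts on that. The aliases d and pos are names chosen here for illustration, and the sample data is shortened.

```sql
#standardSQL
WITH test AS (
  -- Shortened version of the sample table: one array of structs per row
  SELECT [STRUCT("Rudisha" AS name, 123 AS id),
          STRUCT("Murphy" AS name, 124 AS id)] AS data
  UNION ALL
  SELECT [STRUCT("Lewandowski" AS name, 127 AS id),
          STRUCT("Kipketer" AS name, 128 AS id)] AS data
)
SELECT
  ARRAY(
    SELECT d.id                                   -- keep only the id field
    FROM UNNEST(data) AS d WITH OFFSET AS pos     -- pos = index in the original array
    ORDER BY pos                                  -- rebuild the array in the original order
  ) AS ids
FROM test
```

Each output row then carries a single repeated ids field, with the ids in the same order as the structs in data. To keep more than one field, SELECT AS STRUCT inside the ARRAY(...) subquery should work the same way.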
