首页 > 解决方案 > 在 BigQuery 中取消嵌套多个列

问题描述

我在处理我正在使用的 UNNEST 查询时遇到问题。下面是一个查询示例,我当前得到的结果以及我希望从中得到的结果。

一点上下文,我目前正在做的上传是将ID'swith间隔A而不是,强制它作为字符串而不是数字,因为在同一个单元格中有多个 ID。也被Price分开了,。正在上传的数据示例:

    Name    |    Date    |   Item_ID   |  Price
    John    |  4/17/2020 | 123A456A678 | 19.99,21.99,30.00
    Joe     |  4/17/2020 | 555A777A888 | 8.99,10.00,15.99
    Jake    |  4/18/2020 |   444A333   | 15.99,9.00
    John    |  4/18/2020 |     432     | 75.99
    Megan   |  4/18/2020 | 12A890A23A99| 5.99,6.99,9.99,10.00

这是尝试 UNNEST 之前表中数据的示例。下面是当前 UNNEST 查询的示例以及输出示例。

With data AS(
  SELECT
    Name,
    Date,
    SPLIT(Item_ID, 'A') AS Item_ID_Split,
    SPLIT(Price, ',') AS Price_Split
FROM
  Example.Table
SELECT
  Name,
  Date,
  Item_ID_Split,
  Price_Split
FROM data,
UNNEST(Item_ID_Split) Item_ID_Split WITH OFFSET pos1
UNNEST(Price_Split) Price_Split WITH OFFSET pos2

当前输出如下所示:

    Name   |    Date   |  Item_ID_Split | Price_Split
    John   | 4/17/2020 |      123       |   19.99
    John   | 4/17/2020 |      456       |   19.99
    John   | 4/17/2020 |      678       |   19.99
    John   | 4/17/2020 |      123       |   21.99
    John   | 4/17/2020 |      456       |   21.99
    John   | 4/17/2020 |      678       |   21.99
    John   | 4/17/2020 |      123       |   30.00
    John   | 4/17/2020 |      456       |   30.00
    John   | 4/17/2020 |      678       |   30.00
    Joe    | 4/17/2020 |      555       |   8.99
    Joe    | 4/17/2020 |      777       |   8.99
    Joe    | 4/17/2020 |      888       |   8.99
    Joe    | 4/17/2020 |      555       |   10.00
    Joe    | 4/17/2020 |      777       |   10.00
    Joe    | 4/17/2020 |      888       |   10.00
    Joe    | 4/17/2020 |      555       |   15.99
    Joe    | 4/17/2020 |      777       |   15.99
    Joe    | 4/17/2020 |      888       |   15.99
    Jake   | 4/18/2020 |      444       |   15.99
    Jake   | 4/18/2020 |      333       |   15.99
    Jake   | 4/18/2020 |      444       |   9.00
    Jake   | 4/18/2020 |      333       |   9.00
    John   | 4/18/2020 |      432       |   75.99
    Megan  | 4/18/2020 |      12        |   5.99
    Megan  | 4/18/2020 |      890       |   5.99
    Megan  | 4/18/2020 |      23        |   5.99
    Megan  | 4/18/2020 |      99        |   5.99
    Megan  | 4/18/2020 |      12        |   6.99
    Megan  | 4/18/2020 |      890       |   6.99
    Megan  | 4/18/2020 |      23        |   6.99
    Megan  | 4/18/2020 |      99        |   6.99
    Megan  | 4/18/2020 |      12        |   9.99
    Megan  | 4/18/2020 |      890       |   9.99
    Megan  | 4/18/2020 |      23        |   9.99
    Megan  | 4/18/2020 |      99        |   9.99
    Megan  | 4/18/2020 |      12        |   10.00
    Megan  | 4/18/2020 |      890       |   10.00
    Megan  | 4/18/2020 |      23        |   10.00
    Megan  | 4/18/2020 |      99        |   10.00

这是上述查询的当前输出。如您所见,有重复的 Item_IDs/Prices,我想要的结果如下:

    Name   |    Date   |  Item_ID_Split | Price_Split
    John   | 4/17/2020 |      123       |   19.99
    John   | 4/17/2020 |      456       |   21.99
    John   | 4/17/2020 |      678       |   30.00
    Joe    | 4/17/2020 |      555       |   8.99
    Joe    | 4/17/2020 |      777       |   10.00
    Joe    | 4/17/2020 |      888       |   15.99
    Jake   | 4/18/2020 |      444       |   15.99
    Jake   | 4/18/2020 |      333       |   9.00
    John   | 4/18/2020 |      432       |   75.99
    Megan  | 4/18/2020 |      12        |   5.99
    Megan  | 4/18/2020 |      890       |   6.99
    Megan  | 4/18/2020 |      23        |   9.99
    Megan  | 4/18/2020 |      99        |   10.00

Item_ID_Split这是我正在寻找的结果,在和之间根本没有重复Price_Split。我试图把SPLIT函数放在里面,UNNEST但我得到了相同的输出。我不完全确定如何做到这一点,所以任何帮助将不胜感激!

先感谢您!

标签: sqlgoogle-bigquery

解决方案


您可以使用with offset

SELECT Name, Date, Item_ID_Split, Price_Split
FROM data LEFT JOIN
     UNNEST(Item_ID_Split) Item_ID_Split WITH OFFSET pos1
     ON 1=1 LEFT JOIN
     UNNEST(Price_Split) Price_Split WITH OFFSET pos2
     ON pos1 = po2;

推荐阅读