首页 > 解决方案 > 在 Bigquery 中,如何将结构的字符串化数组转换为正确的数组?

问题描述

我想将数据集中每个表的标签存储到包含 (table_name, label_name, label_value) 的表中,以便我可以选择带有标签条件的表名(例如, a WHERE label_name = 'xxx' and label_value = 'yyy')。

通过这个查询,我可以提取一个字符串化的结构数组,每个都包含标签名称和标签值:

*
FROM `mydataset.INFORMATION_SCHEMA.TABLE_OPTIONS`
WHERE option_name = 'labels'

输出如下所示:

WITH output AS
( 
SELECT "table_name_a" as table_name, "ARRAY<STRUCT<STRING, STRING>>" AS option_type, "[STRUCT(\"label_name_a_1\", \"label_value_a_1\"), STRUCT(\"label_name_a_2\", \"label_value_a_2\"), STRUCT(\"label_name_a_3\", \"label_value_a_3\")]" as option_value
UNION ALL
SELECT "table_name_b" as table_name, "ARRAY<STRUCT<STRING, STRING>>" AS option_type, "[STRUCT(\"label_name_b_1\", \"label_value_b_1\"), STRUCT(\"label_name_b_2\", \"label_value_b_2\"), STRUCT(\"label_name_b_3\", \"label_value_b_3\")]" as option_value
)

SELECT * FROM output

直观的方法是转换为数组并指定结构:

SELECT ARRAY<STRUCT<STRING, STRING>>[STRUCT("label_name_a_1", "label_value_a_1"), STRUCT("label_name_a_2", "label_value_a_2"), STRUCT("label_name_a_3", "label_value_a_3")]

但我找不到为每个 option_value “动态”执行此操作的方法

一个非常糟糕的解决方案是转换字符串然后使用 JSON_EXTRACT 但我认为可能有更好更简单的方法来做到这一点?

谢谢!

标签: google-bigquery

解决方案


这是将选项值字符串转换为ARRAY<STRUCT<key STRING, value STRING>>类型的代码示例:

DECLARE optionValue DEFAULT ((select option_value
FROM `zyun.INFORMATION_SCHEMA.TABLE_OPTIONS`
WHERE option_name = 'labels'
  AND table_name = 'test'));

DECLARE label ARRAY<STRUCT<key STRING, value STRING>>;
EXECUTE IMMEDIATE "SELECT ARRAY<STRUCT<key STRING, value STRING>> " || optionValue INTO label;

SELECT label;

输出:

SELECT label; -- at [9:1]
+-----------------------------------------------------------+
|                           label                           |
+-----------------------------------------------------------+
| [{"key":"aaa","value":"bbb"},{"key":"ccc","value":"ddd"}] |
+-----------------------------------------------------------+

推荐阅读