首页 > 解决方案 > BigQuery 取消嵌套数组 - 获取重复项

问题描述

我正在处理 BQ 中的 GCP 计费查询。但是,在使用成本提取数组时,我得到了错误的值,例如 unnest 返回行格式的数组元素。因此,如果我在单行数组中有 2 个元素,那么我将得到 2 行。

例如:

实际数组:

SELECT

TO_JSON_STRING(labels), cost

FROM

billing_export.gcp_billing_export

WHERE

_PARTITIONTIME >= "2018-08-01 00:00:00"

AND _PARTITIONTIME < "2018-09-01 00:00:00"

AND billing_account_id = "xxx-62378F-xxx"

AND TO_JSON_STRING(labels) = '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]'

and cost> 0 limit 10

在此处输入图像描述

使用 Unnest:

with cte as (SELECT

labels, cost

FROM

billing_export.gcp_billing_export

WHERE

_PARTITIONTIME >= "2018-08-01 00:00:00"

AND _PARTITIONTIME < "2018-09-01 00:00:00"

AND billing_account_id = "xxx-62378F-xxxx"

AND TO_JSON_STRING(labels) = '[{"key":"application","value":"scaled-server"},{"key":"department","value":"hrd"}]'

and cost> 0

limit 10 )

select labels,cost from cte ,

UNNEST(labels) AS la

在此处输入图像描述

问题:

我不想要重复的成本值,有人可以帮我解决这个问题吗?

标签: arraysgoogle-cloud-platformgoogle-bigqueryunnest

解决方案


代替

SELECT labels,cost from cte ,
UNNEST(labels) AS la   

尝试

SELECT la, cost from cte ,
UNNEST(labels) AS la   

更新

SELECT 
  ARRAY(
    SELECT AS STRUCT 
      JSON_EXTRACT_SCALAR(kv, '$.key') key, 
      JSON_EXTRACT_SCALAR(kv, '$.value') value 
    FROM UNNEST(SPLIT(labels, '},{')) kv_temp, 
    UNNEST([CONCAT('{', REGEXP_REPLACE(kv_temp, r'^\[{|}]$', ''), '}')]) kv
  ) labels,
  cost
FROM cte

推荐阅读