sql - Want to parse string BLOB as JSON, modify few fields and parse it back to string BIGQUERY
问题描述
So I have a column which contains JSONs as string BLOBs. For example,
{
"a": "a-1",
"b": "b-1",
"c":
{
"c-1": "c-1-1",
"c-2": "c-2-1"
},
"d":
[
{
"k": "v1"
},
{
"k": "v2"
},
{
"k": "v3"
}
]
}
Now my use case is to parse the key b
, hash the value of key b
, assign it back to the key b
and store it back as a string in Bigquery.
I initially tried to do a lazy approach where I am only extracting the key b
using the json_extract_scalar
function in Bigquery and for other keys (like c and d - which I dont want to modify), I used json_extract
function. Then I converted back to string after doing hashing the key b
. Here is the query -
SELECT
TO_JSON_STRING(
STRUCT(
json_EXTRACT(COL_NAME, "$.a") AS a,
MD5(json_extract_scalar(_airbyte_data,"$.b")) AS b,
json_EXTRACT(COL_NAME,"$.c") AS c,
json_EXTRACT(COL_NAME,"$.d") AS d ) )
FROM
dataset.TABLE
But the issue with this query is the JSON objects are getting converted to string and double quotes getting escaped due to TO_JSON_STRING
(I tried using CAST AS STRING on top of STRUCT but it isn't supported). For example, the output row for this query looks like this:
{
"a": "a-1",
"b": "b-1",
"c":
"{
\"c-1\": \"c-1-1\",
\"c-2\": \"c-2-1\"
}",
"d":
"[
{
\"k\": \"v1\"
},
{
\"k\": \"v2\"
},
{
\"k\": \"v3\"
}
]"
}
I can achieve the required output if I use JSON_EXTRACT
and JSON_EXTRACT_SCALAR
functions on every key (and on every nested keys) but this approach isn't scalable as I have close to 200 keys and many of them are nested 2-3 levels deep.
Can anyone suggest a better approach of achieving this? TIA
解决方案
这应该工作
declare _json string default """{
"a": "a-1",
"b": "b-1",
"c":
{
"c-1": "c-1-1",
"c-2": "c-2-1"
},
"d":
[
{
"k": "v1"
},
{
"k": "v2"
},
{
"k": "v3"
}
]
}""";
SELECT regexp_replace(_json, r'"b": "(\w+)-(\w+)"',concat('"b":',TO_JSON_STRING( MD5(json_extract_scalar(_json,"$.b")))))
输出
{
"a": "a-1",
"b":"sdENsgFsL4PBOyX8sXDN6w==",
"c":
{
"c-1": "c-1-1",
"c-2": "c-2-1"
},
"d":
[
{
"k": "v1"
},
{
"k": "v2"
},
{
"k": "v3"
}
]
}
如果您需要特定的正则表达式,请指定 b 值的示例。
推荐阅读
- c# - 调用 HttpClient.PostAsync 时出现死锁
- java - Wildfly 21 - 运行应用程序客户端
- verilog - 检查一段时间内输出信号的值
- node.js - Expo-cli 安装停止并出现错误 npm ERR!cb() 从未调用过
- nibabel - 大标签文件的 nibabel.filebasedimages.ImageFileError
- java - Elasticsearch:按数组属性的长度排序
- vb.net - WebView2 浏览器不加载网页,只显示一个白色窗口
- xcode13 - XCODE 13 ISSUE with app Upload on AppStore Distribution 失败并出现错误
- javascript - Node.js mongoose 在 foreach 循环中查找查询
- powerbi - 使用报表页面在 Power BI 中更改动态文件夹路径