Correctly extracting JSON arrays from a JSONB field

Question

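From a table in PostgreSQL 10, I am trying to join all array elements from multiple children of the same jsonb field to their parent rows, somewhat like in this question and this one. But I have made a mistake in the JOIN: instead of getting the individual array elements, I get each array element wrapped in a single-item array.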

Here is the abbreviated table definition:

CREATE TABLE public.worker_customformstore (
    id integer NOT NULL DEFAULT nextval('worker_customformstore_id_seq'::regclass),
    created_on timestamp with time zone NOT NULL,
    store jsonb,
    schema_id integer NOT NULL,
    polymorphic_ctype_id integer,
    pdf_key character varying(100) COLLATE pg_catalog."default" NOT NULL,
    last_updated timestamp with time zone
)

And a sample value of the store field:

'{"Subcontractor Use": {
        "labor": [
            {
                "note": null,
                "hours": {
                    "dt": null,
                    "ot": null,
                    "st": 1,
                    "pdt": null,
                    "pot": null
                },
                "employee": {
                    "id": 456,
                    "trade": "XXX",
                    "is_active": true,
                    "last_name": "Uknow",
                    "first_name": "Noone",
                    "company_supplied_id": "456"
                },
                "external subcontractor": false
            },
            {
                "note": null,
                "hours": {
                    "dt": null,
                    "ot": null,
                    "st": 8,
                    "pdt": null,
                    "pot": null
                },
                "employee": {
                    "id": 123,
                    "trade": "",
                    "member": null,
                    "is_active": true,
                    "last_name": "Guy",
                    "user_role": "WORKER",
                    "first_name": "Some",
                    "company_supplied_id": "123"
                },
                "external subcontractor": false
            }
        ],
        "Equipment": [
            {
                "note": null,
                "hours": {
                    "idle": null,
                    "over": null,
                    "running": 8
                },
                "quantity": 1,
                "equipment": {
                    "id": 6243,
                    "status": "Rented",
                    "project": "8399",
                    "category": "XXXXX",
                    "caltrans_id": "00-20",
                    "description": "19",
                    "equipment_id": "Scissor",
                    "idle_time_price": 0,
                    "over_time_price": 0,
                    "running_time_price": 0
                }
            }
        ]
    }
}'

My simplified query looks like this:

SELECT 
cufstore.id, 
CASE
    WHEN labor IS NOT DISTINCT FROM NULL THEN
    0
    WHEN (jsonb_array_elements(labor) -> 'hours' ->> 'st') = '' THEN
    0
    ELSE
    COALESCE((jsonb_array_elements(labor) -> 'hours' ->> 'st')::numeric, 0)
END
-- more stuff here ...
as total_hours,

CASE
    WHEN labor IS NOT DISTINCT FROM NULL THEN
    0
    ELSE
    COALESCE(jsonb_array_length(cufstore.store -> 'Subcontractor Use' -> 'labor'), 0)
END as total_workers,

labor, equipment

FROM public.worker_customformstore AS cufstore
...

LEFT OUTER JOIN LATERAL 
    (SELECT
        jsonb_array_elements(jsonb_strip_nulls(cufstore.store -> 'Subcontractor Use' -> 'labor'))
        WHERE cufstore.store -> 'Subcontractor Use' ->> 'labor' IS NOT NULL
    ) labor on true

LEFT OUTER JOIN LATERAL 
    (SELECT
        jsonb_array_elements(jsonb_strip_nulls(cufstore.store -> 'Subcontractor Use' -> 'Equipment'))
        WHERE cufstore.store -> 'Subcontractor Use' ->> 'Equipment' IS NOT NULL
    ) equipment on true

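Apart from leaving me with lots of redundant jsonb_array_elements calls, this also prevents me from refactoring the repeated logic into a function: inside a function definition I get an error about a set-returning function in a COALESCE (although no error is raised when the same thing happens in my query).

I think what I want is something more like: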

LEFT OUTER JOIN LATERAL 
    jsonb_array_elements(jsonb_strip_nulls(cufstore.store -> 'Subcontractor Use' -> 'labor')) labor
    ON jsonb_typeof(labor) = 'array'

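However, trying that gives me "cannot extract elements from a scalar" when the data is NULL or otherwise does not look right.

I may be fundamentally misunderstanding what I can do, but this is what the equipment column looks like: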

("{""hours"": {""running"": 8}, ""quantity"": 1, . . .}")

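And I would like to be able to ask for equipment -> 'hours' ->> 'running' without having to wrap it in jsonb_array_elements(equipment). Do I need to do that, or have I accidentally added parentheses at the beginning and end of the column value?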

Tags: sql, arrays, json, postgresql, jsonb

Answer


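It is unclear how the elements of the two nested JSON arrays "labor" and "Equipment" relate to each other. From your sample, it seems that "Equipment" only ever has a single element, and that the array wrapper is just noise ...

Unfortunately, there is also a nested key "equipment", which is easily confused with the other.

The exact objective is not clear to me either.

Still, after removing a lot of noise and unnecessary complexity, this might be close to what you are after: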

SELECT s.id
     , COALESCE((NULLIF(labor->'hours'->>'st', ''))::numeric, 0) AS total_hours
     , CASE WHEN labor IS NULL THEN 0
            ELSE COALESCE(jsonb_array_length(s.store->'Subcontractor Use'->'labor'), 0)
       END AS total_workers
     , s.store #>> '{Subcontractor Use, Equipment, 0, hours, running}' AS equipment_hours
     , labor
FROM   worker_customformstore s
LEFT   JOIN jsonb_array_elements(s.store->'Subcontractor Use'->'labor') labor ON true;

db<>fiddle here
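The parentheses in the question's output, as in ("{...}"), are what PostgreSQL prints for a row value. Referencing a subquery alias as a whole yields the entire row as a composite value, whereas the alias of a function placed directly in FROM doubles as the name of its single output column. The following is a minimal sketch of that difference, not taken from the original answer; the CTE s with its one sample row and the aliases sub and elem are hypothetical names chosen for illustration.

```sql
-- Hypothetical one-row sample; only the shape of "store" follows the question.
WITH s(id, store) AS (
   VALUES (1, '{"Subcontractor Use": {"labor": [{"hours": {"st": 8}}]}}'::jsonb)
)
SELECT pg_typeof(sub)       AS subquery_alias_type  -- whole-row reference: a row type, not jsonb
     , pg_typeof(elem)      AS function_alias_type  -- the unnested element itself: jsonb
     , elem->'hours'->>'st' AS st                   -- so -> and ->> apply directly
FROM   s
LEFT   JOIN LATERAL (
   -- the question's pattern: set-returning function in the SELECT list of a subquery
   SELECT jsonb_array_elements(s.store->'Subcontractor Use'->'labor')
   ) sub ON true
-- the answer's pattern: set-returning function directly in the FROM list
LEFT   JOIN jsonb_array_elements(s.store->'Subcontractor Use'->'labor') elem ON true;
```

If the subquery form is kept, giving its output column an alias and referencing it as sub.column_alias avoids the row wrapper as well.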

Notes

This verbose expression:

CASE
    WHEN labor IS NOT DISTINCT FROM NULL THEN
    0
    WHEN (jsonb_array_elements(labor) -> 'hours' ->> 'st') = '' THEN
    0
    ELSE
    COALESCE((jsonb_array_elements(labor) -> 'hours' ->> 'st')::numeric, 0)
END

boils down to:

COALESCE((NULLIF(labor -> 'hours' ->> 'st', ''))::numeric, 0)
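  • Don't apply jsonb_array_elements() again; that has already been done in the lateral subquery.

  • labor IS NOT DISTINCT FROM NULL is the same as labor IS NULL, but we need neither, because the later COALESCE takes care of that case anyway.

  • With NULLIF we do not need a separate CASE branch at all.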
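To see how the condensed expression treats the three cases the original CASE distinguished, here is a small sketch run against hypothetical literal inputs (the derived table t and its column st are made up for the illustration and stand in for labor -> 'hours' ->> 'st'):

```sql
-- st stands in for the text result of: labor -> 'hours' ->> 'st'
SELECT st
     , COALESCE((NULLIF(st, ''))::numeric, 0) AS hours
FROM  (VALUES ('8'), (''), (NULL)) t(st);
-- '8'  : NULLIF keeps '8', the cast yields 8
-- ''   : NULLIF turns the empty string into NULL, COALESCE supplies 0
-- NULL : stays NULL through NULLIF and the cast, COALESCE supplies 0
```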

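Assuming there is only a single element in the nested JSON array "Equipment", we can get equipment_hours directly with s.store #>> '{Subcontractor Use, Equipment, 0, hours, running}'. If that assumption does not hold, you will have to do more (and explain more).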
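If "Equipment" can hold more than one element, one possible direction (a sketch under assumptions, not part of the original answer) is to aggregate inside a lateral subquery, so that each parent row stays a single row instead of being multiplied by the number of equipment entries. The sample rows, the aliases eq and e, and the column name running_hours are hypothetical, and summing the running hours is only one guess at what "more" might mean.

```sql
-- Hypothetical sample rows; only the shape of "store" follows the question.
WITH s(id, store) AS (
   VALUES
     (1, '{"Subcontractor Use": {"Equipment": [{"hours": {"running": 8}}, {"hours": {"running": 2.5}}]}}'::jsonb)
   , (2, '{"Subcontractor Use": {"Equipment": []}}'::jsonb)
   , (3, '{}'::jsonb)
)
SELECT s.id
     , COALESCE(e.running_hours, 0) AS equipment_hours
FROM   s
LEFT   JOIN LATERAL (
   -- An aggregate without GROUP BY returns exactly one row, even for zero elements.
   SELECT sum((NULLIF(eq->'hours'->>'running', ''))::numeric) AS running_hours
   FROM   jsonb_array_elements(s.store->'Subcontractor Use'->'Equipment') eq
   ) e ON true;
```

This still raises an error if "Equipment" holds a non-array JSON value; the sketch assumes it is either an array or missing.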


Addressing your comment

If store -> 'Subcontractor Use' -> 'labor' is not a nested JSON array but, for example, a scalar, you get an error, as you commented:

ERROR: cannot extract elements from a scalar

db<>fiddle here

You can avoid the exception with a nested CASE expression:

...
LEFT   JOIN jsonb_array_elements(
          CASE WHEN jsonb_typeof(s.store -> 'Subcontractor Use' -> 'labor') = 'array'
               THEN              s.store -> 'Subcontractor Use' -> 'labor'
          END) labor ON true;

db<>fiddle here

You may want to do more to return an alternative value for that case ...
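What an appropriate alternative looks like depends on the data, so the following is only a sketch under assumptions. It reports the JSON type of "labor" in an extra column (labor_type, a hypothetical name) so that skipped rows remain visible, and it wraps a lone object in a one-element array, which is a guess at how such a value should be treated. The sample rows are made up.

```sql
-- Hypothetical sample rows covering an array, a scalar, and a missing key.
WITH s(id, store) AS (
   VALUES
     (1, '{"Subcontractor Use": {"labor": [{"hours": {"st": 8}}]}}'::jsonb)
   , (2, '{"Subcontractor Use": {"labor": "n/a"}}'::jsonb)
   , (3, '{"Subcontractor Use": {}}'::jsonb)
)
SELECT s.id
     , jsonb_typeof(s.store->'Subcontractor Use'->'labor') AS labor_type  -- NULL if the key is missing
     , COALESCE((NULLIF(labor->'hours'->>'st', ''))::numeric, 0) AS total_hours
FROM   s
LEFT   JOIN jsonb_array_elements(
          CASE jsonb_typeof(s.store->'Subcontractor Use'->'labor')
             WHEN 'array'  THEN s.store->'Subcontractor Use'->'labor'
             -- assumption: a lone object counts as a single labor entry
             WHEN 'object' THEN jsonb_build_array(s.store->'Subcontractor Use'->'labor')
             -- any other type falls through to NULL and produces no elements
          END) labor ON true;
```

Rows whose "labor" is a scalar or missing are kept by the LEFT JOIN with labor set to NULL, so total_hours falls back to 0 for them.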

