首页 > 解决方案 > Postgres:如何优化使用多个 json_array_elements() 调用的查询

问题描述

我有以下查询,它从 JSON 对象(数据类型facebook_results的 Postgres 10 列json)中提取几列数据。

有时,此对象中的数组包含 10,000 多个项目。

这样做的目的是从对象中的每一列获取非规范化数据的平面图,并且在有一个数组的地方我也想获取其中包含对象的所有列(显然只是将数据复制到外部键)。

最里面的键都不包含数组,所以我不需要担心。我只关心应该“扩展”的matchesand数组。nodes

现在查询有效,但速度非常非常慢。我假设是因为编写不佳的查询执行是递归的或具有不必要的复杂性减慢。

SELECT
  id AS slice_id,
  json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'size'       AS match_size,
  json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'score'      AS match_score,
  json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'width'      AS match_width,
  json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'format'     AS match_format,
  json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'domain'     AS match_domain,
  json_array_elements(json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'nodes') -> 'table' -> 'crawl_date' AS node_crawl_date,
  json_array_elements(json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' -> 'nodes') -> 'table' -> 'url'        AS node_url
FROM slices
WHERE id = 169

这是列中包含的内容的示例facebook_results

{
  "table":{
    "matches": [
      {  
        "table":{  
          "nodes":[  
            {  
              "table":{  
                "crawl_date":"2013-06-21",
                "url":"http://example.com"
              }
            }
          ],
          "size":7962624,
          "score":47.059,
          "width":3456,
          "format":"MP4",
          "domain":"example.com"
        }
      }
    ]
  }
}

有谁知道我如何优化这个?

标签: postgresql

解决方案


您可以使用以下方法重写您的查询LATERAL

SELECT
  id AS slice_id,
  s.t -> 'size'       AS match_size,
  s.t -> 'score'      AS match_score,
  s.t -> 'width'      AS match_width,
  s.t -> 'format'     AS match_format,
  s.t -> 'domain'     AS match_domain,
  s.t2-> 'crawl_date' AS node_crawl_date,
  s.t2-> 'url'        AS node_url
FROM slices
,LATERAL (
SELECT json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table',
json_array_elements(json_array_elements(facebook_results -> 'table' -> 'matches') 
           -> 'table' -> 'nodes') -> 'table') s(t,t2)
WHERE id = 169;

DBFiddle 演示

甚至更短:

SELECT
  id AS slice_id,
  s.t   -> 'size'       AS match_size,
  s.t   -> 'score'      AS match_score,
  s.t   -> 'width'      AS match_width,
  s.t   -> 'format'     AS match_format,
  s.t   -> 'domain'     AS match_domain,
  s2.t2 -> 'crawl_date' AS node_crawl_date,
  s2.t2 -> 'url'        AS node_url
FROM slices
,LATERAL(SELECT 
  json_array_elements(facebook_results -> 'table' -> 'matches') -> 'table' ) s(t)
,LATERAL(SELECT json_array_elements(s.t -> 'nodes') -> 'table') s2(t2)
WHERE id = 169;

DBFiddle Demo2


推荐阅读