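Problem Description

Using jq to parse nested JSON objects: using select to match a key value in a nested object while keeping the existing structure

I am trying to take a complex JSON file of more than 20,000 lines and extract specific keys while retaining the surrounding metadata, which adds the context a human reader needs.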


Data source (complex structure):

{
  "Marketplace": [
    {
      "Level1Name": "Company A Products",
      "Level1Array": [
        {
          "Level2Name": "USA Products List",
          "Level2Contents": [
            {
              "Level3Name": "ALL",
              "Level3URL": "https://a.com/products"
            },
            {
              "Level3Name": "Subset1001",
              "Level3URL": "https://a.com/products/subset1001"
            }
          ]
        }
      ]
    },
    {
      "Level1Name": "Company B Products",
      "Level1Array": [
        {
          "Level2Name": "USA Products List",
          "Level2Contents": [
            {
              "Level3Name": "ALL",
              "Level3URL": "https://b.com/products"
            },
            {
              "Level3Name": "Subset500",
              "Level3URL": "https://b.com/products/subset500"
            }
          ]
        },
        {
          "Level2Name": "EU Products List",
          "Level2Contents": [
            {
              "Level3Name": "ALL",
              "Level3URL": "https://b.eu/products"
            },
            {
              "Level3Name": "Subset200",
              "Level3URL": "https://b.eu/products/subset200"
            }
          ]
        }
      ]
    },
    {
      "Level1Name": "Company X Products",
      "Level1Array": [
        {
          "Level2Name": "Deleted Products",
          "Level2URL": "https://internal.x.com/products"
        }
      ]
    }
  ]
}

The jq command I currently use for extraction drops all of the other context metadata:

jq -r '(
         .Marketplace[].Level1Array[].Level2Contents[]
         | select (.Level3Name | index("ALL"))
         | [.]
         )'

Output produced:

[
  {
    "Level3Name": "ALL",
    "Level3URL": "https://a.com/products"
  }
]
[
  {
    "Level3Name": "ALL",
    "Level3URL": "https://b.com/products"
  }
]
[
  {
    "Level3Name": "ALL",
    "Level3URL": "https://b.eu/products"
  }
]

Desired Option 1 output: the same JSON structure, with every object that does not match the select filter's "ALL" string criterion removed:

{
    "Marketplace":
  [
        {
            "Level1Name": "Company A Products",
            "Level1Array": [
                {
                    "Level2Name": "USA Products List",
                    "Level2Contents": [
                        {
                            "Level3Name": "ALL",
                            "Level3URL": "https://a.com/products"
                        }
                    ]
                }
            ]
        },
        {
            "Level1Name": "Company B Products",
            "Level1Array": [
                {
                    "Level2Name": "USA Products List",
                    "Level2Contents": [
                        {
                            "Level3Name": "ALL",
                            "Level3URL": "https://b.com/products"
                        }
                    ]
                },
                {
                    "Level2Name": "EU Products List",
                    "Level2Contents": [
                        {
                            "Level3Name": "ALL",
                            "Level3URL": "https://b.eu/products"
                        }
                    ]
                }
            ]
        }
    ]
}

Desired Option 2 output: any similar format that can be iterated over with a loop, for example:

{
  "Marketplace":
  [
    {
      "Level1Name": "Company A Products",
      "Level2Name": "USA Products List",
      "Level3Name": "ALL",
      "Level3URL": "https://a.com/products"
    },
    {
      "Level1Name": "Company B Products",
      "Level2Name": "USA Products List",
      "Level3Name": "ALL",
      "Level3URL": "https://b.com/products"
    },
    {
      "Level1Name": "Company B Products",
      "Level2Name": "EU Products List",
      "Level3Name": "ALL",
      "Level3URL": "https://b.eu/products"
    }
  ]
}

Tags: json, nested, jq

Solution


The following filter produces the "Option 2" output:

.Marketplace |= map(
  {Level1Name} as $Level1Name
  | .Level1Array[]
  | {Level2Name} as $Level2Name
  | .Level2Contents[]?
  | select(.Level3Name == "ALL")
  | $Level1Name + $Level2Name + . )
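To try the filter, it can be passed to jq on the command line. The following is a minimal sketch, not part of the original answer: the file name sample.json and the shell variable option2 are assumptions made for illustration, and the input is a trimmed-down copy of the question's data (Company A plus the Company X entry that has no Level2Contents).

```shell
# Hypothetical sample file: a trimmed-down copy of the data in the question.
cat > sample.json <<'EOF'
{"Marketplace":[
  {"Level1Name":"Company A Products","Level1Array":[
    {"Level2Name":"USA Products List","Level2Contents":[
      {"Level3Name":"ALL","Level3URL":"https://a.com/products"},
      {"Level3Name":"Subset1001","Level3URL":"https://a.com/products/subset1001"}]}]},
  {"Level1Name":"Company X Products","Level1Array":[
    {"Level2Name":"Deleted Products","Level2URL":"https://internal.x.com/products"}]}
]}
EOF

# -c prints the result as compact JSON on a single line.
option2=$(jq -c '
  .Marketplace |= map(
    {Level1Name} as $Level1Name          # shorthand for {Level1Name: .Level1Name}
    | .Level1Array[]
    | {Level2Name} as $Level2Name
    | .Level2Contents[]?                 # skips entries without Level2Contents
    | select(.Level3Name == "ALL")
    | $Level1Name + $Level2Name + . )    # merge the three objects into one
' sample.json)

echo "$option2"
```

Because the Company X entry has no Level2Contents, it contributes nothing, and the resulting Marketplace array holds a single flattened object for Company A.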

Breaking it down...

One way to understand this is to consider:

.Marketplace[]
| {Level1Name} as $Level1Name
| .Level1Array[]
| {Level2Name} as $Level2Name
| .Level2Contents[]?             # in case .Level2Contents is missing
| if (.Level3Name == "ALL")
  then $Level1Name + $Level2Name + .
  else empty
  end
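Since the question asks for a format that can be iterated over with a loop, the stream form above can feed a shell loop directly. This is a sketch under assumptions: the variable names json, rows, company, list and url are illustrative, the input is a trimmed-down copy of the Company B data, and the final @tsv step is an addition that is not in the original answer.

```shell
# Hypothetical inline input: the "EU Products List" part of Company B.
json='{"Marketplace":[{"Level1Name":"Company B Products","Level1Array":[{"Level2Name":"EU Products List","Level2Contents":[{"Level3Name":"ALL","Level3URL":"https://b.eu/products"},{"Level3Name":"Subset200","Level3URL":"https://b.eu/products/subset200"}]}]}]}'

# Emit one tab-separated row per matching Level3 object.
rows=$(printf '%s\n' "$json" | jq -r '
  .Marketplace[]
  | {Level1Name} as $Level1Name
  | .Level1Array[]
  | {Level2Name} as $Level2Name
  | .Level2Contents[]?
  | select(.Level3Name == "ALL")
  | $Level1Name + $Level2Name + .
  | [.Level1Name, .Level2Name, .Level3URL]
  | @tsv')

# Split each row on tabs and process the fields one record at a time.
printf '%s\n' "$rows" | while IFS="$(printf '\t')" read -r company list url; do
  echo "$company / $list -> $url"
done
```

Splitting on tabs with read collapses empty fields, so this approach suits data where every field is present; printing each object with jq -c and parsing it inside the loop is an alternative when fields may be empty.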

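Addendum: "Name"

The OP subsequently asked what could be done if the "name" keys at all three levels were simply called "Name". The answer is easily obtained by adapting the filter above, which gives: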

.Marketplace |= map(
  {Level1Name: .Name} as $Level1Name
  | .Level1Array[]
  | {Level2Name: .Name} as $Level2Name
  | .Level2Contents[]?
  | select(.Name == "ALL")
  | $Level1Name + $Level2Name + . )

Output

In this case, the output would look like this:

{
  "Marketplace": [
    {
      "Level1Name": "Company A Products",
      "Level2Name": "USA Products List",
      "Name": "ALL",
      "Level3URL": "https://a.com/products"
    },
    {
      "Level1Name": "Company B Products",
      "Level2Name": "USA Products List",
      "Name": "ALL",
      "Level3URL": "https://b.com/products"
    },
    {
      "Level1Name": "Company B Products",
      "Level2Name": "EU Products List",
      "Name": "ALL",
      "Level3URL": "https://b.eu/products"
    }
  ]
}
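The filters above target "Option 2" only. For "Option 1" (the original nesting, pruned to the matching objects), one possible approach is to apply nested update-assignments. The following is a sketch and not part of the original answer: it assumes that Level2 entries without Level2Contents, and any parents left empty after filtering, should be dropped, as in the desired output shown in the question. The variable names json and option1 are illustrative.

```shell
# Hypothetical inline input: a trimmed-down copy of the data in the question.
json='{"Marketplace":[
 {"Level1Name":"Company A Products","Level1Array":[
   {"Level2Name":"USA Products List","Level2Contents":[
     {"Level3Name":"ALL","Level3URL":"https://a.com/products"},
     {"Level3Name":"Subset1001","Level3URL":"https://a.com/products/subset1001"}]}]},
 {"Level1Name":"Company X Products","Level1Array":[
   {"Level2Name":"Deleted Products","Level2URL":"https://internal.x.com/products"}]}]}'

option1=$(printf '%s\n' "$json" | jq -c '
  .Marketplace |= map(
    .Level1Array |= map(
      select(has("Level2Contents"))                            # drop entries with no contents
      | .Level2Contents |= map(select(.Level3Name == "ALL"))   # keep only the "ALL" objects
      | select(.Level2Contents | length > 0))                  # drop lists left empty
    | select(.Level1Array | length > 0))                       # drop companies left empty
')

echo "$option1"
```

Each level is filtered in place with |=, so the keys surrounding the arrays (Level1Name, Level2Name) are kept without being listed explicitly.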
