Using jq to produce a specific CSV format

Problem description

I have JSON that looks like this:

[
  {
    "auth": 1,
    "status": "Active",
    "userCustomAttributes": [
      {
        "customAttributeName": "Attribute 1",
        "customAttributeValue": "Value 1"
      },
      {
        "customAttributeName": "Attribute 2",
        "customAttributeValue": "Value 2"
      },
      {
        "customAttributeName": "Attribute 3",
        "customAttributeValue": "Value 3"
      }
    ]
  },
  {
    "auth": 1,
    "status": "Active",
    "userCustomAttributes": [
      {
        "customAttributeName": "Attribute 1",
        "customAttributeValue": "Value 1"
      },
      {
        "customAttributeName": "Attribute 2",
        "customAttributeValue": "Value 2"
      },
      {
        "customAttributeName": "Attribute 3",
        "customAttributeValue": "Value 3"
      },
      {
        "customAttributeName": "Attribute 4",
        "customAttributeValue": "Value 4"
      }
    ]
  }
]

I want to parse this and produce CSV output that looks like this:

authType, status, attribute 1, attribute 2, attribute 3, attribute 4
"1", "active", "value1", "value2", "value3",""
"1", "active", "value1", "value2", "value3","value 4"

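The JSON array contains more than 180k records, so I need to iterate over all of them. Not every record has every attribute: some have all 4, while others have only 1. For records that lack an attribute, I want the CSV to show an empty value.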

Tags: json, export-to-csv, jq

Solution


With your sample input, the following program does not depend on the order of the "Attribute" keys:

jq -r '
["Attribute 1", "Attribute 2", "Attribute 3", "Attribute 4"] as $attributes
# Header row
| ["authType", "status"] 
  + ($attributes | map( (.[:1] | ascii_upcase) + .[1:])),
# Data rows:
  (.[]
   | (INDEX(.userCustomAttributes[]; .customAttributeName)
      | map_values(.customAttributeValue)) as $dict
   | [.auth, .status] + [ $dict[ $attributes[] ] ]
   )
| @csv
'

It produces the following CSV:

"authType","status","Attribute 1","Attribute 2","Attribute 3","Attribute 4"
1,"Active","Value 1","Value 2","Value 3",
1,"Active","Value 1","Value 2","Value 3","Value 4"

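A missing attribute comes out as JSON null, which @csv renders as an empty field. You can easily modify the program to emit a literal string of your choice in place of JSON null values.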
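As a minimal sketch of that modification, the alternative operator // can supply a placeholder for each lookup that yields null. The placeholder "N/A", the function name fill_missing, and the cut-down one-record input are assumptions made for illustration only. The lookup is bound to $k first because applying // directly to the stream $dict[ $attributes[] ] would drop the null values rather than replace each one. Note also that // replaces false as well as null, which should not matter here since the attribute values are strings.

```shell
# Sketch: substitute "N/A" (an arbitrary placeholder) for missing attributes.
# fill_missing is a hypothetical helper name; it reads the JSON array on stdin.
fill_missing() {
  jq -r '
    ["Attribute 1", "Attribute 2"] as $attributes
    | .[]
    | (INDEX(.userCustomAttributes[]; .customAttributeName)
       | map_values(.customAttributeValue)) as $dict
    # Look up one key at a time so that // applies to each value separately
    | [.auth, .status] + [ $attributes[] as $k | $dict[$k] // "N/A" ]
    | @csv
  '
}

# One record that only has "Attribute 1"
echo '[{"auth":1,"status":"Active","userCustomAttributes":[{"customAttributeName":"Attribute 1","customAttributeValue":"Value 1"}]}]' | fill_missing
```

For this input the single output row should be 1,"Active","Value 1","N/A".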

Explanation

$dict[ $attributes[] ] produces a stream of values:

$dict[ $attributes[0] ]
$dict[ $attributes[1] ]
...

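This ensures that the columns are produced in the correct order, regardless of the order of the keys or even whether they are present.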
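To see that lookup in isolation, here is a small sketch using a hand-written dictionary. The keys, values, and the function name lookup_stream are made up for illustration. The key "C" is absent from $dict, so its lookup yields null and the column position is preserved.

```shell
# Sketch: index an object with a stream of keys.
# One result is produced per key, in the order of the key list,
# not in the order in which the keys appear in the object.
lookup_stream() {
  jq -nc '
    {"B": "two", "A": "one"} as $dict
    | ["A", "B", "C"] as $attributes
    | [ $dict[ $attributes[] ] ]
  '
}

lookup_stream
```

This should print ["one","two",null]: the values follow the order of $attributes, and the missing key contributes a null instead of shifting the later columns.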

