Print an entry once as the primary key, with its array of associated entries, as CSV, dropping empty records

Problem description

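I have records like the ones below, which sometimes contain duplicate srcPath entries, though with different references.

For example, /content/dam/foo/about-bar/photos/rayDavis.PNG appears 3 times in one record, each time with different references.

I want each unique srcPath printed once, together with its associated references.

I also have empty records,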

{
  "pages": []
}

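I don't want to see those.

What I'd really like is a CSV:

srcPath, maybe other fields such as published, and then first reference, second reference, third reference, and so on: the associated references arrays as consecutive comma-separated values on the same line, for example: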

"/content/dam/foo/about-bar/pdf/theplan.pdf", true, "/content/foo/en/about-bar/the-plan-and-vision/jcr:content/content2/image/link", "/content/foo/en/about-bar/the-plan-and-vision/jcr:content/content2/textboximg/boxFtr", "/content/foo/en/about-bar/the-plan-and-vision/jcr:content/content1/textboximg/text"

"/content/dam/foo/about-bar/photos/rayDavis.PNG", true, "/content/foo/en/about-bar/jcr:content/content1B/promos_1/image/fileReference", "/content/foo/en/about-bar/monkey-development/tales-of-giving/ray-moose-davis/jcr:content/content1/textboximg/fileReference", "/content/foo/en/about-bar/monkey-development/tales-of-giving/jcr:content/content1/textboximg_2/fileReference"

"/content/dam/foo/about-bar/pdf/foo_19thNewsletter.pdf", true, "/content/foo/en/gremlins/stay-tuned/jcr:content/content3/textboximg/text"

"/content/dam/foo/about-bar/pdf/barNews_fall1617.pdf", true, "/content/foo/en/gremlins/jcr:content/content2C/textboximg_114671747/text", "/content/dam/foo/about-bar/pdf/barNews_fall1617.pdf", "/content/foo/en/gremlins/stay-tuned/jcr:content/content3/textboximg_0/text"

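In other words, one unique line per srcPath, carrying its associated references.

I suppose that if I also wanted path, I couldn't have unique lines per srcPath in the CSV?

Data: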

{
  "pages": [
    {
      "srcPath": "/content/dam/foo/about-bar/pdf/theplan.pdf",
      "srcTitle": "theplan.pdf",
      "path": "/content/foo/en/about-bar/the-plan-and-vision",
      "title": "the Plan and Vision",
      "references": [
        "/content/foo/en/about-bar/the-plan-and-vision/jcr:content/content2/image/link",
        "/content/foo/en/about-bar/the-plan-and-vision/jcr:content/content2/textboximg/boxFtr",
        "/content/foo/en/about-bar/the-plan-and-vision/jcr:content/content1/textboximg/text"
      ],
      "published": false,
      "isPage": "true"
    }
  ]
}






{
  "pages": []
}



{
  "pages": []
}


{
  "pages": [
    {
      "srcPath": "/content/dam/foo/about-bar/photos/rayDavis.PNG",
      "srcTitle": "rayDavis.PNG",
      "path": "/content/foo/en/about-bar",
      "title": "About bar",
      "references": [
        "/content/foo/en/about-bar/jcr:content/content1B/promos_1/image/fileReference"
      ],
      "published": true,
      "isPage": "true"
    },
    {
      "srcPath": "/content/dam/foo/about-bar/photos/rayDavis.PNG",
      "srcTitle": "rayDavis.PNG",
      "path": "/content/foo/en/about-bar/monkey-development/tales-of-giving/ray-moose-davis",
      "title": "ray moose Davis",
      "references": [
        "/content/foo/en/about-bar/monkey-development/tales-of-giving/ray-moose-davis/jcr:content/content1/textboximg/fileReference"
      ],
      "published": true,
      "isPage": "true"
    },
    {
      "srcPath": "/content/dam/foo/about-bar/photos/rayDavis.PNG",
      "srcTitle": "rayDavis.PNG",
      "path": "/content/foo/en/about-bar/monkey-development/tales-of-giving",
      "title": "tales of Giving",
      "references": [
        "/content/foo/en/about-bar/monkey-development/tales-of-giving/jcr:content/content1/textboximg_2/fileReference"
      ],
      "published": true,
      "isPage": "true"
    }
  ]
}









{
  "pages": [
    {
      "srcPath": "/content/dam/foo/about-bar/pdf/foo_19thNewsletter.pdf",
      "srcTitle": "foo_19thNewsletter.pdf",
      "path": "/content/foo/en/gremlins/stay-tuned",
      "title": "Stay tuned",
      "references": [
        "/content/foo/en/gremlins/stay-tuned/jcr:content/content3/textboximg/text"
      ],
      "published": true,
      "isPage": "true"
    }
  ]
}









{
  "pages": [
    {
      "srcPath": "/content/dam/foo/about-bar/pdf/barNews_fall1617.pdf",
      "srcTitle": "barNews_fall1617.pdf",
      "path": "/content/foo/en/gremlins",
      "title": "gremlins",
      "references": [
        "/content/foo/en/gremlins/jcr:content/content2C/textboximg_114671747/text"
      ],
      "published": true,
      "isPage": "true"
    },
    {
      "srcPath": "/content/dam/foo/about-bar/pdf/barNews_fall1617.pdf",
      "srcTitle": "barNews_fall1617.pdf",
      "path": "/content/foo/en/gremlins/stay-tuned",
      "title": "Stay tuned",
      "references": [
        "/content/foo/en/gremlins/stay-tuned/jcr:content/content3/textboximg_0/text"
      ],
      "published": true,
      "isPage": "true"
    }
  ]
}

Tags: jq

Solution


You can use the following:

jq --raw-output '.pages | group_by(.srcPath)[] | [.[0].srcPath, .[0].published, .[].references[]] | @csv'

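We group the pages by srcPath and map each group to an array containing the srcPath and published values of the group's first element, followed by the references of every element in the group. Each of these arrays becomes one row of the CSV output. A record whose pages array is empty yields no groups, so it prints nothing.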
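For readers who want to see the grouping spelled out, here is a minimal Python sketch of the same steps. It is not part of the original answer: the helper names `json_stream`, `csv_field`, `rows_for`, and `csv_lines` are hypothetical, and the field formatting only approximates what jq's `@csv` does (strings quoted, booleans bare). Like the filter above, it groups within each record separately, so a srcPath repeated across two different records would still produce two lines.

```python
import json
from itertools import groupby


def json_stream(text):
    """Yield each top-level JSON value from a whitespace-separated stream."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def csv_field(value):
    """Format one field roughly the way jq's @csv does (an approximation)."""
    if isinstance(value, str):
        return '"' + value.replace('"', '""') + '"'
    if value is None:
        return ""
    return json.dumps(value)  # true/false and numbers stay unquoted


def rows_for(record):
    """One row per distinct srcPath within a single record."""
    # Sorting first mirrors group_by, which orders groups by key.
    pages = sorted(record["pages"], key=lambda p: p["srcPath"])
    for src_path, group in groupby(pages, key=lambda p: p["srcPath"]):
        group = list(group)
        refs = [ref for page in group for ref in page["references"]]
        yield [src_path, group[0]["published"], *refs]


def csv_lines(text):
    """Return the CSV lines for a stream of records; empty records add none."""
    return [
        ",".join(csv_field(v) for v in row)
        for record in json_stream(text)
        for row in rows_for(record)
    ]


if __name__ == "__main__":
    # Tiny made-up sample: one empty record, one record with a repeated srcPath.
    sample = """
    {"pages": []}
    {"pages": [
      {"srcPath": "/a.png", "published": true, "references": ["/x"]},
      {"srcPath": "/a.png", "published": true, "references": ["/y"]}
    ]}
    """
    for line in csv_lines(sample):
        print(line)  # prints: "/a.png",true,"/x","/y"
```

Taking published from the first page of each group is the same simplification the filter makes. If pages sharing a srcPath can disagree on published, that field would need its own handling.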


