首页 > 解决方案 > 使用嵌套列名将嵌套的 json 展平为 csv

问题描述

我现在有一个非常奇怪的要求。我有以下 json 并且我必须以某种方式将其转换为平面 csv。

[
  {
    "authorizationQualifier": "SDA",
    "authorizationInformation": "          ",
    "securityQualifier": "ASD",
    "securityInformation": "          ",
    "senderQualifier": "ASDAD",
    "senderId": "FADA      ",
    "receiverQualifier": "ADSAS",
    "receiverId": "ADAD           ",
    "date": "140101",
    "time": "0730",
    "standardsId": null,
    "version": "00501",
    "interchangeControlNumber": "123456789",
    "acknowledgmentRequested": "0",
    "testIndicator": "T",
    "functionalGroups": [
      {
        "functionalIdentifierCode": "ADSAD",
        "applicationSenderCode": "ASDAD",
        "applicationReceiverCode": "ADSADS",
        "date": "20140101",
        "time": "07294900",
        "groupControlNumber": "123456789",
        "responsibleAgencyCode": "X",
        "version": "005010X221A1",
        "transactions": [
          {
            "name": "ASDADAD",
            "transactionSetIdentifierCode": "adADS",
            "transactionSetControlNumber": "123456789",
            "implementationConventionReference": null,
            "segments": [
              {
                "BPR03": "ad",
                "BPR14": "QWQWDQ",
                "BPR02": "1.57",
                "BPR13": "23223",
                "BPR01": "sad",
                "BPR12": "56",
                "BPR10": "32424",
                "BPR09": "12313",
                "BPR08": "DA",
                "BPR07": "123456789",
                "BPR06": "12313",
                "BPR05": "ASDADSAD",
                "BPR16": "21313",
                "BPR04": "SDADSAS",
                "BPR15": "11212",
                "id": "aDSASD"
              },
              {
                "TRN02": "2424",
                "TRN03": "35435345",
                "TRN01": "3435345",
                "id": "FSDF"
              },
              {
                "REF02": "fdsffs",
                "REF01": "sfsfs",
                "id": "fsfdsfd"
              },
              {
                "DTM02": "2432424",
                "id": "sfsfd",
                "DTM01": "234243"
              }
            ],
            "loops": [
              {
                "id": "24324234234",
                "segments": [
                  {
                    "N101": "sfsfsdf",
                    "N102": "sfsf",
                    "id": "dgfdgf"
                  },
                  {
                    "N301": "sfdssfdsfsf",
                    "N302": "effdssf",
                    "id": "fdssf"
                  },
                  {
                    "N401": "sdffssf",
                    "id": "sfds",
                    "N402": "sfdsf",
                    "N403": "23424"
                  },
                  {
                    "PER06": "Wsfsfdsfsf",
                    "PER05": "sfsf",
                    "PER04": "23424",
                    "PER03": "fdfbvcb",
                    "PER02": "Pedsdsf",
                    "PER01": "sfsfsf",
                    "id": "fdsdf"
                  }
                ]
              },
              {
                "id": "2342",
                "segments": [
                  {
                    "N101": "sdfsfds",
                    "N102": "vcbvcb",
                    "N103": "dsfsdfs",
                    "N104": "343443",
                    "id": "fdgfdg"
                  },
                  {
                    "N401": "dfsgdfg",
                    "id": "dfgdgdf",
                    "N402": "dgdgdg",
                    "N403": "234244"
                  },
                  {
                    "REF02": "23423342",
                    "REF01": "fsdfs",
                    "id": "sfdsfds"
                  }
                ]
              }
            ]
          }
        ]
      }
    ]
  }
]

与更深的键值对应的列标题名称采用嵌套形式,如functionalGroups[0].transactions[0].segments[0].BPR15.

我可以在java中使用这个github项目(在这里你可以在解释中找到我想要的输出格式)在一行中做到这一点:

flatJson = JSONFlattener.parseJson(new File("files/simple.json"), "UTF-8");

输出是:

date,securityQualifier,testIndicator,functionalGroups[1].functionalIdentifierCode,functionalGroups[1].date,functionalGroups[1].applicationReceiverCode, ...
140101,00,T,HP,20140101,ETIN,...

但我想在 python 中做到这一点。我按照此答案中的建议进行了尝试:

with open('data.json') as data_file:
    data = json.load(data_file)
df = json_normalize(data, record_prefix=True)

with open('temp2.csv', "w", newline='\n') as csv_file:
    csv_file.write(df.to_csv())

但是,对于 column functionalGroups,它将 json 转储为单元格值。

我也按照此答案中的建议进行了尝试:

with open('data.json') as f:  # this ensures opening and closing file
    a = json.loads(f.read())

df = pandas.DataFrame(a)

print(df.transpose())

但这似乎也做同样的事情:

                                                                          0
acknowledgmentRequested                                                   0
authorizationInformation                                                   
authorizationQualifier                                                  SDA
date                                                                 140101
functionalGroups          [{'functionalIdentifierCode': 'ADSAD', 'applic...
interchangeControlNumber                                          123456789
receiverId                                                  ADAD           
receiverQualifier                                                     ADSAS
securityInformation                                                        
securityQualifier                                                       ASD
senderId                                                         FADA      
senderQualifier                                                       ASDAD
standardsId                                                            None
testIndicator                                                             T
time                                                                   0730
version                                                               00501

是否可以在 python 中做我想做的事情?

标签: pythonpandas

解决方案


推荐阅读