首页 > 解决方案 > json_normalize 不读取所有数据

问题描述

我有一个json文件,我想将其展平并将所有信息检索到 pandas 数据框中。该json文件如下所示:

jsonstr = {
  "calculation": {
    "id": "3k3k3k3kwk3kwk",
    "Id": 23,
    "submissionDate": 1622428064679,
    "serverVersion": "3.3.5.6.r",
    "tag": [
      {
        "code": "qq4059331155113278",
        "manual": {
          "location": {
            "x": 26.5717,
            "y": 59.4313,
            "z": 0.0,
            "floor": 0
          },
          "timestamp": 1599486138000
        },
        "device": null,
        "measurements": [
          {
            "Address": "D_333",
            "subcell": "",
            "frequency": 14.0,
            "dfId": 0
          },
          {
            "trxAddress": "D_334",
            "subcell": "",
            "frequency": 11.0,
            "dfId": 0
          }]
    }]
}
}

现在,像往常一样,我执行以下操作。我认为这将返回所有“字段”,包括id,Id等等submissionDate

import os, json
import pandas as pd
import numpy as np
import glob
pd.set_option('display.max_columns', None)

file = './Testjson.json'
#file = './jsondumps/ff80818178f93bd90179ab51781e1c95.json'
with open(file) as json_string:
    jsonstr = json.load(json_string)

labels = pd.json_normalize(jsonstr, record_path=['calculation','tag'])

但实际上,它返回:

   code device  \
0  qq4059331155113278   None   

                                        measurements  manual.location.x  \
0  [{'Address': 'D_333', 'subcell': '', 'frequenc...            26.5717   

   manual.location.y  manual.location.z  manual.location.floor  \
0            59.4313                0.0                      0   

   manual.timestamp  
0     1599486138000

并尝试以下

labels = pd.json_normalize(jsonstr, record_path=['calculation','tag'], meta=['id', 'Id'])

返回错误:

KeyError: 'id'

这是有道理的。但是我一开始做错了什么?为什么我不能得到所有字段,calculation因为它们在路径中?

非常适合任何见解!

标签: jsonpandas

解决方案


您的语法与元参数略有不同。id并且Id位于数据框的末尾。

如果您希望展平整个 json,请查看flatten_json. 这是一个与嵌套 json 一起使用的非常好的库。

pd.json_normalize(jsonstr, record_path=['calculation','tag'],  meta=[['calculation','id'],['calculation','Id']])

                 code device                                       measurements  manual.location.x  manual.location.y  manual.location.z  manual.location.floor  manual.timestamp  calculation.id calculation.Id
0  qq4059331155113278   null  [{'Address': 'D_333', 'subcell': '', 'frequenc...            26.5717            59.4313                0.0                      0     1599486138000  3k3k3k3kwk3kwk             23

推荐阅读