Pandas error when reading a key whose value is also an object

Question

I have a simple JSON that looks like this:

{"dV":201,"data1":{"test":"ok","data2":[{"id":1,"summary":{"openingBalance":"-7583.48","totalCredits":"1203.52"},"additionalDetails":{"email":"XXXXXXXX@outlook.com","phone":"XXXX XXX 333"}}]}}

I normalize this JSON by doing the following:

from io import StringIO
import pandas as pd
textInJSON = '{"dV":201,"data1":{"test":"ok","data2":[{"id":1,"summary":{"openingBalance":"-7583.48","totalCredits":"1203.52"},"additionalDetails":{"email":"XXXXXXXX@outlook.com","phone":"XXXX XXX 333"}}]}}'
# Recent pandas versions warn when read_json is given a literal JSON string, so wrap it in StringIO
d = pd.read_json(StringIO(textInJSON))
df = pd.json_normalize(d['data1']['data2'])

Why do I get an error when I do something like this?

df['additionalDetails']

But when I run a line like the following, I can retrieve the information (XXXXXXXX@outlook.com):

df['additionalDetails.email']

I am asking because I thought I should be able to do:

df['additionalDetails']['email']

Tags: python, pandas

Answer


If you print the data frame

print(df)

you will see that what you actually get is this:

   id summary.openingBalance summary.totalCredits additionalDetails.email additionalDetails.phone
0   1               -7583.48              1203.52    XXXXXXXX@outlook.com            XXXX XXX 333

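This tells me that summary (and likewise additionalDetails) is actually a dictionary.

So when you call pd.json_normalize(d['data1']['data2']), pandas has to turn that nesting into a flat data frame somehow. It does this by creating one column per nested key and joining the parent and child names with a dot, which means there is no column called additionalDetails at all, only additionalDetails.email and additionalDetails.phone. Asking for df['additionalDetails'] therefore raises a KeyError.

That is why the values have to be accessed the way you showed, with the full dotted column name.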
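To make the flattening concrete, here is a minimal sketch that rebuilds the data frame and inspects its column names. It parses the string with json.loads instead of pd.read_json, which is a substitution on my part rather than what the question does, and the names text_in_json, records, and email are only illustrative.

```python
import json

import pandas as pd

text_in_json = (
    '{"dV":201,"data1":{"test":"ok","data2":[{"id":1,'
    '"summary":{"openingBalance":"-7583.48","totalCredits":"1203.52"},'
    '"additionalDetails":{"email":"XXXXXXXX@outlook.com","phone":"XXXX XXX 333"}}]}}'
)

# Parse the JSON into plain Python objects and pick out the list of records.
records = json.loads(text_in_json)["data1"]["data2"]

# json_normalize flattens nested dicts into dotted column names.
df = pd.json_normalize(records)
print(list(df.columns))

# The parent key itself is not a column, which is why df['additionalDetails'] fails.
has_parent_column = "additionalDetails" in df.columns
print(has_parent_column)

# The flattened column is reachable under its full dotted name.
email = df["additionalDetails.email"].iloc[0]
print(email)
```

If the dot is inconvenient, json_normalize also accepts a sep argument to join the names with a different separator.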

Here is a more readable layout of your data:

textInJSON = '''{
    "dV": 201,
    "data1": {
        "test": "ok",
        "data2": [{
            "id": 1,
            "summary": {
                "openingBalance": "-7583.48",
                "totalCredits": "1203.52"
            },
            "additionalDetails": {
                "email": "XXXXXXXX@outlook.com",
                "phone": "XXXX XXX 333"
            }
        }]
    }
}'''
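Given that nested shape, two ways to get closer to the df['additionalDetails']['email'] style of access are sketched below. This goes beyond what the answer itself proposes, so treat it as one possible approach. The variable names are illustrative, and json.loads is used for parsing as an assumption of convenience.

```python
import json

import pandas as pd

text_in_json = (
    '{"dV":201,"data1":{"test":"ok","data2":[{"id":1,'
    '"summary":{"openingBalance":"-7583.48","totalCredits":"1203.52"},'
    '"additionalDetails":{"email":"XXXXXXXX@outlook.com","phone":"XXXX XXX 333"}}]}}'
)
records = json.loads(text_in_json)["data1"]["data2"]

# Option 1: max_level=0 stops the flattening, so nested objects stay as dicts
# and 'additionalDetails' exists as a column holding one dict per row.
df_nested = pd.json_normalize(records, max_level=0)
details_dict = df_nested["additionalDetails"].iloc[0]
email_from_dict = details_dict["email"]

# Option 2: flatten fully, then select the 'additionalDetails.*' columns
# and strip the prefix to get a small frame with plain column names.
df_flat = pd.json_normalize(records)
details = df_flat.filter(like="additionalDetails.").rename(
    columns=lambda name: name.split(".", 1)[1]
)
email_from_frame = details["email"].iloc[0]

print(email_from_dict, email_from_frame)
```

Note that with option 1, df_nested['additionalDetails'] is a Series of dicts, so a row has to be selected (here with .iloc[0]) before indexing by 'email'.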
