首页 > 解决方案 > 展平深度嵌套的 JSON 垂直转换为 pandas

问题描述

嗨,我正在尝试展平 JSON 文件,但无法。我的 JSON 有 3 个缩进重复示例,如下所示

floors": [
        {
            "uuid": "8474",
            "name": "some value",
            "areas": [
                {
                    "uuid": "xyz",
                    "**name**": "qwe",
                    "roomType": "Name1",
                    "templateUuid": "sdklfj",
                    "templateName": "asdf",
                    "templateVersion": "2.7.1",
                    "Required1": [
                        {
                            "**uuid**": "asdf",
                            "description": "asdf3",
                            "categoryName": "asdf",
                            "familyName": "asdf",
                            "productName": "asdf3",
                            "Required2": [
                                {
                                    "**deviceId**": "asdf",
                                    "**deviceUuid**": "asdf-asdf"
                                }
                            ]
                        }

我想要嵌套Required1中的对应值和Required1对应的required 2中的对应值。(在**中突出显示)我已经尝试如下JSON规范化但失败和其他免费库:

尝试:

from pprint import pprint
with open('Filename.json') as data_file:
    data_item = json.load(data_file)
Raw_Areas=json_normalize(data_item['floors'],'areas',errors='ignore',record_prefix='Area_')

不显示面积值。仅需要 1 需要 2 仍然嵌套

K=json_normalize(data_item['floors'][0],record_path=['Required1','Required2'],errors='ignore',record_prefix='Try_')

from flatten_json import flatten_json
Flat_J1= pd.DataFrame([flatten_json(data_item)]) 

希望获得如下值:预期列:floors.areas.Required1.Required2.deviceUuid floor.areas.name(并排)

请帮助我在尝试中遗漏了什么。我对 JSON 负载相当陌生。

标签: pythonpandas

解决方案


假设以下 JSON(正如多人指出的那样,它是不完整的)。所以我根据你的支架开口完成了它。

dct = {"floors": [
        {
            "uuid": "8474",
            "name": "some value",
            "areas": [
                {
                    "uuid": "xyz",
                    "name": "qwe",
                    "roomType": "Name1",
                    "templateUuid": "sdklfj",
                    "templateName": "asdf",
                    "templateVersion": "2.7.1",
                    "Required1": [
                        {
                            "uuid": "asdf",
                            "description": "asdf3",
                            "categoryName": "asdf",
                            "familyName": "asdf",
                            "productName": "asdf3",
                            "Required2": [
                                {
                                    "deviceId": "asdf",
                                    "deviceUuid": "asdf-asdf"
                                }
                            ]
                        }
                    ]
                }
            ]
        }
]}

您可以执行以下操作(需要 pandas 0.25.0)

df = pd.io.json.json_normalize(
    dct, record_path=['floors','areas', 'Required1'],meta=[['floors', 'areas', 'name']])
df = df.explode('Required2')
df = pd.concat([df, df["Required2"].apply(pd.Series)], axis=1)
df = df[['floors.areas.name', 'uuid', 'deviceId', 'deviceUuid']]

这使,

>>>     floors.areas.name   uuid    deviceId    deviceUuid
>>> 0   qwe asdf    asdf    asdf-asdf

推荐阅读