Getting 0 records when parsing a JSON file if a key attribute is missing

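Problem description

I have a few static key columns, EmployeeId and type, plus a few columns that come from the first for loop.

In the second for loop, if a specific key is present, its values should be appended to the existing DataFrame columns; otherwise the columns obtained from the first for loop should stay unchanged.

Output of the first for loop: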

"EmployeeId","type","KeyColumn","Start","End","Country","Target","CountryId","TargetId"
"Emp1","Metal","1212121212","2000-06-17","9999-12-31","","","",""

After the second for loop I get the following output:

"EmployeeId","type","KeyColumn","Start","End","Country","Target","CountryId","TargetId"
"Emp1","Metal","1212121212","2000-06-17","9999-12-31","","AMAZON","1",""
"Emp1","Metal","1212121212","2000-06-17","9999-12-31","","FLIPKART","2",""

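With this code, when the Employee tag is present I get two or more records. However, some JSON files may have no Employee tag. In that case the output should match the output of the first loop, with all key fields populated and the remaining columns empty.

With my code, though, I get 0 records. If I have coded this the wrong way, please help me.

I apologise if the question is unclear; I am new to Python. Please find the sample data at the link below.

Please find the code below: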

    for i in range(len(json_file['enty'])):
        temp = {}
        temp['EmployeeId'] = json_file['enty'][i]['id']
        temp['type'] = json_file['enty'][i]['type']
        for key in json_file['enty'][i]['data']['attributes'].keys():        
            try:
                temp[key] = json_file['enty'][i]['data']['attributes'][key]['values'][0]['value']
            except:
                temp[key] = None      

        for key in json_file['enty'][i]['data']['attributes'].keys(): 
            if(key == 'Employee'):
                for j in range(len(json_file['enty'][i]['data']['attributes']['Employee']['group'])):
                    for key in json_file['enty'][i]['data']['attributes']['Employee']['group'][j].keys():
                        try:
                            temp[key] = json_file['enty'][i]['data']['attributes']['Employee']['group'][j][key]['values'][0]['value']
                        except:
                            temp[key] = None

                    temp_df = pd.DataFrame([temp])
                    df = pd.concat([df, temp_df], sort=True)

    # Rearranging columns
    df = df[['EmployeeId', 'type'] + [col for col in df.columns if col not in ['EmployeeId', 'type']]]

    # Writing the dataset
    df[columns_list].to_csv("Test22.csv", index=False, quotechar='"', quoting=1)

If the Employee tag is not available I get 0 records as output, but I expect 1 record, the same as after the first for loop.


Tags: python, json, pandas, csv, dataframe

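Solution

The JSON structure is fairly complex, so I tried to simplify the data collected from it. The result is a flat list of dictionaries. The code handles the case where "Employee" is not found.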

import copy

d = {
    "enty": [
        {
            "id": "Emp1",
            "type": "Metal",
            "data": {
                "attributes": {
                    "KeyColumn": {
                        "values": [
                            {
                                "value": 1212121212
                            }
                        ]
                    },
                    "End": {
                        "values": [
                            {
                                "value": "2050-12-31"
                            }
                        ]
                    },
                    "Start": {
                        "values": [
                            {
                                "value": "2000-06-17"
                            }
                        ]
                    },
                    "Employee": {
                        "group": [
                            {
                                "Target": {
                                    "values": [
                                        {
                                            "value": "AMAZON"
                                        }
                                    ]
                                },
                                "CountryId": {
                                    "values": [
                                        {
                                            "value": "1"
                                        }
                                    ]
                                }
                            },
                            {
                                "Target": {
                                    "values": [
                                        {
                                            "value": "FLIPKART"
                                        }
                                    ]
                                },
                                "CountryId": {
                                    "values": [
                                        {
                                            "value": "2"
                                        }
                                    ]
                                }
                            }
                        ]
                    }
                }
            }
        }
    ]
}
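# Build one flat dict per output row.
emps = []
for e in d['enty']:
    # Fields shared by every row for this entity.
    entry = {'id': e['id'], 'type': e['type']}
    for x in ["KeyColumn", "Start", "End"]:
        entry[x] = e['data']['attributes'][x]['values'][0]['value']
    if e['data']['attributes'].get('Employee'):
        # One row per Employee group, each starting from a copy of the shared fields.
        for grp in e['data']['attributes']['Employee']['group']:
            clone = copy.deepcopy(entry)
            for x in ['Target', 'CountryId']:
                clone[x] = grp[x]['values'][0]['value']
            emps.append(clone)
    else:
        # No "Employee" attribute: keep the single base row (lists have append, not add).
        emps.append(entry)
# TODO write to csv
for emp in emps:
    print(emp)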

Output

{'End': '2050-12-31', 'Target': 'AMAZON', 'KeyColumn': 1212121212, 'Start': '2000-06-17', 'CountryId': '1', 'type': 'Metal', 'id': 'Emp1'}
{'End': '2050-12-31', 'Target': 'FLIPKART', 'KeyColumn': 1212121212, 'Start': '2000-06-17', 'CountryId': '2', 'type': 'Metal', 'id': 'Emp1'}
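
The output above only covers the case where "Employee" is present, and the CSV step is left as a TODO. As a rough sketch of how the missing-"Employee" case and the CSV step might fit together, the snippet below uses only the standard library. The names `COLUMNS`, `first_value`, `flatten` and `sample` are made up for illustration, the column order is copied from the CSV header in the question, and the input is a hypothetical entity without an "Employee" attribute. It writes to an in-memory buffer rather than to Test22.csv.

```python
import csv
import io

# Column order taken from the CSV header shown in the question.
COLUMNS = ["EmployeeId", "type", "KeyColumn", "Start", "End",
           "Country", "Target", "CountryId", "TargetId"]


def first_value(attr):
    """Return attr['values'][0]['value'], or None if that path is missing."""
    try:
        return attr["values"][0]["value"]
    except (KeyError, IndexError, TypeError):
        return None


def flatten(doc):
    """Yield one flat dict per Employee group, or one per entity if there is none."""
    for ent in doc["enty"]:
        attrs = ent["data"]["attributes"]
        base = {"EmployeeId": ent["id"], "type": ent["type"]}
        for name, attr in attrs.items():
            if name != "Employee":
                base[name] = first_value(attr)
        groups = attrs.get("Employee", {}).get("group", [])
        if not groups:
            yield base          # no Employee tag: keep the base row
            continue
        for grp in groups:
            row = dict(base)    # shallow copy is enough, the values are scalars
            for name, attr in grp.items():
                row[name] = first_value(attr)
            yield row


# Hypothetical input without an "Employee" attribute.
sample = {"enty": [{"id": "Emp1", "type": "Metal", "data": {"attributes": {
    "KeyColumn": {"values": [{"value": 1212121212}]},
    "Start": {"values": [{"value": "2000-06-17"}]},
    "End": {"values": [{"value": "9999-12-31"}]},
}}}]}

rows = list(flatten(sample))

# restval="" fills columns a row does not have; QUOTE_ALL quotes every field.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=COLUMNS, restval="",
                        extrasaction="ignore", quoting=csv.QUOTE_ALL)
writer.writeheader()
writer.writerows(rows)
csv_text = buf.getvalue()
print(csv_text)
```

With this input, `rows` holds a single record and the Employee-related columns are written as empty quoted fields, which is the behaviour the question asks for. To write a real file, `open("Test22.csv", "w", newline="")` could replace the `io.StringIO()` buffer.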
