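python - Getting 0 records when parsing a JSON file if a key attribute is missing
Problem description
I have a few static key columns, EmployeeId and type, plus several columns produced by the first for loop.
In the second for loop, values should be added to the existing DataFrame columns only when a specific key is present; otherwise, the columns obtained from the first for loop should stay unchanged.
Output of the first for loop: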
"EmployeeId","type","KeyColumn","Start","End","Country","Target","CountryId","TargetId"
"Emp1","Metal","1212121212","2000-06-17","9999-12-31","","","",""
After the second for loop, I get the following output:
"EmployeeId","type","KeyColumn","Start","End","Country","Target","CountryId","TargetId"
"Emp1","Metal","1212121212","2000-06-17","9999-12-31","","AMAZON","1",""
"Emp1","Metal","1212121212","2000-06-17","9999-12-31","","FLIPKART","2",""
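With this code, when the Employee tag is present I get 2 or more records. Some JSON files may not have an Employee tag, though; in that case the output should match the first loop's output, with all key fields populated and the remaining columns empty.
With my code, however, I get 0 records. If I have coded this the wrong way, please help.
Apologies if the question is unclear; I am new to Python. Please find the sample data at the link below.
Here is the code: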
for i in range(len(json_file['enty'])):
    temp = {}
    temp['EmployeeId'] = json_file['enty'][i]['id']
    temp['type'] = json_file['enty'][i]['type']
    for key in json_file['enty'][i]['data']['attributes'].keys():
        try:
            temp[key] = json_file['enty'][i]['data']['attributes'][key]['values'][0]['value']
        except:
            temp[key] = None
    for key in json_file['enty'][i]['data']['attributes'].keys():
        if(key == 'Employee'):
            for j in range(len(json_file['enty'][i]['data']['attributes']['Employee']['group'])):
                for key in json_file['enty'][i]['data']['attributes']['Employee']['group'][j].keys():
                    try:
                        temp[key] = json_file['enty'][i]['data']['attributes']['Employee']['group'][j][key]['values'][0]['value']
                    except:
                        temp[key] = None
                temp_df = pd.DataFrame([temp])
                df = pd.concat([df, temp_df], sort=True)
# Rearranging columns
df = df[['EmployeeId', 'type'] + [col for col in df.columns if col not in ['EmployeeId', 'type']]]
# Writing the dataset
df[columns_list].to_csv("Test22.csv", index=False, quotechar='"', quoting=1)
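If the Employee tag is missing, I get 0 records as output, but I expect 1 record, the same as the first for loop produces.
Solution
The JSON structure is fairly complex, so I tried to simplify how the data is collected from it. The result is a list of flat dictionaries. The code handles the case where "Employee" is not found.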
import copy
d = {
  "enty": [
    {
      "id": "Emp1",
      "type": "Metal",
      "data": {
        "attributes": {
          "KeyColumn": {
            "values": [
              {
                "value": 1212121212
              }
            ]
          },
          "End": {
            "values": [
              {
                "value": "2050-12-31"
              }
            ]
          },
          "Start": {
            "values": [
              {
                "value": "2000-06-17"
              }
            ]
          },
          "Employee": {
            "group": [
              {
                "Target": {
                  "values": [
                    {
                      "value": "AMAZON"
                    }
                  ]
                },
                "CountryId": {
                  "values": [
                    {
                      "value": "1"
                    }
                  ]
                }
              },
              {
                "Target": {
                  "values": [
                    {
                      "value": "FLIPKART"
                    }
                  ]
                },
                "CountryId": {
                  "values": [
                    {
                      "value": "2"
                    }
                  ]
                }
              }
            ]
          }
        }
      }
    }
  ]
}
# Flatten each entity into one dict per "Employee" group
emps = []
for e in d['enty']:
    entry = {'id': e['id'], 'type': e['type']}
    for x in ["KeyColumn", "Start", "End"]:
        entry[x] = e['data']['attributes'][x]['values'][0]['value']
    if e['data']['attributes'].get('Employee'):
        for grp in e['data']['attributes']['Employee']['group']:
            clone = copy.deepcopy(entry)
            for x in ['Target', 'CountryId']:
                clone[x] = grp[x]['values'][0]['value']
            emps.append(clone)
    else:
        # No "Employee" attribute: keep the single base record
        emps.append(entry)
# TODO write to csv
for emp in emps:
    print(emp)
Output
{'End': '2050-12-31', 'Target': 'AMAZON', 'KeyColumn': 1212121212, 'Start': '2000-06-17', 'CountryId': '1', 'type': 'Metal', 'id': 'Emp1'}
{'End': '2050-12-31', 'Target': 'FLIPKART', 'KeyColumn': 1212121212, 'Start': '2000-06-17', 'CountryId': '2', 'type': 'Metal', 'id': 'Emp1'}
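The "TODO write to csv" step could be filled in roughly as follows. This is only a sketch under some assumptions: the column order is copied from the expected CSV in the question, `id` is renamed to `EmployeeId` to match it, and the helper names `first_value`, `flatten` and `to_csv_text` are made up for illustration. It uses the standard `csv` module rather than pandas, and the sample entity deliberately has no "Employee" attribute to show that one row still comes out.

```python
import csv
import io

# Column order assumed from the expected CSV shown in the question.
COLUMNS = ["EmployeeId", "type", "KeyColumn", "Start", "End",
           "Country", "Target", "CountryId", "TargetId"]


def first_value(attr):
    """Return attr['values'][0]['value'], or None if the shape does not match."""
    try:
        return attr["values"][0]["value"]
    except (KeyError, IndexError, TypeError):
        return None


def flatten(data):
    """Yield one flat row per Employee group, or a single row if there is none."""
    for e in data["enty"]:
        attrs = e["data"]["attributes"]
        base = {"EmployeeId": e["id"], "type": e["type"]}
        for name, attr in attrs.items():
            if name != "Employee":
                base[name] = first_value(attr)
        groups = (attrs.get("Employee") or {}).get("group") or []
        if not groups:
            # No Employee tag: still emit the base record.
            yield base
        for grp in groups:
            row = dict(base)
            row.update({k: first_value(v) for k, v in grp.items()})
            yield row


def to_csv_text(rows):
    """Render rows as fully quoted CSV; columns missing from a row become empty."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, restval="",
                            extrasaction="ignore", quoting=csv.QUOTE_ALL)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical entity without an "Employee" attribute.
sample = {"enty": [{"id": "Emp2", "type": "Metal", "data": {"attributes": {
    "KeyColumn": {"values": [{"value": 1212121212}]},
    "Start": {"values": [{"value": "2000-06-17"}]},
    "End": {"values": [{"value": "9999-12-31"}]},
}}}]}

print(to_csv_text(flatten(sample)))
```

To write a file instead of a string, the same `DictWriter` can be pointed at `open("Test22.csv", "w", newline="")`. Setting `restval=""` is what keeps columns such as Country and TargetId empty instead of raising an error when a row lacks them.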