python - 使用python从JSON文件中的嵌套列表中获取元素
问题描述
我正在阅读一个 JSON 文件并尝试从每个文件中提取以下信息log
并将其存储在 Excel 表中。
["name","TOM DOE"]
["value","132"]
我的 python 代码能够从 JSON 中提取一些其他必需的信息。但是,由于 JSON 包含多个嵌套列表和字典,我无法提取两个信息并存储它。请任何人都可以帮助我使用python代码。
这是我的python代码:
import json
import pandas
class ConvertToExcel:
def Process(self):
with open('C:/Users/Desktop/SampleTestFiles/new17.json') as json_file:
dataarray = json.load(json_file)
data1 = []
logcount = 0
for data in dataarray:
logcount = logcount + 1
for i in range(len(data['log'])):
code = data['log'][i]['code']
message = data['log'][i]['message']
try:
cid = data['log'][i]['message']['cid']
except:
cid = 0
text = data['log'][i]['message']['text']
refs = data['log'][i]['refs']
for k in range(len(data['log'][i]['refs'])):
try:
hrefs = data['log'][i]['refs'][k]['href']
except:
hrefs =''
try:
hrefsFacts = data['log'][i]['refs'][k]['href']
except:
hrefsFacts =''
# print(type(data['log'][i]['refs'][k]['href']['properties']))
# res1 = 'name' in chain(*data['log'][i]['refs'][k]['href']['properties'])
# for elem in len(data['log'][i]['refs'][k]['href']['properties']):
# for item in elem:
# if(item == 'val'):
# hrefspropertiesvalue = item
# else:
# pass
hrefspropertiesvalue = 'a'
level = data['log'][i]['level']
data1.append((logcount,i, code, message,cid, text, refs,hrefs,hrefsFacts, hrefspropertiesvalue, level))
pandas.DataFrame(data1, columns=['Log', 'Innerlog', 'code', 'message','cid','text','refs','hrefs','hrefsFacts','hrefspropertiesvalue', 'level']).to_excel("output.xlsx")
A = ConvertToExcel()
A.Process()
并且,JSON:
[{ "log": [
{
"code": "nikv.F1.all.1",
"message": {
"cid": "61785360",
"filing_url": "C:\\Users\\farizaleta\\Desktop\\test-428-2016Q4F1.abcd",
"severity": "error",
"text": "[nikv.F1.all.1] The values of 6,075,786 for the elements nikv:OtherChargestested is duplicated in the filing 2 times.\n\nElement : nikv:OtherChargestested\nPeriod : 2016-01-01 to 2016-12-31\n\n\nRule Id:nikv.F1.all.1 - test-428-2016Q4F1.abcd 4122"
},
"refs": [
{
"href": "test-428-2016Q4F1.abcd#f-743",
"sourceLine": 4122,
"properties": [
[
"label",
"Other charges, tested"
],
[
"namespace",
"http://nikv.com/form/2002-01-01/nikv"
],
[
"name",
"TOM DOE"
],
[
"QName",
"nikv:OtherChargestested"
],
[
"contextRef",
"c-01",
[
[
"entity",
"C002089",
[
[
"scheme",
"http://tested.com/entity/identification/scheme"
]
]
],
[
"startDate",
"2016-01-01"
],
[
"endDate",
"2016-12-31"
],
[
"dimensions",
"(1)",
[
[
"nikv:OfficerAxis",
"<nikv:OfficerDomain>0-1</nikv:OfficerDomain>\n\t\t\t\t\n"
]
]
]
]
],
[
"unitRef",
"u-02",
[
[
"measure",
"iso4217"
]
]
],
[
"decimals",
"INF"
],
[
"precision",
"None"
],
[
"xsi:nil",
"false"
],
[
"value",
"132"
]
],
"objectId": "91269"
}
],
"level": "error"
}]
}]
解决方案
您可以像这样遍历属性:
for type, val, *_ in data['log'][i]['refs'][k]['href']['properties']:
*_
忽略子列表中的任何额外元素properties
。
然后您可以测试是否type
是name
或value
并设置适当的变量。
def Process(self):
with open('C:/Users/Desktop/SampleTestFiles/new17.json') as json_file:
dataarray = json.load(json_file)
data1 = []
logcount = 0
for data in dataarray:
logcount = logcount + 1
for x in data['log']:
code = x['code']
message = x['message']
try:
cid = x['message']['cid']
except:
cid = 0
text = x['message']['text']
refs = x['refs']
name = ''
value = ''
for ref in x['refs']:
try:
hrefs = ref['href']
except:
hrefs =''
try:
hrefsFacts = ref['href']
except:
hrefsFacts =''
for type, val, *_ in ref['properties']:
if type = 'name':
name = val
elif type = 'value':
value = val
hrefspropertiesvalue = 'a'
level = x['level']
data1.append((logcount,i, code, message,cid, text, refs,hrefs,hrefsFacts, hrefspropertiesvalue, level, name, value))
pandas.DataFrame(data1, columns=['Log', 'Innerlog', 'code', 'message','cid','text','refs','hrefs','hrefsFacts','hrefspropertiesvalue', 'level', 'name', 'value']).to_excel("output.xlsx")
我还简化了所有要使用的循环,for <variable> in <list>
而不是for <indexvariable> in range(len(<list>))
.
推荐阅读
- c++ - STL C++ 中的映射给出了不特定的行为
- c++ - 每当 fs::path 被破坏时出现分段错误
- excel - 启动时多列列表框填充 - 运行时错误
- php - 如何将文件移动到 cpanel 文件夹中
- java - 如何仅使用命令行启用 slf4j 日志记录(无属性文件)
- terraform - 创建 SSL crt 并映射到 ELB | 地形
- excel - 根据另一个文件中的工作表名称激活工作表
- jquery - 尝试移动表单时使用 jQuery.detach() 进行不需要的克隆行为
- ios - 在隐藏的某些视图上缩小堆栈视图的高度
- jquery - 确定引导列内容是在左侧还是右侧(对于 Jquery 可排序)