python - 如何用python解析复杂的json?
问题描述
我正在尝试解析这个 json 文件,但遇到了麻烦。json 看起来像这样:
<ListObject list at 0x2161945a860> JSON: {
"data": [
{
"amount": 100,
"available_on": 1621382400,
"created": 1621264875,
"currency": "usd",
"description": "0123456",
"exchange_rate": null,
"fee": 266,
"fee_details": [
{
"amount": 266,
"application": null,
"currency": "usd",
"description": "processing fees",
"type": "fee"
}
],
"id": "txn_abvgd1234",
"net": 9999,
"object": "balance_transaction",
"reporting_category": "charge",
"source": "cust1",
"sourced_transfers": {
"data": [],
"has_more": false,
"object": "list",
"total_count": 0,
"url": "/v1/source"
},
"status": "pending",
"type": "charge"
},
{
"amount": 25984,
"available_on": 1621382400,
"created": 1621264866,
"currency": "usd",
"description": "0326489",
"exchange_rate": null,
"fee": 93,
"fee_details": [
{
"amount": 93,
"application": null,
"currency": "usd",
"description": "processing fees",
"type": "fee"
}
],
"id": "txn_65987jihgf4984oihydgrd",
"net": 9874,
"object": "balance_transaction",
"reporting_category": "charge",
"source": "cust2",
"sourced_transfers": {
"data": [],
"has_more": false,
"object": "list",
"total_count": 0,
"url": "/v1/source"
},
"status": "pending",
"type": "charge"
},
],
"has_more": true,
"object": "list",
"url": "/v1/balance_"
}
我正在尝试使用此脚本在 python 中解析它:
import pandas as pd
df = pd.json_normalize(json)
df.head()
但我得到的是:
我需要的是在自己的列中解析每个数据点。所以我将有 2 行数据,每个数据点都有列。像这样的东西:
我现在该怎么做?
解决方案
除了一个字段之外,所有字段都是 JSON 的直接副本,因此您只需列出可以复制的字段,然后对 fee_details 进行额外处理。
import json
import pandas as pd
inp = """{
"data": [
{
"amount": 100,
"available_on": 1621382400,
"created": 1621264875,
"currency": "usd",
"description": "0123456",
"exchange_rate": null,
"fee": 266,
"fee_details": [
{
"amount": 266,
"application": null,
"currency": "usd",
"description": "processing fees",
"type": "fee"
}
],
"id": "txn_abvgd1234",
"net": 9999,
"object": "balance_transaction",
"reporting_category": "charge",
"source": "cust1",
"sourced_transfers": {
"data": [],
"has_more": false,
"object": "list",
"total_count": 0,
"url": "/v1/source"
},
"status": "pending",
"type": "charge"
},
{
"amount": 25984,
"available_on": 1621382400,
"created": 1621264866,
"currency": "usd",
"description": "0326489",
"exchange_rate": null,
"fee": 93,
"fee_details": [
{
"amount": 93,
"application": null,
"currency": "usd",
"description": "processing fees",
"type": "fee"
}
],
"id": "txn_65987jihgf4984oihydgrd",
"net": 9874,
"object": "balance_transaction",
"reporting_category": "charge",
"source": "cust2",
"sourced_transfers": {
"data": [],
"has_more": false,
"object": "list",
"total_count": 0,
"url": "/v1/source"
},
"status": "pending",
"type": "charge"
}
],
"has_more": true,
"object": "list",
"url": "/v1/balance_"
}"""
copies = [
'id',
'net',
'object',
'reporting_category',
'source',
'amount',
'available_on',
'created',
'currency',
'description',
'exchange_rate',
'fee'
]
data = json.loads(inp)
rows = []
for inrow in data['data']:
outrow = {}
for copy in copies:
outrow[copy] = inrow[copy]
outrow['fee_details'] = inrow['fee_details'][0]['description']
rows.append(outrow)
df = pd.DataFrame(rows)
print(df)
输出:
timr@tims-gram:~/src$ python x.py
id net object reporting_category source amount ... created currency description exchange_rate fee fee_details
0 txn_abvgd1234 9999 balance_transaction charge cust1 100 ... 1621264875 usd 0123456 None 266 processing fees
1 txn_65987jihgf4984oihydgrd 9874 balance_transaction charge cust2 25984 ... 1621264866 usd 0326489 None 93 processing fees
[2 rows x 13 columns]
timr@tims-gram:~/src$
推荐阅读
- angular - 离子加载控制器关闭不起作用
- microservices - 为什么不为 DTO 创建一个 git 子模块,它将在多个微服务之间共享以避免代码重复?
- android - “import android.view.View”中的视图和视图是什么
- python - 每当我在管理员中添加新用户时,如何在 django 管理页面中添加 Id_card_number 输入字段?
- mongodb - 如何在 MongoDB 中存储整本书文本(用阿拉伯语/乌尔都语编写),这样我以后可以从中搜索文本?
- docker - 如何将参数传递给 docker ENTRYPOINT 中的 bash 文件?
- lisp - ecl_init_module 中的第一个参数有什么作用?
- vba - 在 PowerPoint 中修改树形图
- sql - T-sql 选择xml值
- javascript - 假点击到光滑轮播中的下一张幻灯片