python - Flattening a nested JSON file into a pandas DataFrame
Problem description
I have this JSON file:
{
"OrderMaster": {
"Order": {
"item": [{
"row_id": "1-2LDPVI0",
"sequence_id": "3851101",
"end_date": "",
"name": "TV-Discount",
"orderable": "Y",
"period": "",
"period_uom": "",
"phone_number_flag": "N",
"price_type": "Recurring",
"product_category": "mobilepackage",
"product_sub_category": "Discount",
"product_type_code": "Product",
"type": "PhoneOrder",
"vendor_part_number": "",
"created_date": "2018-02-16 09:09:24",
"created_by": "id123",
"last_updated_date": "2020-09-14 09:39:24",
"last_updated_by": "id123",
"ts_event_notification_time": "2020-09-14 09:40:69",
"OrderItems": {
"item": [{
"original_list_price": "0",
"order_list_id": "1-4ABU",
"order_list_name": "SEK Pricelist",
"product_id": "1-2LDPUKX",
"start_date": "2018-02-17 00:00:00"
},
{
"original_list_price": "45",
"order_list_id": "1-4AFU",
"order_list_name": "SEK Pricelist",
"product_id": "1-2LGSDFUKX",
"start_date": "2018-02-18 00:04:20"
}]
}
},
{
"row_id": "1-2LDPVI0",
"sequence_id": "3851101",
"end_date": "",
"name": "TV-Discount",
"orderable": "Y",
"period": "",
"period_uom": "",
"phone_number_flag": "N",
"price_type": "Recurring",
"product_category": "mobilepackage",
"product_sub_category": "Discount",
"product_type_code": "Product",
"type": "PhoneOrder",
"vendor_part_number": "",
"created_date": "2018-02-16 09:19:24",
"created_by": "id123",
"last_updated_date": "2020-09-15 09:39:24",
"last_updated_by": "id123",
"ts_event_notification_time": "2020-09-14 09:40:28",
"OrderItems": {
"item": [{
"original_list_price": "42",
"order_list_id": "1-4ABU",
"order_list_name": "SEK Pricelist",
"product_id": "1-2LDPUKX",
"start_date": "2018-02-19 00:00:00"
},
{
"original_list_price": "42",
"order_list_id": "1-4ASU",
"order_list_name": "SEK Pricelist",
"product_id": "1-2LDDAKX",
"start_date": "2018-02-12 00:00:00"
},
{
"original_list_price": "43",
"order_list_id": "1-4FDBU",
"order_list_name": "SEK Pricelist",
"product_id": "1-2LDFSDFKX",
"start_date": "2018-02-11 00:00:00"
}]
}
}]
}
}
}
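So far I have managed to flatten most of it, but I am having trouble with the last nested column, "OrderItems". I was able to extract it, but I am struggling to work out how to join the pieces back together so that they match the target result.
Solution
I managed to solve this by calling pd.json_normalize with the right set of parameters: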
import json

import pandas as pd

# file_path is the path to the JSON file shown in the question
with open(file_path) as f:
    data = json.load(f)
# Define feature list for dataframe
features = [
    "row_id",
    "sequence_id",
    "end_date",
    "name",
    "orderable",
    "period",
    "period_uom",
    "phone_number_flag",
    "price_type",
    "product_category",
    "product_sub_category",
    "product_type_code",
    "type",
    "vendor_part_number",
    "created_date",
    "created_by",
    "last_updated_date",
    "last_updated_by",
    "ts_event_notification_time",
]
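# Build the dataframe with pandas' json_normalize:
#   record_path -> the nested list that becomes one row per element
#   meta        -> parent-level fields repeated on each of those rows
df = pd.json_normalize(
    data['OrderMaster']['Order']['item'],
    record_path=['OrderItems', 'item'],
    meta=features,
)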
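The result is a complete row of data for each nested order item, with the parent item's fields repeated on every row.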
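To make the shape of that result concrete, here is a minimal, self-contained sketch of the same call. The data is a trimmed, hypothetical subset of the file in the question (only two parent fields and two nested fields are kept), so the column list is shorter than what the full file would produce.

```python
import pandas as pd

# Trimmed, hypothetical sample with the same nesting as the question's file
data = {
    "OrderMaster": {
        "Order": {
            "item": [
                {
                    "row_id": "1-2LDPVI0",
                    "name": "TV-Discount",
                    "OrderItems": {
                        "item": [
                            {"original_list_price": "0", "order_list_id": "1-4ABU"},
                            {"original_list_price": "45", "order_list_id": "1-4AFU"},
                        ]
                    },
                },
                {
                    "row_id": "1-2LDPVI0",
                    "name": "TV-Discount",
                    "OrderItems": {
                        "item": [
                            {"original_list_price": "42", "order_list_id": "1-4ABU"},
                            {"original_list_price": "42", "order_list_id": "1-4ASU"},
                            {"original_list_price": "43", "order_list_id": "1-4FDBU"},
                        ]
                    },
                },
            ]
        }
    }
}

# One output row per element of OrderItems.item; the parent fields
# listed in meta are copied onto each of those rows.
df = pd.json_normalize(
    data["OrderMaster"]["Order"]["item"],
    record_path=["OrderItems", "item"],
    meta=["row_id", "name"],
)

print(df.shape)          # number of rows and columns
print(list(df.columns))  # nested-record columns come first, then the meta columns
print(df)
```

The two parent items contain two and three nested entries respectively, so the frame has five rows. Every value is still a string at this point; converting prices or dates to numeric or datetime types would be a separate step.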