python - Normalizing JSON with nested arrays using json_normalize
Question
I want to normalize the following JSON:
[
{
"studentId": 1,
"studentName": "James",
"schools": [
{
"schoolId": 1,
"classRooms": [
{
"classRoomId": {
"id": 1,
"floor": 2
}
},
{
"classRoomId": {
"id": 3
}
}
],
"teachers": [
{
"teacherId": 1,
"teacherName": "Tom"
},
{
"teacherId": 2,
"teacherName": "Sarah"
}
]
},
{
"schoolId": 2,
"classRooms": [
{
"classRoomId": {
"id": 4
}
}
],
"teachers": [
{
"teacherId": 1,
"teacherName": "Tom"
},
{
"teacherId": 2,
"teacherName": "Sarah"
},
{
"teacherId": 3,
"teacherName": "Tara"
}
]
}
]
}
]
I want to get the following table in Python (tabular form):
studentId  studentName  schoolId  classRoomId.id  classRoomId.floor  teacherId  teacherName
1          James        1         1               2                  1          Tom
1          James        1         1               2                  2          Sarah
1          James        1         3                                  1          Tom
1          James        1         3                                  2          Sarah
1          James        2         4                                  1          Tom
1          James        2         4                                  2          Sarah
1          James        2         4                                  3          Tara
I tried using the Pandas json_normalize function like this:
df1 = json_normalize(test1, ["schools","teachers"], ["studentId", "studentName",["schools","teachers"]])
df2 = json_normalize(test1, ["schools","classRooms"], ["studentId", "studentName",["schools","classRooms"]])
df = pd.concat([df1,df2],axis=1)
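But that does not give me the structure I need.
It does not have to be Pandas; any other library or code in Python is fine. Any help would be appreciated. Thank you.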
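Solution
Because classRooms and teachers form two distinct subtrees of the JSON, you will have to parse them twice, once per subtree, and then join the two results on the keys they share (studentId and schoolId):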
import pandas as pd

# json_data is the parsed JSON from the question (a list of student dicts).
# pd.json_normalize replaces pd.io.json.json_normalize, which is deprecated since pandas 1.0.
classrooms = pd.json_normalize(json_data, ['schools', 'classRooms'], meta=[
    'studentId',
    'studentName',
    ['schools', 'schoolId']
])
teachers = pd.json_normalize(json_data, ['schools', 'teachers'], meta=[
    'studentId',
    ['schools', 'schoolId']
])
# Merge and rearrange the columns in the order of your sample output
df = classrooms.merge(teachers, on=['schools.schoolId', 'studentId']) \
    [['studentId', 'studentName', 'schools.schoolId', 'classRoomId.id', 'classRoomId.floor', 'teacherId', 'teacherName']]
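For reference, here is a self-contained sketch of the same two-pass approach, run against the sample data from the question. The variable names json_data, meta, columns and result are illustrative assumptions, and the rename of schools.schoolId to schoolId is an extra step added only to match the column header in the requested table. It assumes pandas 1.0 or later, where json_normalize is available as pd.json_normalize.

```python
import pandas as pd

# Sample data from the question (trailing comma removed so it is valid JSON).
json_data = [
    {
        "studentId": 1,
        "studentName": "James",
        "schools": [
            {
                "schoolId": 1,
                "classRooms": [
                    {"classRoomId": {"id": 1, "floor": 2}},
                    {"classRoomId": {"id": 3}},
                ],
                "teachers": [
                    {"teacherId": 1, "teacherName": "Tom"},
                    {"teacherId": 2, "teacherName": "Sarah"},
                ],
            },
            {
                "schoolId": 2,
                "classRooms": [{"classRoomId": {"id": 4}}],
                "teachers": [
                    {"teacherId": 1, "teacherName": "Tom"},
                    {"teacherId": 2, "teacherName": "Sarah"},
                    {"teacherId": 3, "teacherName": "Tara"},
                ],
            },
        ],
    }
]

# Pass 1: one row per classroom, carrying the student and school keys.
meta = ["studentId", "studentName", ["schools", "schoolId"]]
classrooms = pd.json_normalize(json_data, ["schools", "classRooms"], meta=meta)

# Pass 2: one row per teacher, carrying only the join keys.
teachers = pd.json_normalize(
    json_data, ["schools", "teachers"], meta=["studentId", ["schools", "schoolId"]]
)

# Join on the shared keys: every classroom of a school is paired with
# every teacher of that same school.
result = classrooms.merge(teachers, on=["studentId", "schools.schoolId"])

# Rename and reorder to match the table requested in the question.
columns = [
    "studentId", "studentName", "schoolId",
    "classRoomId.id", "classRoomId.floor", "teacherId", "teacherName",
]
result = result.rename(columns={"schools.schoolId": "schoolId"})[columns]

print(result.to_string(index=False))
```

Classrooms without a floor value show up as NaN in classRoomId.floor rather than as an empty cell. If blank cells are preferred for display, something like result.fillna("") could be applied afterwards, at the cost of turning that column into mixed types.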