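python - How do I convert a list of nested dictionaries (JSON) into a custom DataFrame using pandas?
Question
I am combining survey results retrieved through an API and want to convert them into a DataFrame that looks like this: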
PersonId | Question | Answer | Department
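To achieve this, each row must hold one question-answer pair for one person, together with the department given in the first question. In this case it should look like this: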
PersonId | Question | Answer | Department
1 | I can focus on clear targets | 3 | Department A
1 | I am satisfied with my working environment| 4 | Department A
2 | I can focus on clear targets | 1 | Department B
2 | I am satisfied with my working environment| 3 | Department B
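Below is what the data looks like after being retrieved from the API and combined. I don't need the 'answers' and 'id' keys, because 'results' contains the answers the participants gave. Answers are always in the range 1 to 5.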
[
    {
        '0': {
            'title': 'What department do you work at?',
            'id': '2571050',
            'results': {
                '0': 'Department A',
                '1': '',
            },
            'answers': {
                '0': 'Department A',
                '1': 'Department B',
            }
        },
        '1': {
            'title': 'I can focus on clear targets',
            'id': '5275962',
            'results': {
                '0': '3'
            },
            'answers': {
                '0': 'Strongly disagree',
                '1': 'Strongly Agree'
            }
        },
        '2': {
            'title': 'I am satisfied with my working environment',
            'id': '5276045',
            'results': {
                '0': '4'
            },
            'answers': {
                '0': 'Strongly Disagree',
                '1': 'Strongly Agree'
            }
        },
    },
    {
        '0': {
            'title': 'What department do you work at?',
            'id': '2571050',
            'results': {
                '0': '',
                '1': 'Department B',
            },
            'answers': {
                '0': 'Department A',
                '1': 'Department B',
            }
        },
        '1': {
            'title': 'I can focus on clear targets',
            'id': '5275962',
            'results': {
                '0': '1'
            },
            'answers': {
                '0': 'Strongly disagree',
                '1': 'Strongly Agree'
            }
        },
        '2': {
            'title': 'I am satisfied with my working environment',
            'id': '5276048',
            'results': {
                '0': '3'
            },
            'answers': {
                '0': 'Strongly Disagree',
                '1': 'Strongly Agree'
            }
        }
    }
]
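Answer
Note that your JSON file contains some errors. The last value in a dictionary must not be followed by a comma, and dictionary keys and values must be wrapped in double quotes rather than single quotes. The corrected JSON file is included at the end of this answer.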
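If the text shown in the question was saved to a file exactly as printed, an alternative to editing it by hand is to read it as a Python literal. The sketch below assumes that situation (the `raw` string is a shortened, hypothetical excerpt of the question's data) and uses `ast.literal_eval`, which accepts single quotes and trailing commas, whereas `json.loads` rejects both.

```python
import ast
import json

# Shortened excerpt of the question's data, kept in its original
# single-quoted form with trailing commas (illustrative only).
raw = """
[
    {
        '0': {
            'title': 'What department do you work at?',
            'results': {
                '0': 'Department A',
                '1': '',
            },
        },
    },
]
"""

# Strict JSON parsing fails on single quotes and trailing commas.
try:
    json.loads(raw)
    parsed_as_json = True
except json.JSONDecodeError:
    parsed_as_json = False

# The same text is a valid Python literal, so literal_eval can read it.
data = ast.literal_eval(raw.strip())

print(parsed_as_json)
print(data[0]['0']['results']['0'])
```

If the API client already returns Python lists and dictionaries, neither step is needed and `data` can be used directly.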
Back to your question: you can parse the file with the json and pandas libraries. Collect one dictionary per question-answer pair and build the DataFrame once at the end, because DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0. Here is what it looks like:
import json
import pandas as pd

# Load the corrected JSON; the file name 'survey.json' is an assumption
with open('survey.json') as f:
    data = json.load(f)

rows = []
# Assign each respondent a sequential id, starting at 1
for person_id, people in enumerate(data, start=1):
    # Retrieve the respondent's department: the first option if it is set, otherwise the second
    if people['0']['results']['0'] != '':
        department = people['0']['results']['0']
    else:
        department = people['0']['results']['1']
    for answer in people:
        # Skip question '0', which only carries the department
        if answer != '0':
            # Collect the question and the answer
            rows.append({
                'PersonId': person_id,
                'Question': people[answer]['title'],
                'Answer': people[answer]['results']['0'],
                'Department': department,
            })

df = pd.DataFrame(rows, columns=['PersonId', 'Question', 'Answer', 'Department'])
print(df)
Output:
   PersonId                                    Question Answer    Department
0         1                I can focus on clear targets      3  Department A
1         1  I am satisfied with my working environment      4  Department A
2         2                I can focus on clear targets      1  Department B
3         2  I am satisfied with my working environment      3  Department B
The Answer column holds strings, exactly as they appear in the source data.
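The row-building step can also be written so that it does not depend on the department sitting under result key '0' or '1'. The following is a minimal sketch under that assumption, not the original answer's method: `build_rows` and `sample` are hypothetical names, the department is taken as the first non-empty value in the results of question '0', and answers are converted to integers because the question states they are always in the range 1 to 5.

```python
def build_rows(data):
    """Flatten the survey payload into one dict per (person, question) pair."""
    rows = []
    for person_id, person in enumerate(data, start=1):
        # Department: first non-empty value among the results of question '0'.
        department = next(
            (value for value in person['0']['results'].values() if value),
            None,
        )
        for key, question in person.items():
            if key == '0':
                continue  # the department question does not get its own row
            rows.append({
                'PersonId': person_id,
                'Question': question['title'],
                'Answer': int(question['results']['0']),
                'Department': department,
            })
    return rows


# Shortened, illustrative input in the same shape as the question's data.
sample = [
    {
        '0': {'title': 'What department do you work at?',
              'results': {'0': '', '1': 'Department B'}},
        '1': {'title': 'I can focus on clear targets',
              'results': {'0': '1'}},
    },
]

rows = build_rows(sample)
print(rows)
```

Passing the resulting list to `pd.DataFrame(rows)` gives the same four columns, with PersonId and Answer stored as integers.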
JSON file:
[
    {
        "0": {
            "title": "What department do you work at?",
            "id": "2571050",
            "results": {
                "0": "Department A",
                "1": ""
            },
            "answers": {
                "0": "Department A",
                "1": "Department B"
            }
        },
        "1": {
            "title": "I can focus on clear targets",
            "id": "5275962",
            "results": {
                "0": "3"
            },
            "answers": {
                "0": "Strongly disagree",
                "1": "Strongly Agree"
            }
        },
        "2": {
            "title": "I am satisfied with my working environment",
            "id": "5276045",
            "results": {
                "0": "4"
            },
            "answers": {
                "0": "Strongly Disagree",
                "1": "Strongly Agree"
            }
        }
    },
    {
        "0": {
            "title": "What department do you work at?",
            "id": "2571050",
            "results": {
                "0": "",
                "1": "Department B"
            },
            "answers": {
                "0": "Department A",
                "1": "Department B"
            }
        },
        "1": {
            "title": "I can focus on clear targets",
            "id": "5275962",
            "results": {
                "0": "1"
            },
            "answers": {
                "0": "Strongly disagree",
                "1": "Strongly Agree"
            }
        },
        "2": {
            "title": "I am satisfied with my working environment",
            "id": "5276048",
            "results": {
                "0": "3"
            },
            "answers": {
                "0": "Strongly Disagree",
                "1": "Strongly Agree"
            }
        }
    }
]