python - Converting JSON to CSV using pandas
Problem description
I have a JSON file that I read and am trying to convert to CSV:
{
  "items": [
    {
      "id": "CITY",
      "info": [
        {
          "id": 0,
          "type": "box",
          "attributes": {
            "category": "Tree"
          },
          "group": 0,
          "z_order": 0,
          "box": [
            223.54,
            1.13,
            27.3,
            2.13
          ]
        },
        {
          "id": 0,
          "type": "box",
          "attributes": {
            "category": "Building"
          },
          "group": 0,
          "z_order": 0,
          "bbox": [
            9.91,
            64.21,
            313.1,
            13.09
          ]
        }
      ],
      "attr": {
        "frame": 47
      },
      "image": {
        "size": [
          3024,
          4032
        ],
        "path": "photo2.jpeg"
      }
    }
  ]
}
Here is the code snippet I tried:
df = pd.DataFrame(data["items"])
The output has these columns:
id,info,attr.frame,image.size,image.path
I would like more columns in the output, for example:
info.attributes, info.box, info.image
Any help? Thanks!
Solution
This is certainly not the prettiest solution, but it works, and it may help in finding a better one:
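import io
import json
import pandas as pd

# `data` is the already-parsed JSON (a dict with an "items" list).
# The string is wrapped in StringIO because recent pandas versions deprecate
# passing a literal JSON string to read_json.
df = pd.read_json(io.StringIO(json.dumps(data)))['items'].apply(pd.Series).explode('info')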
df['image.size'] = df['image'].apply(pd.Series)['size']
df['image.path'] = df['image'].apply(pd.Series)['path']
df['attr.frame'] = df['attr'].apply(pd.Series)['frame']
df['info.id'] = df['info'].apply(pd.Series)['id']
df['info.type'] = df['info'].apply(pd.Series)['type']
df['info.attributes'] = df['info'].apply(pd.Series)['attributes']
df['info.attributes.category'] = df['info.attributes'].apply(pd.Series)['category']
df['info.group'] = df['info'].apply(pd.Series)['group']
df['info.z_order'] = df['info'].apply(pd.Series)['z_order']
df['info.box'] = df['info'].apply(pd.Series)['box']
df.drop(columns=['info', 'attr', 'info.attributes', 'image'], inplace=True)
The first line creates one row per element of info. The last line drops the columns that still hold dictionaries, to avoid redundant information.