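python - Converting a JSON file to the correct format using Python pandas
Problem description
I want to convert a JSON file to the correct format. I have a JSON file that looks like this: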
{
"fruit": "Apple",
"size": "Large",
"color": "Red",
"details":"|seedless:true|,|condition:New|"
},
{
"fruit": "Almond",
"size": "small",
"color": "brown",
"details":"|Type:dry|,|seedless:true|,|condition:New|"
}
As you can see, the data inside details can vary from record to record.
I want to change it to:
{
"fruit": "Apple",
"size": "Large",
"color": "Red",
"seedless":"true",
"condition":"New",
},
{
"fruit": "Almond",
"size": "small",
"color": "brown",
"Type":"dry",
"seedless":"true",
"condition":"New",
}
I tried using pandas in Python like this:
import json
import pandas as pd
import re
df = pd.read_json("data.json",lines=True)
#I tried to change the pattern of data in details column as
re1 = re.compile('r/|(.?):(.?)|/')
re2 = re.compile('r\"(.*?)\":\"(.*?)\"')
df.replace({'details' :re1}, {'details' : re2},inplace = True, regex = True);
However, this gives "object" as the output in every row of the details column.
Solution
You can convert the (list of) dictionaries to a pandas data frame.
import pandas as pd
# data is a list of dictionaries
data = [{
"fruit": "Apple",
"size": "Large",
"color": "Red",
"details":"|seedless:true|,|condition:New|"
},
{
"fruit": "Almond",
"size": "small",
"color": "brown",
"details":"|Type:dry|,|seedless:true|,|condition:New|"
}]
# convert to data frame
df = pd.DataFrame(data)
# remove '|' from details and split into a list (regex=False treats '|' as a literal character)
df['details'] = df['details'].str.replace('|', '', regex=False).str.split(',')
# explode list => one row for each element
df = df.explode('details')
# split each detail into name/value columns at the first ':'
df[['name', 'value']] = df['details'].str.split(':', n=1, expand=True)
# drop details column
df = df.drop(columns='details')
print(df)
    fruit   size  color       name value
0   Apple  Large    Red   seedless  true
0   Apple  Large    Red  condition   New
1  Almond  small  brown       Type   dry
1  Almond  small  brown   seedless  true
1  Almond  small  brown  condition   New
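The output above is in long form (one row per detail), while the question asks for one record per fruit with each detail as its own key. A minimal sketch of that reshaping, using only the standard library, might look like the following. The helper name `flatten_record` and the pattern `PAIR` are hypothetical, and the pattern assumes every detail is written exactly as `|key:value|`, as in the sample data.

```python
import json
import re

# Assumption: each detail is wrapped as "|key:value|" and neither the key
# nor the value contains '|'. Keys also must not contain ':'.
PAIR = re.compile(r"\|([^:|]+):([^|]*)\|")


def flatten_record(record):
    """Return a copy of record with the 'details' string expanded into keys."""
    # Keep every field except the packed 'details' string.
    flat = {key: value for key, value in record.items() if key != "details"}
    # findall yields (key, value) tuples, which dict() accepts directly.
    flat.update(dict(PAIR.findall(record.get("details", ""))))
    return flat


data = [
    {
        "fruit": "Apple",
        "size": "Large",
        "color": "Red",
        "details": "|seedless:true|,|condition:New|",
    },
    {
        "fruit": "Almond",
        "size": "small",
        "color": "brown",
        "details": "|Type:dry|,|seedless:true|,|condition:New|",
    },
]

records = [flatten_record(record) for record in data]
print(json.dumps(records, indent=2))
```

If a table is still wanted, passing `records` to `pd.DataFrame` should give one column per key, with NaN where a record lacks that key (for example `Type` for Apple). The values stay strings, so `"true"` is not converted to a boolean.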