Converting a JSON file to the correct format using Python pandas


I want to convert a JSON file to the correct format. My JSON file looks like this:

    "fruit": "Apple",
    "size": "Large",
    "color": "Red",

    "fruit": "Almond",
    "size": "small",
    "color": "brown",




    "fruit": "Apple",
    "size": "Large",
    "color": "Red",

    "fruit": "Almond",
    "size": "small",
    "color": "brown",


I tried using pandas in Python as follows:

import json
import pandas as pd
import re
df = pd.read_json("data.json",lines=True)

#I tried to change the pattern of data in details column as

re1 = re.compile('r/|(.?):(.?)|/')
re2 = re.compile('r\"(.*?)\":\"(.*?)\"')

df.replace({'details' :re1}, {'details' : re2},inplace = True, regex = True);
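
For context, `lines=True` tells `pd.read_json` to treat the input as JSON Lines, that is, one JSON object per line. Below is a minimal sketch of what that call returns. It uses an in-memory buffer in place of `data.json`, and the `details` strings are assumed for illustration because they are not shown above:

```python
import io

import pandas as pd

# Hypothetical stand-in for data.json: one JSON object per line.
# The "details" values are an assumption, not taken from the real file.
raw = (
    '{"fruit": "Apple", "size": "Large", "color": "Red", '
    '"details": "|seedless:true,condition:New|"}\n'
    '{"fruit": "Almond", "size": "small", "color": "brown", '
    '"details": "|Type:dry,seedless:true,condition:New|"}\n'
)

# Each line becomes one row; "details" arrives as a plain string column.
df = pd.read_json(io.StringIO(raw), lines=True)

print(df.columns.tolist())
print(df.loc[0, "details"])
```

Because `details` is an ordinary string at this point, it can be reshaped with string methods rather than a regex-to-regex `replace`.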


Tags: python, json, regex, pandas, design-patterns


You can convert the (list of) dictionaries to a pandas data frame.

import pandas as pd

# data is a list of dictionaries
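# NOTE: the "details" strings are missing from the post; the values below are
# an assumption reconstructed from the output table at the end
data = [
    {
        "fruit": "Apple",
        "size": "Large",
        "color": "Red",
        "details": "|seedless:true,condition:New|",
    },
    {
        "fruit": "Almond",
        "size": "small",
        "color": "brown",
        "details": "|Type:dry,seedless:true,condition:New|",
    },
]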

# convert to data frame
df = pd.DataFrame(data)

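# remove the literal '|' characters from details and split into a list
# (regex=False treats '|' as plain text, independent of the pandas version's default)
df['details'] = df['details'].str.replace('|', '', regex=False).str.split(',')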

# explode list => one row for each element
df = df.explode('details')

# split details into name/value pair
df[['name', 'value']] = df['details'].str.split(':', expand=True)

# drop details column
df = df.drop(columns='details')


    fruit   size  color       name value
0   Apple  Large    Red   seedless  true
0   Apple  Large    Red  condition   New
1  Almond  small  brown       Type   dry
1  Almond  small  brown   seedless  true
1  Almond  small  brown  condition   New
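
If a nested structure or one column per detail is preferred over the long format above, the `details` string can also be parsed into a dictionary first. The following is only a sketch under assumptions: `parse_details` is a hypothetical helper, and the `|name:value,name:value|` layout of `details` is inferred from the output table rather than confirmed by the question.

```python
import json

import pandas as pd


def parse_details(text):
    """Turn '|a:1,b:2|' into {'a': '1', 'b': '2'} (assumed input layout)."""
    pairs = text.strip("|").split(",")
    # Split each pair on the first ':' only; skip empty fragments.
    return dict(pair.split(":", 1) for pair in pairs if pair)


# Assumed sample records; the real "details" strings are not shown in the post.
records = [
    {"fruit": "Apple", "size": "Large", "color": "Red",
     "details": "|seedless:true,condition:New|"},
    {"fruit": "Almond", "size": "small", "color": "brown",
     "details": "|Type:dry,seedless:true,condition:New|"},
]

# Replace the flat string with a nested dictionary in each record.
nested = [{**rec, "details": parse_details(rec["details"])} for rec in records]
print(json.dumps(nested, indent=4))

# Flatten the nested dictionaries into one column per detail name.
wide = pd.json_normalize(nested)
print(wide)
```

Here `json.dumps` writes the nested records back out as JSON, while `pd.json_normalize` produces columns such as `details.seedless`; details missing from a record show up as `NaN` in the wide table.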
