How to convert a DataFrame to nested JSON

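Problem description

I am trying to export a DataFrame to nested (hierarchical) JSON for D3.js, starting from a solution that only works for one level (parent, child).

Any help would be appreciated. I am new to Python.

My DataFrame contains 7 levels. This is the expected output: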


JSON Example:
    {
        "name": "World",
        "children": [
            {
                "name": "Europe",
                "children": [
                    {
                        "name": "France",
                        "children": [
                            {
                                "name": "Paris",
                                "population": 1000000
                            }
                        ]
                    }
                ]
            }
        ]
    }

Here is the Python method:


import json


def to_flare_json(df, filename):
    """Convert dataframe into nested JSON as in flare files used for D3.js"""
    d = {"name": "World", "children": []}

    for index, row in df.iterrows():
        # Positional access to each column of the current row.
        # Only parent, child and child_value are used below,
        # which is why this version handles a single level.
        parent = row.iloc[0]
        child = row.iloc[1]
        child1 = row.iloc[2]
        child2 = row.iloc[3]
        child3 = row.iloc[4]
        child4 = row.iloc[5]
        child5 = row.iloc[6]
        child_value = row.iloc[7]

        # Make a list of the parent names already present
        key_list = [item["name"] for item in d["children"]]

        # If 'parent' is NOT yet in the tree, append it with its first child
        if parent not in key_list:
            d["children"].append(
                {"name": parent, "children": [{"value": child_value, "name": child}]}
            )
        # If 'parent' IS already in the tree, add a new child to it
        else:
            d["children"][key_list.index(parent)]["children"].append(
                {"value": child_value, "name": child}
            )

    # Export the final result to a JSON file
    with open(filename + ".json", "w") as outfile:
        json.dump(d, outfile, indent=4, ensure_ascii=False)
    return "Done"

[Edit]

Here is a sample of my df:

World  Continent  Region          Country  State          City   Boroughs  Population
1      Europe     Western Europe  France   Ile de France  Paris  17        821964
1      Europe     Western Europe  France   Ile de France  Paris  19        821964
1      Europe     Western Europe  France   Ile de France  Paris  20        821964

Tags: python, json, pandas, dataframe, d3.js

Solution


The structure you want is clearly recursive, so I wrote a recursive function to fill it:

def create_entries(df):
    entries = []
    # Stopping case
    if df.shape[1] == 2:  # only 2 columns left
        # tolist() yields plain Python values, which json can serialize
        names = df.iloc[:, 0].tolist()
        values = df.iloc[:, 1].tolist()
        for name, value in zip(names, values):  # iterating on rows
            entries.append({"name": name, df.columns[-1]: value})
    # Recursive case
    else:
        # Unique values of the first column, in order of first appearance
        for v in df.iloc[:, 0].drop_duplicates().tolist():
            entries.append(
                {"name": v,
                 # repeating the process but without the first column
                 # and only the rows with the current value
                 "children": create_entries(
                     df.loc[df.iloc[:, 0] == v].iloc[:, 1:]
                 )}
            )
    return entries

All that remains is to create the dictionary and call the function:

# data is the DataFrame from the question; its first column (World) is skipped
mydict = {"name": "World",
          "children": create_entries(data.iloc[:, 1:])}

Then you just write your dict to a JSON file.
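Writing the dictionary out could look roughly like the sketch below. It only assumes the standard-library json module. The helper names write_json and to_builtin, the file name flare.json, and the small hand-written mydict are placeholders for illustration, not part of the original answer. The default= hook is there because values taken straight from a DataFrame may be NumPy integers such as int64, which json.dump refuses to serialize.

```python
import json
import os
import tempfile


def to_builtin(obj):
    """Fallback for json.dump: unwrap objects exposing .item() (e.g. NumPy scalars)."""
    if hasattr(obj, "item"):
        return obj.item()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


def write_json(data, path):
    """Write a nested dict to a JSON file (hypothetical helper)."""
    with open(path, "w", encoding="utf-8") as outfile:
        json.dump(data, outfile, indent=4, ensure_ascii=False, default=to_builtin)


# Hand-written stand-in for the dict built in the answer
mydict = {
    "name": "World",
    "children": [
        {"name": "Europe", "children": [{"name": "Paris", "Population": 821964}]}
    ],
}

# Placeholder location; any writable path works
path = os.path.join(tempfile.gettempdir(), "flare.json")
write_json(mydict, path)

# Read the file back to check the round trip
with open(path, encoding="utf-8") as infile:
    loaded = json.load(infile)
```

Passing ensure_ascii=False keeps non-ASCII names readable in the file, which is why the file is opened with an explicit UTF-8 encoding.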

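I hope my comments are explicit enough. The idea is to recursively use the first column of the dataset as the "name" and the remaining columns as the "children".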
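To make that idea concrete without pandas, here is a minimal sketch of the same recursion on plain row tuples. The function name nest and its value_key parameter are hypothetical, the rows are hand-copied from the sample df (with the World column dropped), and the sketch assumes a non-empty list of rows that all have the same length.

```python
def nest(rows, value_key):
    """First field of each row becomes "name", the remaining fields become "children"."""
    # Stopping case: each row is just (name, value)
    if len(rows[0]) == 2:
        return [{"name": name, value_key: value} for name, value in rows]

    # Group the remaining fields by the first field, keeping first-seen order
    groups = {}
    for first, *rest in rows:
        groups.setdefault(first, []).append(rest)

    # Recurse into each group with the first field removed
    return [
        {"name": name, "children": nest(rest_rows, value_key)}
        for name, rest_rows in groups.items()
    ]


rows = [
    ("Europe", "Western Europe", "France", "Ile de France", "Paris", 17, 821964),
    ("Europe", "Western Europe", "France", "Ile de France", "Paris", 19, 821964),
    ("Europe", "Western Europe", "France", "Ile de France", "Paris", 20, 821964),
]

tree = {"name": "World", "children": nest(rows, "Population")}
```

Each level of recursion consumes one column, so the depth of the resulting tree follows the number of columns rather than being hard-coded.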

