How to format nested JSON into an iterative process format

Problem description

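I have a process tree in nested JSON format, and I am trying to turn it into a dictionary of iterative processes, each of which holds a list. For example, the nested tree looks like this: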

{
    "name": "test",
    "children": [
        {
            "name": "Operator_8a82e",
            "children": [
                {
                    "name": "Link_e5479",
                    "children": [
                        {
                            "name": "Operator_b7394",
                            "children": [
                                {
                                    "name": "Link_7f62e",
                                    "children": [
                                        {
                                            "name": "Operator_73ea0",
                                            "children": [
                                                {
                                                    "name": "Link_93a51",
                                                    "children": [
                                                        {
                                                            "name": "Operator_32a07"
                                                        }
                                                    ]
                                                }
                                            ]
                                        }
                                    ]
                                }
                            ]
                        }
                    ]
                },
                {
                    "name": "Link_59e2c",
                    "children": [
                        {
                            "name": "Operator_3ca6d"
                        }
                    ]
                }
            ]
        }
    ]
}

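I would like it to look like the following. Basically, each subtree goes into its own iterative list, in the order in which it appears in the nested JSON (this is very important).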

{
  "process_1": [
    {
      "name": "Operator_8a82e"
    },
    {
      "name": "Link_e5479"
    },
    {
      "name": "Operator_b7394"
    }
  ],
  "process_2": [
    {
      "name": "Operator_8a82e"
    },
    {
      "name": "Link_59e2c"
    }
  ]
}

My current function almost gets me there. I'll explain below why it doesn't quite work.

def flatten_json(y):
    out = {}
    def flatten(x):
        i = 0
        if type(x) is dict:
            for a in x:
                flatten(x[a])
        elif type(x) is list:
            for a in x:
                i += 1
                flatten(a)
        else:
            print(x)

    flatten(y)
    return out

This prints the following. However, I can't seem to tell when one subtree ends and a new one begins.

test
Operator_8a82e
Link_e5479
Operator_b7394
Link_7f62e
Operator_73ea0
Link_93a51
Operator_32a07
Link_59e2c
Operator_3ca6d

Ideally, for example, the output would look like this:

test
Operator_8a82e
Link_e5479
Operator_b7394
Link_7f62e
Operator_73ea0
Link_93a51
Operator_32a07
Operator_8a82e # this is what is missing in my function above 
Link_59e2c
Operator_3ca6d

Any help would be great!

Tags: python

Solution


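This is graph data in tree format. You can load it as a networkx graph and then export it again in the desired format. Assuming you have loaded the JSON into a Python dictionary named data: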

import networkx as nx
from networkx.readwrite import json_graph

# build a directed graph, using each node's "name" as its identifier
G = json_graph.tree_graph(data, ident="name")

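Now let's first find all the sink nodes (nodes with no outgoing edges), and then find the simple paths from a defined source to each of them: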

# define the source node
source = 'Operator_8a82e'
# get a list of sink nodes (nodes with no outgoing edges)
sinks = [node for node in G.nodes if G.out_degree(node) == 0]
# get all simple paths from the source to each sink
paths = [list(nx.all_simple_paths(G, source=source, target=sink)) for sink in sinks]
# take the first path for each sink, since a tree has only one
paths = [i[0] for i in paths if i]
# create one single-key dict per path
[{f'process_{n+1}': [{'name':i} for i in path]} for n, path in enumerate(paths)]

Result:

[{'process_1': [{'name': 'Operator_8a82e'},
   {'name': 'Link_e5479'},
   {'name': 'Operator_b7394'},
   {'name': 'Link_7f62e'},
   {'name': 'Operator_73ea0'},
   {'name': 'Link_93a51'},
   {'name': 'Operator_32a07'}]},
 {'process_2': [{'name': 'Operator_8a82e'},
   {'name': 'Link_59e2c'},
   {'name': 'Operator_3ca6d'}]}]
