From a source/target/weight DataFrame to a JSON file

Question

I have this DataFrame of sources, targets, and weights:

  source target  weight
0      A      B       3
1      A      C       2
2      B      C       0
3      C      D       1
4      D      A       1
5      D      B       1
...            

How can I get a JSON file like the following?

{
  "nodes": [
    {"id": "A"},
    {"id": "B"},
    {"id": "C"},
    {"id": "D"}
  ],
  "links": [
    {"source": "A", "target": "B", "weight": 3},
    {"source": "A", "target": "C", "weight": 2},
    {"source": "B", "target": "C", "weight": 0},
    {"source": "C", "target": "D", "weight": 1},
    {"source": "D", "target": "A", "weight": 1},
    {"source": "D", "target": "B", "weight": 1}
  ]
}
    

I could rebuild it with loops and lists, but is there a simpler way?

Tags: python, json, pandas

Answer


nodes can be built from the unique values in the source and target columns (via np.unique), and links can be built with DataFrame.to_dict(orient='records'):

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'source': ['A', 'A', 'B', 'C', 'D', 'D'],
    'target': ['B', 'C', 'C', 'D', 'A', 'B'],
    'weight': [3, 2, 0, 1, 1, 1]
})

data = {
    'nodes': [{'id': v} for v in np.unique(df[['source', 'target']])],
    'links': df.to_dict(orient='records')
}

data

{
    'nodes': [{'id': 'A'}, {'id': 'B'}, {'id': 'C'}, {'id': 'D'}],
    'links': [{'source': 'A', 'target': 'B', 'weight': 3},
              {'source': 'A', 'target': 'C', 'weight': 2},
              {'source': 'B', 'target': 'C', 'weight': 0},
              {'source': 'C', 'target': 'D', 'weight': 1},
              {'source': 'D', 'target': 'A', 'weight': 1},
              {'source': 'D', 'target': 'B', 'weight': 1}]
}
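
The question asks for a JSON file, while the snippet above stops at a Python dict. A minimal sketch of the remaining step, assuming the standard-library json module is acceptable; the filename graph.json and the temporary directory are placeholders chosen for illustration, not part of the original answer:

```python
import json
import os
import tempfile

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'source': ['A', 'A', 'B', 'C', 'D', 'D'],
    'target': ['B', 'C', 'C', 'D', 'A', 'B'],
    'weight': [3, 2, 0, 1, 1, 1]
})

data = {
    # .tolist() turns the NumPy array into plain Python objects for json
    'nodes': [{'id': v} for v in np.unique(df[['source', 'target']]).tolist()],
    'links': df.to_dict(orient='records')
}

# 'graph.json' is a hypothetical filename; a temp directory avoids
# writing into the current working directory.
out_path = os.path.join(tempfile.mkdtemp(), 'graph.json')
with open(out_path, 'w', encoding='utf-8') as f:
    json.dump(data, f, indent=2)

# Read the file back to confirm it round-trips.
with open(out_path, encoding='utf-8') as f:
    loaded = json.load(f)
```

If only a JSON string is needed rather than a file, json.dumps(data) should serve the same purpose.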

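As requested, networkx also supports this through json_graph.node_link_data, although it is certainly overkill unless additional graph operations are needed. Note that from_pandas_edgelist builds an undirected Graph by default, so the output reports 'directed': False and the edge from D to A is listed as A, D: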

import networkx as nx
import pandas as pd
from networkx.readwrite import json_graph

df = pd.DataFrame({
    'source': ['A', 'A', 'B', 'C', 'D', 'D'],
    'target': ['B', 'C', 'C', 'D', 'A', 'B'],
    'weight': [3, 2, 0, 1, 1, 1]
})

G = nx.from_pandas_edgelist(df, source='source',
                            target='target',
                            edge_attr='weight')
data = json_graph.node_link_data(G)

data

{'directed': False,
 'graph': {},
 'links': [{'source': 'A', 'target': 'B', 'weight': 3},
           {'source': 'A', 'target': 'C', 'weight': 2},
           {'source': 'A', 'target': 'D', 'weight': 1},
           {'source': 'B', 'target': 'C', 'weight': 0},
           {'source': 'B', 'target': 'D', 'weight': 1},
           {'source': 'C', 'target': 'D', 'weight': 1}],
 'multigraph': False,
 'nodes': [{'id': 'A'}, {'id': 'B'}, {'id': 'C'}, {'id': 'D'}]}
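
If edge direction matters, as in the desired output from the question, one option is to pass create_using=nx.DiGraph to from_pandas_edgelist. This is a sketch rather than part of the original answer; the links_key lookup is a defensive assumption, since newer networkx releases may name the edge list 'edges' instead of 'links':

```python
import networkx as nx
import pandas as pd
from networkx.readwrite import json_graph

df = pd.DataFrame({
    'source': ['A', 'A', 'B', 'C', 'D', 'D'],
    'target': ['B', 'C', 'C', 'D', 'A', 'B'],
    'weight': [3, 2, 0, 1, 1, 1]
})

# create_using=nx.DiGraph keeps each row as a directed source -> target edge.
G = nx.from_pandas_edgelist(df, source='source',
                            target='target',
                            edge_attr='weight',
                            create_using=nx.DiGraph)
data = json_graph.node_link_data(G)

# Assumption: the edge list lives under 'links' or, in newer releases, 'edges'.
links_key = 'links' if 'links' in data else 'edges'
links = data[links_key]

# (source, target) pairs, to check that direction was preserved.
directed_pairs = [(e['source'], e['target']) for e in links]
```

With a directed graph, the D to A and D to B edges keep their original orientation instead of being folded into A, D and B, D.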
