首页 > 解决方案 > 如何在 Python 中找到一棵树(或多棵连接的树)上的所有路径?

问题描述

我正在尝试使用 Python 从一个或多个连接的树中查找所有路径,例如,如果我的数据是:

a=pd.DataFrame({'predecessor':[1,2,1,4,5,5,5,7,7,10,11,11,8,8,8,14,14,14,16,16,21,16,15,15],
                'successor':[2,3,4,5,6,7,8,9,10,11,12,13,17,18,19,8,15,16,20,21,23,22,19,20]})

前任和继任者意味着这两个数字是相连的。所以,我使用这些数据的树看起来像:

在此处输入图像描述

我想要的是所有的路径。一条路径类似于 [1,2,3] 或 [1,4,5,7,10,11,13]。我的真实数据很大,所以使用数据框来存储所有路径并不是一个好主意。也许列表列表(其中每个子列表都是完整路径)很有用。我希望结果是这样的:

[[1,2,3], 
 [1,4,5,7,10,11,13],
 [14,8,17],
 [14,16,21,23],
 ......]

那么,有人可以在这里帮助我吗?

标签: pythonlistsearchpathtree

解决方案


import pandas as pd


a = pd.DataFrame({'predecessor': [1, 2, 1, 4, 5, 5, 5, 7, 7, 10, 11, 11, 8, 8, 8, 14, 14, 14, 16, 16, 21, 16, 15, 15],
                  'successor': [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 17, 18, 19, 8, 15, 16, 20, 21, 23, 22, 19, 20]})

# loop to store all the parent-child nodes and find out the root nodes and end nodes.
# A root node is a node only in 'predecessor' but not in 'successor'
# An end node is a node only in 'successor' but not in 'predecessor'
root_nodes = set()
end_nodes = set()
node_relations = {}
for i in range(len(a['predecessor'])):
    predecessor = a['predecessor'][i]
    successor = a['successor'][i]
    if predecessor not in node_relations.keys():
        node_relations[predecessor] = []
    node_relations[predecessor].append(successor)
    if predecessor not in a['successor'].values:
        root_nodes.add(predecessor)
    if successor not in a['predecessor'].values:
        end_nodes.add(successor)

# DFS + Memorization
def get_routes(root, memory):
    # when already in memory
    if root in memory.keys():
        return memory[root]
    # when it is the end node, return node itself as the routes
    if root in end_nodes:
        memory[root] = [[root]]
        return memory[root]
    # Loop all the successor routes and add root node before all of them
    memory[root] = []
    for successor in node_relations[root]:
        for route in get_routes(successor, memory):
            memory[root].append([root] + route)
    return memory[root]

# Loop from root nodes
memory = {}
result = []
for root in root_nodes:
    result.extend(get_routes(root, memory))


推荐阅读