How can I get a path in Cypher (Neo4j) from a given breadcrumb string?

Question

Initial situation

The example graph can be created with the following statement:

CREATE
    (root:Root {name:'Root'}),
    (dirA:Directory {name:'dir A'}),
    (dirB:Directory {name:'dir B'}),
    (dirC:Directory {name:'dir C'}),
    (dirD:Directory {name:'dir D'}),
    (dirE:Directory {name:'dir E'}),
    (dirF:Directory {name:'dir F'}),
    (dirG:Directory {name:'dir G'}),
    (root)-[:CONTAINS]->(dirA),
    (root)-[:CONTAINS]->(dirB),
    (dirA)-[:CONTAINS]->(dirC),
    (dirA)-[:CONTAINS]->(dirD),
    (dirD)-[:CONTAINS]->(dirE),
    (dirD)-[:CONTAINS]->(dirF),
    (dirD)-[:CONTAINS]->(dirG);

Given input parameter

Example:

WITH 'dir A/dir D/dir G' as inputString
WITH split(inputString, '/') AS directories
UNWIND
    directories AS directory
RETURN
    directory;

╒═══════════╕
│"directory"│
╞═══════════╡
│"dir A"    │
├───────────┤
│"dir D"    │
├───────────┤
│"dir G"    │
└───────────┘

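Challenge to be solved

For a given breadcrumb string ("dir A/dir D/dir G"), I need the corresponding path in Cypher, which will become part of a more complex query. I cannot simply search the tree for the last directory entry of the breadcrumb ("dir G"), because directory names are not unique. How can I achieve this in Cypher?

Expected result: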

╒═══════════════════════════════════════════════════════════════════════════════════════════════════════════════╕
│"path"                                                                                                         │
╞═══════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│[{"name":"Root"},{},{"name":"dir A"},{"name":"dir A"},{},{"name":"dir D"},{"name":"dir D"},{},{"name":"dir G"}]│
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Tags: neo4j, cypher, neo4j-apoc

Answer

For this case, I suggest giving every :Directory node its full path as a property, which makes it much easier to match a directory by its path:

MATCH path = (:Root)-[:CONTAINS*]->(d:Directory)
WITH d, [node in tail(nodes(path)) | node.name] as directories
WITH d, apoc.text.join(directories, '/') as pathString
SET d.path = pathString

(If a directory is moved within the tree, you can use a similar query to update that directory and its subdirectories.)
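A minimal sketch of such an update, assuming the CONTAINS relationships have already been rewired to the new parent, that stored paths are unique, and that the moved directory can still be found by its old stored path (the value 'dir A/dir D' is only an illustration):

```cypher
// Assumption: 'dir A/dir D' is the stale path still stored on the moved directory
WITH 'dir A/dir D' AS oldPath
MATCH (moved:Directory {path: oldPath})
// The moved directory itself (zero hops) plus all of its descendants
MATCH (moved)-[:CONTAINS*0..]->(d:Directory)
// Recompute each affected directory's path from the root
MATCH path = (:Root)-[:CONTAINS*]->(d)
WITH d, [node IN tail(nodes(path)) | node.name] AS directories
SET d.path = apoc.text.join(directories, '/')
```

Only the moved subtree is touched, so the rest of the tree keeps its stored paths.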

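With this in place, it is easy to match the end node of a path, even if you do not supply the upper part of the path (you did not mention whether the path you provide always extends from the root or may be just the tail end of a path):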

WITH 'dir A/dir D/dir G' as inputString
MATCH (end:Directory)
WHERE end.path ENDS WITH inputString
RETURN end
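For the ENDS WITH lookup to be fast, the path property needs an index that supports suffix matching. A sketch, assuming Neo4j 4.4 or later, where text indexes are available; the index name directory_path_text is made up for illustration:

```cypher
// Assumption: Neo4j 4.4+; the index name is arbitrary
CREATE TEXT INDEX directory_path_text IF NOT EXISTS
FOR (d:Directory) ON (d.path)
```

On older versions a regular index on the same label and property would be the closest equivalent. Prefixing the lookup query with EXPLAIN shows whether the planner actually uses the index.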

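So if :Directory(path) is indexed, you get fast access to the end node. Now to find the remaining nodes of the path.

We can use a variable-length path expression to find the full path to the end node, with an all() predicate ensuring that every node in the path has a name from the split input; this predicate is checked during expansion. This gives us paths to the nodes we want (wasting only one extra traversal to the parent node above), but it does not guarantee the order of the names, so we have to filter afterwards.

This should work for your example graph: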

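WITH 'dir A/dir D/dir G' AS inputString
WITH inputString, split(inputString, '/') AS dirNames
// Find candidate end nodes through the stored path property
MATCH (end:Directory)
WHERE end.path ENDS WITH inputString
// Expand upwards, keeping only nodes whose name occurs in the breadcrumb
MATCH path = (start)-[:CONTAINS*]->(end)
WHERE all(node IN nodes(path) WHERE node.name IN dirNames)
WITH path, dirNames
// Keep only the path whose node names match the breadcrumb in order
WHERE length(path) + 1 = size(dirNames) AND [node IN nodes(path) | node.name] = dirNames
RETURN path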
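
If the breadcrumb always starts directly below the root, as in the expected result of the question, the path can also be matched without the stored path property. The following is a sketch under that assumption and is not part of the original answer. It may expand every path below the root before filtering, so it is likely only suitable for small trees:

```cypher
WITH split('dir A/dir D/dir G', '/') AS dirNames
MATCH path = (:Root)-[:CONTAINS*]->(:Directory)
// One hop per breadcrumb entry, counted from the root
WHERE length(path) = size(dirNames)
  // Node names below the root must equal the breadcrumb, in order
  AND [node IN tail(nodes(path)) | node.name] = dirNames
RETURN path
```

Unlike the query above, the returned path includes the Root node.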
