Aggregation/GroupCount on a graph database

Question

I have a graph database, queried with Gremlin, shaped like the picture below:

[Image: diagram of the graph]

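I need help building a query that returns all "Person" vertices, with an edge between each pair of people whose label is the number of "Event" vertices they have in common. The result should look something like this: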

{
    nodes: [
      {id:"PersonA", label: "Person A"},
      {id:"PersonB", label: "Person B"},
      {id:"PersonC", label: "Person C"},
      {id:"PersonD", label: "Person D"},
      {id:"PersonE", label: "Person E"},
      {id:"PersonF", label: "Person F"},
    ],
   edges: [
      {from: "PersonA", to: "PersonB", label: 1},
      {from: "PersonA", to: "PersonC", label: 2},
      {from: "PersonA", to: "PersonD", label: 2},
      {from: "PersonA", to: "PersonE", label: 1},
      {from: "PersonA", to: "PersonF", label: 1},
      {from: "PersonB", to: "PersonC", label: 1},
      {from: "PersonB", to: "PersonD", label: 1},
      {from: "PersonC", to: "PersonD", label: 2},
      {from: "PersonC", to: "PersonE", label: 1},
      {from: "PersonC", to: "PersonF", label: 1},
      {from: "PersonD", to: "PersonE", label: 1},
      {from: "PersonD", to: "PersonF", label: 1},
      {from: "PersonE", to: "PersonF", label: 1}
   ]
}

I have been struggling with this for hours and can't get what I'm looking for.

Tags: gremlin, graph-databases, gremlin-server, azure-cosmosdb-gremlinapi

Answer


A picture is nice, but when asking questions about Gremlin it is best to provide a Gremlin script that creates the sample data:

g.addV('person').property(id,'a').as('a').         
  addV('person').property(id,'b').as('b').
  addV('person').property(id,'c').as('c').
  addV('person').property(id,'d').as('d').  
  addV('person').property(id,'e').as('e').
  addV('person').property(id,'f').as('f').  
  addV('event').property(id,'1').as('1').   
  addV('event').property(id,'2').as('2').  
  addE('attends').from('a').to('1').
  addE('attends').from('a').to('2').
  addE('attends').from('b').to('2').
  addE('attends').from('c').to('1').
  addE('attends').from('c').to('2').
  addE('attends').from('d').to('1').
  addE('attends').from('d').to('2').
  addE('attends').from('e').to('1').
  addE('attends').from('f').to('1').iterate()

I took this approach to your problem:

g.V().hasLabel('person').as('s').
  out().in().
  where(neq('s')).
  path().by(id).
  groupCount().
    by(union(limit(local,1),tail(local,1)).fold()).
  unfold().
  dedup().
    by(select(keys).order(local)).
  order().
    by(select(keys).limit(local,1)).
    by(select(keys).tail(local,1))

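In the Gremlin Console this gives the output you're looking for (the session shown here was run with only the first order().by(), so pairs that share a first id are not sorted by the second):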

gremlin> g.V().hasLabel('person').as('s').
......1>   out().in().
......2>   where(neq('s')).
......3>   path().by(id).
......4>   groupCount().
......5>     by(union(limit(local,1),tail(local,1)).fold()).
......6>   unfold().
......7>   dedup().
......8>     by(select(keys).order(local)).
......9>   order().by(select(keys).limit(local,1))
==>[a, b]=1
==>[a, e]=1
==>[a, c]=2
==>[a, d]=2
==>[a, f]=1
==>[b, c]=1
==>[b, d]=1
==>[c, d]=2
==>[c, e]=1
==>[c, f]=1
==>[d, e]=1
==>[d, f]=1
==>[e, f]=1

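The approach above uses path() to collect the "person->event<-person" route that Gremlin traversed, with where(neq('s')) keeping a person from being paired with themselves. It then does a groupCount() keyed on the two "person" vertices at the ends of each path, which represent the person pair. At that point we have a Map of person pairs and their counts, as desired, but it needs a bit of post-processing, so we unfold() the Map into key/value entries. The first step is to dedup() by person pair: the Map currently contains both "a->b" and "b->a" and we don't need both, so deduplicating on the ordered list of each pair leaves a unique list. Finally, we add some order() to make the results look exactly like yours.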
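If you want each entry shaped like the items of the `edges` list in the question, one option is to swap the final order() for a project(). This is only a sketch that I have not run against Cosmos DB: the key names from, to and label are taken from the question, and it assumes limit(local,1) and tail(local,1) unwrap to a single id, as they appear to in the traversal above.

```groovy
// Sketch only: same traversal as above, but each Map entry is projected
// into a from/to/label map instead of being ordered.
g.V().hasLabel('person').as('s').
  out().in().
  where(neq('s')).
  path().by(id).
  groupCount().
    by(union(limit(local,1),tail(local,1)).fold()).
  unfold().
  dedup().
    by(select(keys).order(local)).
  project('from','to','label').
    by(select(keys).limit(local,1)).  // first id of the pair
    by(select(keys).tail(local,1)).   // second id of the pair
    by(select(values))                // number of shared events
```

Which id ends up in from and which in to follows whichever of the two keys dedup() happened to keep. The ids here are the sample ones ('a', 'b', ...), not the "PersonA" style ids from the question.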

I suppose you could try a dedup() immediately after path() to save groupCount() some extra work.
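A rough, untested sketch of that idea follows. It assumes every id on the path is mutually comparable (they are all strings in the sample data), because it sorts path contents to treat "a->1->b" and "b->1->a" as the same thing. The sorting inside both by() modulators is my own addition, not something taken from the traversal above.

```groovy
// Sketch only: drop mirrored paths before counting, so the later
// unfold()/dedup() post-processing is no longer needed.
g.V().hasLabel('person').as('s').
  out().in().
  where(neq('s')).
  path().by(id).
  dedup().
    by(unfold().order().fold()).      // [a,1,b] and [b,1,a] share one sorted key
  groupCount().
    by(union(limit(local,1),tail(local,1)).order().fold())  // sorted end points as the pair key
```

If it behaves as intended, the result is a single Map from sorted person pair to count, with each pair appearing once.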

