Neo4j: how to return the top nodes for each property value

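Problem description

I am running PageRank on a set of nodes of type Paper, where each node has a property year. I currently standardize each PageRank score using the mean and standard deviation of the PageRank scores of all papers from the same year.

I want to return the top 100 papers for each year (based on the scaled PageRank value). Can I do this in a single query?

The query below computes the scaled scores but returns the top 100 results overall, not the top 100 per year: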

CALL algo.pageRank.stream(
  'MATCH (p:Paper) WHERE p.year < 2015 RETURN id(p) as id',
  'MATCH (p1:Paper)-[:CITES]->(p2:Paper) RETURN id(p1) as source, id(p2) as target',
  {graph:'cypher', iterations:20, write:false, concurrency:20})
YIELD node, score
WITH 
  node.title AS title,
  node.year AS year, 
  score AS page_rank
ORDER BY page_rank DESC
LIMIT 100
WITH year, COLLECT({title: title, page_rank: page_rank}) AS data, AVG(page_rank) AS avg_page_rank, stDev(page_rank) as stdDev
UNWIND data AS d
RETURN year, d.title AS title, ABS(d.page_rank-avg_page_rank)/stdDev AS scaled_score;

Any suggestions would be greatly appreciated!

Tags: neo4j, cypher, pagerank

Solution


Try this:

CALL algo.pageRank.stream(
  'MATCH (p:Paper) WHERE p.year < 2015 RETURN id(p) as id',
  'MATCH (p1:Paper)-[:CITES]->(p2:Paper) RETURN id(p1) as source, id(p2) as target',
  {graph:'cypher', iterations:20, write:false, concurrency:20})
YIELD node, score
WITH 
  node.title AS title,
  node.year AS year, 
  score AS page_rank
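// Sort all rows by score so each year's collected list is in descending order
ORDER BY page_rank DESC
// Per year: keep the first 100 collected items; AVG and stDev still aggregate over all of that year's rows
WITH year, COLLECT({title: title, page_rank: page_rank})[..100] AS data, AVG(page_rank) AS avg_page_rank, stDev(page_rank) as stdDev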
UNWIND data AS d
RETURN year, d.title AS title, ABS(d.page_rank-avg_page_rank)/stdDev AS scaled_score;

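This query removes the LIMIT clause and instead keeps only the first 100 (sorted) items of data for each year. Because AVG and stDev are aggregated over all of a year's rows, the mean and standard deviation still reflect every paper from that year, not just the top 100.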
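
To check the intuition outside the database, the same per-year logic can be sketched in plain Python. This is only an illustration, not Neo4j code: the function name top_n_per_year and the (title, year, page_rank) tuple shape are assumptions made for the example, and returning None when the standard deviation is undefined or zero is a choice made here, not something the query does.

```python
from collections import defaultdict
from statistics import mean, stdev


def top_n_per_year(rows, n=100):
    """Return (year, title, scaled_score) for the top n papers of each year.

    rows is an iterable of (title, year, page_rank) tuples; this input
    shape is hypothetical and chosen only for the illustration.
    """
    by_year = defaultdict(list)
    for title, year, page_rank in rows:
        by_year[year].append((title, page_rank))

    result = []
    for year, papers in by_year.items():
        scores = [score for _, score in papers]
        # Statistics use every paper of the year, like AVG/stDev in the query.
        avg = mean(scores)
        # Sample standard deviation; undefined for fewer than two papers.
        sd = stdev(scores) if len(scores) > 1 else 0.0
        # Sort descending and keep the first n, like ORDER BY ... DESC plus [..n].
        top = sorted(papers, key=lambda p: p[1], reverse=True)[:n]
        for title, score in top:
            # Assumption of this sketch: report None when scaling is impossible.
            scaled = abs(score - avg) / sd if sd > 0 else None
            result.append((year, title, scaled))
    return result
```

The key point it mirrors is that the slice to n items happens after the mean and standard deviation are computed over the whole year, so the scaling is unaffected by the cutoff.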

