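I am running PageRank on a set of nodes, each of which has a property year. How can I compute the average of all PageRank scores grouped by the year property? That is, if there are 100 nodes with 20 distinct year values in total, I want to compute 20 average PageRank values.
Then, for each node, I want to compute a scaled score based on the difference between that paper's PageRank score and the average PageRank score for its year (where a year's average is taken over the PageRank scores of all nodes that have the same value of the year property).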
The code I use to run PageRank is:
CALL algo.pageRank.stream(
'MATCH (p:Paper) WHERE p.year < 2015 RETURN id(p) as id',
'MATCH (p1:Paper)-[:CITES]->(p2:Paper) RETURN id(p1) as source, id(p2) as target',
{graph:'cypher', iterations:20, write:false, concurrency:20})
YIELD node, score
WITH
*,
node.title AS title,
node.year AS year,
score AS page_rank
ORDER BY page_rank DESC
LIMIT 10000
RETURN
title,
year,
page_rank;
How can I change this code to return the scaled scores?
Any help is much appreciated!
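This query should return the scaled_score for each year/title combination, as an absolute value relative to that year's average (the lower the scaled score, the closer the title's page_rank is to the average for its year):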
CALL algo.pageRank.stream(
'MATCH (p:Paper) WHERE p.year < 2015 RETURN id(p) as id',
'MATCH (p1:Paper)-[:CITES]->(p2:Paper) RETURN id(p1) as source, id(p2) as target',
{graph:'cypher', iterations:20, write:false, concurrency:20})
YIELD node, score
WITH
node.title AS title,
node.year AS year,
score AS page_rank
ORDER BY page_rank DESC
LIMIT 10000
WITH year, COLLECT({title: title, page_rank: page_rank}) AS data, AVG(page_rank) AS avg_page_rank
UNWIND data AS d
RETURN year, d.title AS title, ABS(d.page_rank-avg_page_rank)/avg_page_rank AS scaled_score;
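You may also want to sort the results (for example, by year or by scaled_score). Note that because LIMIT 10000 is applied before the aggregation, each year's average covers only the papers among the 10,000 highest page_rank values.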
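To see the collect/average/unwind pattern in isolation, here is a minimal sketch that replaces the PageRank call with three made-up rows. The titles, years, and page_rank values are purely illustrative assumptions, not real results; the ORDER BY clause shows one possible sort.

```cypher
// Hypothetical sample rows standing in for the PageRank output.
UNWIND [
  {title: 'A', year: 2010, page_rank: 1.0},
  {title: 'B', year: 2010, page_rank: 3.0},
  {title: 'C', year: 2011, page_rank: 2.0}
] AS row
// Group by year: keep every row of the group in a list and compute the group's average.
WITH row.year AS year,
     COLLECT({title: row.title, page_rank: row.page_rank}) AS data,
     AVG(row.page_rank) AS avg_page_rank
// Expand the list again so each title is paired with its year's average.
UNWIND data AS d
RETURN year,
       d.title AS title,
       ABS(d.page_rank - avg_page_rank) / avg_page_rank AS scaled_score
ORDER BY year, scaled_score;
```

With these sample values the 2010 average is 2.0, so A and B both get a scaled_score of 0.5, while C is the only 2011 paper and gets 0.0.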