Creating relationships between nodes based on the Jaccard similarity of their properties in Neo4j



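I have multiple nodes in my Neo4j graph. I want to create a relationship between any two nodes if and only if the Jaccard similarity of their properties is above some threshold alpha.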

Consider the following nodes:

Node 1: {id:1, abc: 1.1, eww: -9.4, ssv: "likj"}
Node 2: {id:2, we2: 1, eww: 900}
Node 3: {id:3, kuku: -91, lulu: 383, ssv: "bubu"}

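So the Jaccard similarity of Node 1 and Node 2 on their properties is: (intersection = 2) / (union = 5) = 0.4. The shared properties are id and eww; the union is id, abc, eww, ssv, we2.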

How can I do this in Neo4j? I know there is a Jaccard similarity function, but how do I configure it to work on the properties of nodes?

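Assuming you mean the Jaccard similarity over which properties exist on each node (that is, the property keys, not their values), you can do something like this: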

MATCH (a:Node)
MATCH (b:Node) WHERE id(b) > id(a)
WITH a, b, [prop IN keys(a) WHERE prop IN keys(b)] AS shared_properties // Find the properties that exist on both nodes using the IN operator
WITH a, b, size(shared_properties) AS shared_property_count // Get the number of shared properties 
WITH 1.0*shared_property_count / size(apoc.coll.union(keys(a), keys(b))) AS jaccard_similarity, a, b // Compute the Jaccard similarity as the intersection over the union
WHERE jaccard_similarity > $threshold // Make sure the similarity is higher than some threshold
CREATE (a)-[:SIMILAR_TO {jaccard: jaccard_similarity}]->(b) 

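The WITH clauses find the properties that exist on both nodes and count them; the last one divides that count by the size of the union of both key lists to get the Jaccard similarity. The WHERE id(b) > id(a) condition makes sure each pair of nodes is considered only once.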
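To sanity-check the numbers outside Neo4j, here is a minimal Python sketch of the same logic, run on the three example nodes from the question. It is not part of the Cypher solution; the function names key_jaccard and similar_pairs are made up for this illustration, and it assumes, as the query does, that only property keys are compared and that the id property counts as a key.

```python
from itertools import combinations

# Property maps copied from the example nodes in the question.
nodes = {
    1: {"id": 1, "abc": 1.1, "eww": -9.4, "ssv": "likj"},
    2: {"id": 2, "we2": 1, "eww": 900},
    3: {"id": 3, "kuku": -91, "lulu": 383, "ssv": "bubu"},
}


def key_jaccard(a, b):
    """Jaccard similarity of the property keys of two nodes (values are ignored)."""
    keys_a, keys_b = set(a), set(b)
    union = keys_a | keys_b
    if not union:
        # Two nodes without any properties: treat them as not similar.
        return 0.0
    return len(keys_a & keys_b) / len(union)


def similar_pairs(nodes, threshold):
    """Return (id_a, id_b, score) for every unordered pair scoring above threshold."""
    result = []
    # combinations() visits each unordered pair once, like WHERE id(b) > id(a).
    for id_a, id_b in combinations(sorted(nodes), 2):
        score = key_jaccard(nodes[id_a], nodes[id_b])
        if score > threshold:
            result.append((id_a, id_b, score))
    return result


if __name__ == "__main__":
    for id_a, id_b, score in similar_pairs(nodes, threshold=0.35):
        print(f"({id_a})-[:SIMILAR_TO {{jaccard: {score}}}]->({id_b})")
```

With a threshold of 0.35, only the pair (1, 2) qualifies, with a score of 0.4; pairs (1, 3) and (2, 3) score 2/6 and 1/6 and are filtered out.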
