Neo4j Cypher and node property access



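I have statistics nodes holding pre-aggregated data, and some of them have 50-100k properties per node. I realize this is crazy and smells bad, but it is the best I have managed to do while optimizing performance for my business requirements.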

I have implemented the following filtering logic on these nodes (childD):

WHERE apoc.coll.containsAllSorted($profileDetailedCriterionIds, childD.detailedCriterionIds) 
UNWIND childD.detailedCriterionIds AS mCId 
WITH childD, mCId 
WHERE 
(childD['criterionAvgVoteWeights.' + mCId] = 0 OR childD['criterionAvgVoteWeights.' + mCId] <= $profileCriterionAvgVoteWeights[toString(mCId)]) 
AND (childD['criterionExperienceMonths.' + mCId] = 0 OR childD['criterionExperienceMonths.' + mCId] <= $profileCriterionExperienceMonths[toString(mCId)]) 
WITH DISTINCT childD

As you can see, for every mCId there is a pre-aggregated property, which I access by its name.

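So my question is about the underlying implementation of Neo4j node properties. Are they stored in some kind of key-value store? Can I assume that when I reference them by their exact name (as in the query above), they are accessed in the same amount of time regardless of how many properties the node has?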

Update

MATCH (childD)-[:HAS_VOTE_ON]->(pc:Criterion)
WHERE pc.id IN $vacancyDetailedCriterionIds
WITH childD, collect(DISTINCT pc.id) as profileCriterionIds
WHERE size(profileCriterionIds) >= size($vacancyDetailedCriterionIds)
WITH childD
MATCH (childD)-[:CONTAINS]->(childDStat:JobableStatistic)
UNWIND $vacancyDetailedCriterionIds AS mCId
WITH childD, childDStat, mCId
WHERE 
($vacancyCriterionAvgVoteWeights[toString(mCId)] = 0 OR $vacancyCriterionAvgVoteWeights[toString(mCId)] <= childDStat['criterionAvgVoteWeights.' + mCId])
AND 
($vacancyCriterionExperienceMonths[toString(mCId)] = 0 OR $vacancyCriterionExperienceMonths[toString(mCId)] <= childDStat['criterionExperienceMonths.' + mCId])

From the article:

Data stored on disk is all linked lists of fixed size records. 
Properties are stored as a linked list of property records, each holding a key and value 
and pointing to the next property. Each node and relationship references its first property 
record. The Nodes also reference the first relationship in its relationship chain.
Each Relationship references its start and end node. 
It also references the previous and next relationship record for the start and end node 
respectively.

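This means that the time to access a given property stays the same as long as its position in the linked list stays the same. When properties are deleted, the access time may decrease, because some elements shift toward the front of the list. On insertion, as far as I know, new properties are added at the end of the list, so the position of an existing property does not change and the time to access it stays constant.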

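You can check the insertion behavior yourself: run these queries and then look at the node's properties, which appear in the order they were inserted: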

MATCH (s:Sample) SET s.a = 2 RETURN s
MATCH (s:Sample) SET s.b = 2 RETURN s
MATCH (s:Sample) SET s.c = 2 RETURN s

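In short, the access time for a property depends on its position in the linked list, so different properties have different access times, and a property near the end of a long chain costs more to reach than one near the front. For any single property, though, the access time is the same across repeated runs of the query.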
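
To make the reasoning above concrete, here is a minimal Python sketch of the linked-list model described in the quoted article. It is a toy illustration only: the `PropertyRecord` and `Node` classes and the `steps` counter are hypothetical names invented for this example, and the real Neo4j store layout may differ by version and storage format. It only shows why, under that model, lookup cost follows a property's position in the chain.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass
class PropertyRecord:
    """One link in a node's property chain (toy model, not Neo4j's real record)."""
    key: str
    value: Any
    next: Optional["PropertyRecord"] = None


class Node:
    """Toy node that only references its first property record."""

    def __init__(self) -> None:
        self.first: Optional[PropertyRecord] = None

    def set(self, key: str, value: Any) -> None:
        # Overwrite in place if the key exists; otherwise append at the tail.
        # Tail insertion is the behaviour the answer assumes, not a verified fact.
        if self.first is None:
            self.first = PropertyRecord(key, value)
            return
        record = self.first
        while True:
            if record.key == key:
                record.value = value
                return
            if record.next is None:
                record.next = PropertyRecord(key, value)
                return
            record = record.next

    def get(self, key: str) -> Tuple[Any, int]:
        # Walk the chain from the first record, counting records visited.
        steps = 0
        record = self.first
        while record is not None:
            steps += 1
            if record.key == key:
                return record.value, steps
            record = record.next
        return None, steps


node = Node()
for i in range(1000):
    node.set(f"criterionAvgVoteWeights.{i}", i)

first_value, first_steps = node.get("criterionAvgVoteWeights.0")
last_value, last_steps = node.get("criterionAvgVoteWeights.999")
print(first_steps, last_steps)
```

In this model the first property is found after visiting one record and the last after visiting all 1000, while repeating the same lookup always visits the same number of records. With 50-100k properties per node, that difference between early and late properties would be correspondingly larger.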

Latest update