Neo4j: computing a frequency embedding from the properties of one-hop neighbor nodes

Posted by hrysbysz on 2022-11-05 in Other

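I have a graph like the one below.

There are two types of nodes: A and B. Nodes labeled A have an id property. Nodes labeled B have a State property. Assume State can only take a value from [good, bad, average].
How can I use Cypher to compute a frequency embedding for the A nodes? For example, A1 should get a property embedding = [2,0,0], and A2 should get a property embedding = [2,0,1].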

b09cbbtk  #1

One approach is:

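// Find every A node together with its one-hop B neighbors
MATCH (n:A)-->(t:B)
// Collect the neighbor states per A node
WITH collect(t.state) AS state, n
// Split the collected states into one list per state value
WITH [val IN state WHERE val = "good"] AS good, [val IN state WHERE val = "average"] AS avg, [val IN state WHERE val = "bad"] AS bad, n
WITH size(good) AS goodCount, size(avg) AS avgCount, size(bad) AS badCount, n
// The embedding order is [good, bad, average]
SET n.embedding = [goodCount, badCount, avgCount]
RETURN n.key, n.embedding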

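This finds all A nodes and their B neighbors, collects the B states into one list per A node, and then builds a separate list for each state. Next, it takes the size of each state list and SETs the embedding value in the order [good, bad, average]. The last part returns each A node's embedding.
You can check it on the following sample data: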

MERGE (a:A{key: 1})
MERGE (b:A{key: 2})
MERGE (c:B{key: 3, state: 'good'})
MERGE (d:B{key: 4, state: 'good'})
MERGE (e:B{key: 5, state: 'average'})
MERGE (f:B{key: 6, state: 'good'})
MERGE (g:B{key: 7, state: 'good'})

MERGE (a)-[:HAS]-(c)
MERGE (a)-[:HAS]-(d)
MERGE (b)-[:HAS]-(e)
MERGE (b)-[:HAS]-(f)
MERGE (b)-[:HAS]-(g)
  • Note that this sample uses key instead of id

It returns:

╒═══════╤═════════════╕
│"n.key"│"n.embedding"│
╞═══════╪═════════════╡
│1      │[2,0,0]      │
├───────┼─────────────┤
│2      │[2,0,1]      │
└───────┴─────────────┘
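
To cross-check these vectors, one option is to list the raw per-state counts for each A node and compare them with the stored embedding. This is only a sanity-check sketch; the aliases key, state and freq are introduced here for illustration, and it assumes the sample data above with its key and state properties.

```cypher
// Per-node, per-state neighbor counts.
// States with no matching neighbor produce no row at all.
MATCH (n:A)-->(t:B)
RETURN n.key AS key, t.state AS state, count(*) AS freq
ORDER BY key, state
```

On the sample data this gives one row for key 1 (good, 2) and two rows for key 2 (average, 1 and good, 2), which lines up with [2,0,0] and [2,0,1].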

If you have many possible states, you can do the following instead:

MATCH(n:A)-->(t:B)
WITH apoc.coll.indexOf(["good", "bad", "average"], t.state) as inx, n, [0,0,0] as k
WITH apoc.coll.set(k, inx, 1) AS k, n
WITH collect(k) as kk, n
WITH REDUCE(s = [], sublist IN kk | CASE
    WHEN SIZE(s) = 0 THEN sublist
    ELSE [i IN RANGE(0, SIZE(s)-1) | s[i] + sublist[i]]
    END) AS result, n
SET n.embedding = result
RETURN n.key, n.embedding

Inspired by this answer by @cybersam.
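
For reference, the query above one-hot encodes each B neighbor and then sums the one-hot lists position by position. The same counts can also be sketched in plain Cypher without APOC, assuming the category list is supplied as a literal (the names states and seen are introduced here for illustration). Using OPTIONAL MATCH additionally gives A nodes that have no B neighbors an all-zero embedding, whereas the MATCH-based queries skip such nodes.

```cypher
// Position order of the embedding: [good, bad, average], as in the question.
WITH ['good', 'bad', 'average'] AS states
MATCH (n:A)
// OPTIONAL MATCH keeps A nodes that have no B neighbors.
OPTIONAL MATCH (n)-->(t:B)
// collect() drops nulls, so isolated A nodes end up with an empty list.
WITH n, states, collect(t.state) AS seen
// For each category, count how many collected states equal it.
SET n.embedding = [s IN states | size([x IN seen WHERE x = s])]
RETURN n.key, n.embedding
```

Because the embedding length follows the states list, adding a category here only means extending that list.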

pw9qyyiw  #2

To extend this to more state categories, we can build on Nimrod's solution.
Set up a set of reference nodes that hold your categories:

create (n1:ref{name:'good',order:1})
create (n2:ref{name:'bad',order:2})
create (n3:ref{name:'average',order:3})

The query then becomes:

match (c:ref) 
with c order by c.order
with collect(c.name) as cn
MATCH(n:A)-->(t:B)
WITH apoc.coll.indexOf(cn, t.state) as inx, n, [x IN cn | 0] as k
WITH apoc.coll.set(k, inx, 1) AS k, n
WITH collect(k) as kk, n
WITH REDUCE(s = [], sublist IN kk | CASE
    WHEN SIZE(s) = 0 THEN sublist
    ELSE [i IN RANGE(0, SIZE(s)-1) | s[i] + sublist[i]]
    END) AS result, n
SET n.embedding = result
RETURN n.key, n.embedding

Alternatively, the query can start with a literal list:

with ['good','bad', 'average'] as cn

In either case, you can add as many categories as you need, provided the zero-initialized list k has one entry per category.
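
As a rough sketch of what adding a category involves with the reference-node setup, assume a hypothetical fourth state named 'unknown' (it does not appear in the original question). One more ref node is added, and the zero list is derived from cn so its length tracks the number of categories.

```cypher
// 'unknown' is a hypothetical fourth category, used only for illustration.
MERGE (r:ref {name: 'unknown'})
  ON CREATE SET r.order = 4;

// Build the ordered category list and a zero list of matching length.
MATCH (c:ref)
WITH c ORDER BY c.order
WITH collect(c.name) AS cn
RETURN cn, [x IN cn | 0] AS zeros;
```

The two statements are meant to be run one after the other, for example in cypher-shell or in Neo4j Browser with multi-statement input enabled.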
