Neo4j: process a JSON file in batches

Posted by cgh8pdjw on 2024-01-07 in Other


I have a Cypher query that reads a JSON file and does some processing to create/merge nodes and relationships:

// Read the json file
CALL apoc.load.json('file:///test/test.json') YIELD value

// Read the json fields as variables
with value.globalcustid as v_globalcustid, value.member_node_properties as member_node_properties, value.relationships as relationships

// Update the member node demographic properties, if the member node does not exist, create it
MERGE (m:Member {globalcustid: v_globalcustid})
ON CREATE SET m.age_group=member_node_properties.age_group, m.gender=member_node_properties.gender, m.education=member_node_properties.education
ON MATCH SET m.age_group=member_node_properties.age_group, m.gender=member_node_properties.gender, m.education=member_node_properties.education

// Unwind the relationships array into multiple rows of relationship
with v_globalcustid, m, member_node_properties, relationships
UNWIND relationships as r

// Get the tag node and skip the null tag
with m, v_globalcustid, r, r.node.label as node_label, r.node.tag_id as tag_id
where tag_id is not null
CALL apoc.cypher.run(
    "MATCH (t:" + node_label + " {tag_id: '" + tag_id +"'}) RETURN t limit 1", 
    {}
) yield value as tag_node

// Merge the relationship of member node and tag node
with m, v_globalcustid, r, tag_node.t as tag_node
CALL apoc.merge.relationship(
  m, 
  r.relationship.label,
  {createdate: r.relationship.createdate, enddate: r.relationship.enddate, value: r.relationship.value},
  {},
  tag_node,
  {}
) YIELD rel

return *

The query works fine with a small JSON file, but when the JSON file is too large it fails with an out-of-memory error, so I decided to run it in batches.

:auto

// Read the json file
CALL apoc.load.json('file:///test/test.json') YIELD value

// Read the json fields as variables
with value.globalcustid as v_globalcustid, value.member_node_properties as member_node_properties, value.relationships as relationships

call {
with v_globalcustid, member_node_properties, relationships

// Update the member node demographic properties, if the member node does not exist, create it
MERGE (m:Member {globalcustid: v_globalcustid})
ON CREATE SET m.age_group=member_node_properties.age_group, m.gender=member_node_properties.gender, m.education=member_node_properties.education
ON MATCH SET m.age_group=member_node_properties.age_group, m.gender=member_node_properties.gender, m.education=member_node_properties.education

// Unwind the relationships array into multiple rows of relationship
with v_globalcustid, m, member_node_properties, relationships
UNWIND relationships as r

// Get the tag node and skip the null tag
with m, v_globalcustid, r, r.node.label as node_label, r.node.tag_id as tag_id
where tag_id is not null
CALL apoc.cypher.run(
    "MATCH (t:" + node_label + " {tag_id: '" + tag_id +"'}) RETURN t limit 1", 
    {}
) yield value as tag_node

// Merge the relationship of member node and tag node
with m, v_globalcustid, r, tag_node.t as tag_node
CALL apoc.merge.relationship(
  m, 
  r.relationship.label,
  {createdate: r.relationship.createdate, enddate: r.relationship.enddate, value: r.relationship.value},
  {},
  tag_node,
  {}
) YIELD rel

return *
} in transactions of 10000 rows

But it looks like calling another procedure inside the call {} block does not work.

Query cannot conclude with CALL together with YIELD (line 31, column 1 (offset: 1319))
"CALL apoc.merge.relationship("
 ^


neo4j

Source: https://stackoverflow.com/questions/77651129/neo4j-process-a-json-file-in-batch

1 Answer


uklbhaso1#

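Your second query should fail because RETURN * projects variables that are already declared in the outer scope, such as v_globalcustid.
To fix your original problem, just return null: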

CALL {
  // other stuff
  CALL apoc.merge.relationship(
    m, 
    r.relationship.label,
    { 
      createdate: r.relationship.createdate, 
      enddate: r.relationship.enddate, 
      value: r.relationship.value
    },
    {},
    tag_node,
    {}
  ) YIELD rel
  RETURN null AS discard
} IN TRANSACTIONS OF 10000 ROWS
RETURN null

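This is necessary because a CALL subquery must end with either a RETURN or a write operation (such as CREATE). Although apoc.merge.relationship does perform a write, Cypher treats it as a read operation that must return values.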
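For reference, here is a sketch of how the whole batched query from the question might look with this fix applied. It is not tested against the asker's data. The names merged and relationships_merged are illustrative, returning count(rel) instead of null is a choice made here so that the outer query can report a total, and the identical ON CREATE SET / ON MATCH SET pair is collapsed into a single SET, which should behave the same way. It assumes Neo4j 4.4 or later (for CALL { ... } IN TRANSACTIONS) with APOC installed.

```cypher
// :auto is only needed in Neo4j Browser, so the query runs in an implicit transaction
:auto
CALL apoc.load.json('file:///test/test.json') YIELD value
WITH value.globalcustid AS v_globalcustid,
     value.member_node_properties AS member_node_properties,
     value.relationships AS relationships

CALL {
  // Import the outer variables the subquery needs
  WITH v_globalcustid, member_node_properties, relationships

  // Create the member node if missing, then refresh its properties either way
  MERGE (m:Member {globalcustid: v_globalcustid})
  SET m.age_group = member_node_properties.age_group,
      m.gender = member_node_properties.gender,
      m.education = member_node_properties.education

  // One row per relationship, skipping entries without a tag_id
  WITH m, relationships
  UNWIND relationships AS r
  WITH m, r, r.node.label AS node_label, r.node.tag_id AS tag_id
  WHERE tag_id IS NOT NULL

  // Look up the tag node by its dynamic label (unchanged from the question)
  CALL apoc.cypher.run(
    "MATCH (t:" + node_label + " {tag_id: '" + tag_id + "'}) RETURN t LIMIT 1",
    {}
  ) YIELD value AS tag_row

  WITH m, r, tag_row.t AS tag_node
  CALL apoc.merge.relationship(
    m,
    r.relationship.label,
    {createdate: r.relationship.createdate, enddate: r.relationship.enddate, value: r.relationship.value},
    {},
    tag_node,
    {}
  ) YIELD rel

  // End with RETURN, using a name that does not exist in the outer scope
  RETURN count(rel) AS merged
} IN TRANSACTIONS OF 10000 ROWS
RETURN sum(merged) AS relationships_merged
```

The two changes that matter are inside the subquery: it now ends with a RETURN rather than with CALL ... YIELD, and it returns only a new name instead of RETURN *, so nothing collides with v_globalcustid and the other outer variables.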

Answered 2024-01-07
