Neo4j: batch processing a JSON file

Asked by cgh8pdjw 12 months ago, in Other

I have a Cypher query that reads a JSON file and processes it to create or merge nodes and relationships:

// Read the json file
CALL apoc.load.json('file:///test/test.json') YIELD value

// Read the json fields as variables
with value.globalcustid as v_globalcustid, value.member_node_properties as member_node_properties, value.relationships as relationships

// Update the member node demographic properties, if the member node does not exist, create it
MERGE (m:Member {globalcustid: v_globalcustid})
ON CREATE SET m.age_group=member_node_properties.age_group, m.gender=member_node_properties.gender, m.education=member_node_properties.education
ON MATCH SET m.age_group=member_node_properties.age_group, m.gender=member_node_properties.gender, m.education=member_node_properties.education

// Unwind the relationships array into multiple rows of relationship
with v_globalcustid, m, member_node_properties, relationships
UNWIND relationships as r

// Get the tag node and skip the null tag
with m, v_globalcustid, r, r.node.label as node_label, r.node.tag_id as tag_id
where tag_id is not null
CALL apoc.cypher.run(
    "MATCH (t:" + node_label + " {tag_id: '" + tag_id +"'}) RETURN t limit 1", 
    {}
) yield value as tag_node

// Merge the relationship of member node and tag node
with m, v_globalcustid, r, tag_node.t as tag_node
CALL apoc.merge.relationship(
  m, 
  r.relationship.label,
  {createdate: r.relationship.createdate, enddate: r.relationship.enddate, value: r.relationship.value},
  {},
  tag_node,
  {}
) YIELD rel

return *

The query works fine on a small JSON file, but on a large file it fails with an out-of-memory error, so I decided to run it in batches:

:auto

// Read the json file
CALL apoc.load.json('file:///test/test.json') YIELD value

// Read the json fields as variables
with value.globalcustid as v_globalcustid, value.member_node_properties as member_node_properties, value.relationships as relationships

call {
with v_globalcustid, member_node_properties, relationships

// Update the member node demographic properties, if the member node does not exist, create it
MERGE (m:Member {globalcustid: v_globalcustid})
ON CREATE SET m.age_group=member_node_properties.age_group, m.gender=member_node_properties.gender, m.education=member_node_properties.education
ON MATCH SET m.age_group=member_node_properties.age_group, m.gender=member_node_properties.gender, m.education=member_node_properties.education

// Unwind the relationships array into multiple rows of relationship
with v_globalcustid, m, member_node_properties, relationships
UNWIND relationships as r

// Get the tag node and skip the null tag
with m, v_globalcustid, r, r.node.label as node_label, r.node.tag_id as tag_id
where tag_id is not null
CALL apoc.cypher.run(
    "MATCH (t:" + node_label + " {tag_id: '" + tag_id +"'}) RETURN t limit 1", 
    {}
) yield value as tag_node

// Merge the relationship of member node and tag node
with m, v_globalcustid, r, tag_node.t as tag_node
CALL apoc.merge.relationship(
  m, 
  r.relationship.label,
  {createdate: r.relationship.createdate, enddate: r.relationship.enddate, value: r.relationship.value},
  {},
  tag_node,
  {}
) YIELD rel

return *
} in transactions of 10000 rows


But it looks like calling another procedure inside the CALL {} block does not work:

Query cannot conclude with CALL together with YIELD (line 31, column 1 (offset: 1319))
"CALL apoc.merge.relationship("
 ^

Answer #1 (uklbhaso)

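Your second query should fail in any case, because RETURN * projects variables that are already declared in the outer scope, such as v_globalcustid.
To fix your original problem, just return null: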

CALL {
  // other stuff
  CALL apoc.merge.relationship(
    m, 
    r.relationship.label,
    { 
      createdate: r.relationship.createdate, 
      enddate: r.relationship.enddate, 
      value: r.relationship.value
    },
    {},
    tag_node,
    {}
  ) YIELD rel
  RETURN null AS discard
} IN TRANSACTIONS OF 10000 ROWS
RETURN null

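This is necessary because a CALL subquery must end with either a RETURN or a write clause (such as CREATE). Although apoc.merge.relationship does perform a write, Cypher treats a procedure call with YIELD as a read operation that must return values.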
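Putting both points together, the full batched query might look roughly like the sketch below. It has not been run against the asker's data and makes several assumptions. Only `value` is imported into the subquery, so nothing returned from it clashes with the outer scope. The aliases `globalcustid`, `props`, `found`, `merged` and `relationships_merged` are illustrative names, not from the original. The subquery returns a count rather than null, purely so the final result is informative. The tag id is passed to apoc.cypher.run as a parameter instead of being concatenated into the query string, with `toString` used to keep the string comparison of the original. The label is wrapped in backticks.

```cypher
:auto
// Read the json file; each top-level object becomes one row
CALL apoc.load.json('file:///test/test.json') YIELD value

CALL {
  // Import only `value`, then alias its fields inside the subquery
  WITH value
  WITH value.globalcustid AS globalcustid,
       value.member_node_properties AS props,
       value.relationships AS relationships

  // ON CREATE and ON MATCH set the same properties, so a plain SET is enough
  MERGE (m:Member {globalcustid: globalcustid})
  SET m.age_group = props.age_group,
      m.gender    = props.gender,
      m.education = props.education

  // One row per relationship, skipping entries without a tag id
  WITH m, relationships
  UNWIND relationships AS r
  WITH m, r
  WHERE r.node.tag_id IS NOT NULL

  // The label has to be built dynamically; the tag id is passed as a parameter
  CALL apoc.cypher.run(
    'MATCH (t:`' + r.node.label + '` {tag_id: $tag_id}) RETURN t LIMIT 1',
    {tag_id: toString(r.node.tag_id)}
  ) YIELD value AS found

  WITH m, r, found.t AS tag_node
  CALL apoc.merge.relationship(
    m,
    r.relationship.label,
    {createdate: r.relationship.createdate, enddate: r.relationship.enddate, value: r.relationship.value},
    {},
    tag_node,
    {}
  ) YIELD rel

  // Explicit RETURN so the subquery does not end on CALL ... YIELD
  RETURN count(rel) AS merged
} IN TRANSACTIONS OF 10000 ROWS
RETURN sum(merged) AS relationships_merged
```

Note that the batch size counts rows coming out of apoc.load.json, that is, top-level JSON objects, not individual relationships. If each object carries a large relationships array, a smaller batch size may be needed to stay within memory.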
