Is there a way to speed up LOAD CSV of 120M relationships into 10M nodes avoiding cartesian product in Neo4j?

Problem description

I am trying to create 120M relationships between 10M (:Homes) nodes. I have already created all the (:Homes) nodes and an index on (:Homes).id:

CREATE INDEX ON :Homes(id)

This is the query I use to load the relationships from a local CSV file. Each row in the CSV file has a home1_id and a home2_id, and I want to create a relationship home1 --> home2 for each row:

USING PERIODIC COMMIT 50000
LOAD CSV WITH HEADERS FROM "file:///relationships.csv" AS row
MATCH (home1:Homes {id: toInteger(row.home1_id)}),(home2:Homes {id: toInteger(row.home2_id)})
CREATE (home1)-[:Recommends]->(home2)

At the current rate, this looks like it will take 1-2 hours to finish. Are there any optimizations I can make?

Tags: csv, neo4j

Solution
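
Before changing the import itself, it may help to look at the query plan. The following is a minimal sketch, assuming the same file name, label and index as in the question. Prefixing the statement with EXPLAIN shows the plan without executing anything, so USING PERIODIC COMMIT is left out here. Splitting the pattern into two MATCH clauses is a stylistic choice for readability, not a confirmed speed-up, and nothing in the question tells us how much of the 1-2 hour estimate is spent on lookups versus writes.

```cypher
// Show the plan only; no rows are read from the file and nothing is written.
EXPLAIN
LOAD CSV WITH HEADERS FROM "file:///relationships.csv" AS row
// Each MATCH would be expected to appear as a NodeIndexSeek on :Homes(id).
// A NodeByLabelScan in its place would suggest the index is missing
// or not yet online, so every row scans all 10M nodes.
MATCH (home1:Homes {id: toInteger(row.home1_id)})
MATCH (home2:Homes {id: toInteger(row.home2_id)})
CREATE (home1)-[:Recommends]->(home2)
```

If both lookups show up as index seeks, each CSV row resolves to one node on each side, so any CartesianProduct operator in the plan combines only a single pair per row and is unlikely to be the bottleneck.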

