Neo4j throws java.lang.OutOfMemoryError when inserting large amounts of data



All,

My program throws a java.lang.OutOfMemoryError when inserting large amounts of data, even though I have already applied some tuning tips such as changing JAVA_OPTS and committing transactions in batches. I heard that the JVM reduces memory usage when Neo4j commits a transaction, but that does not seem to be happening. The exception is thrown when it reaches around 7,000,000 rows. Any suggestions?

Here are my Neo4j properties:

neostore.propertystore.db.index.keys.mapped_memory=20M
neostore.propertystore.db.index.mapped_memory=20M
neostore.nodestore.db.mapped_memory=400M
neostore.relationshipstore.db.mapped_memory=1000M
neostore.propertystore.db.mapped_memory=400M
neostore.propertystore.db.strings.mapped_memory=400M

Here are my JVM opts:

java -jar -server -Xmx2G -XX:+UseConcMarkSweepGC neodataio.jar $@

Here is my code:

public Node createNode(String type, String v) {
    stype = type;
    UniqueFactory.UniqueNodeFactory factory = new UniqueFactory.UniqueNodeFactory(db, type) {
        @Override
        protected void initialize(Node created, Map<String, Object> properties) {
            created.addLabel(DynamicLabel.label(stype));
            created.setProperty("v", properties.get(stype));
        }
    };
    return factory.getOrCreate(type, v);
}

private void processLine(String line) {
    line = stripeStr(line);
    String[] fields = line.split("[" + splitor + "]");
    List<Node> row = new ArrayList<Node>();
    Map<String, Boolean> unqi = new HashMap<String, Boolean>();
    for (String field : fields) {
        String[] kvs = field.split("[" + kv + "]");
        if (kvs.length == 2
                && !unqi.containsKey(kvs[1])
                && !stripeStr(kvs[1]).equals("")
                && !stripeStr(kvs[1]).toLowerCase().equals("null")) {
            Node n = createNode(stripeStr(kvs[0]), stripeStr(kvs[1]));
            row.add(n);
            unqi.put(kvs[1], true);
        }
    }
    if (row.size() > 1) {
        for (int i = 1; i < row.size(); i++) {
            row.get(0).createRelationshipTo(row.get(i), Importer.connect);
        }
    }
}

private void processBatch(ArrayList<String> batch) {
    Transaction tx = db.beginTx();
    try {
        for (String line : batch) {
            processLine(line);
        }
        tx.success();
    } finally {
        tx.close();
    }
}

private String stripeStr(String str) {
    // strip newlines and tabs ("\n"/"\t", not the literal letters "n"/"t")
    return str.trim().replace("\n", "").replace("\t", "");
}

public void processFile(String filepth) throws IOException {
    long begin = new Date().getTime();
    File f = new File(filepth);
    FileInputStream fi = new FileInputStream(f);
    BufferedReader dr = new BufferedReader(new InputStreamReader(fi));
    String line;
    int i = 1;
    ArrayList<String> batch = new ArrayList<String>();
    while ((line = dr.readLine()) != null) {
        batch.add(line);
        if (i % batchsize == 0) {
            processBatch(batch);
            batch = new ArrayList<String>();
            System.out.println(i);
        }
        i++;
    }
    processBatch(batch); // commit the final partial batch
    System.out.println(i);
    long end = new Date().getTime();
    System.out.println("cost time:" + (end - begin));
}

The exception:

Exception in thread "GC-Monitor" java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOfRange(Arrays.java:2694)
at java.lang.String.<init>(String.java:203)
at java.lang.StringBuilder.toString(StringBuilder.java:405)
at org.neo4j.kernel.impl.cache.MeasureDoNothing.run(MeasureDoNothing.java:84)
Exception in thread "main" org.neo4j.graphdb.TransactionFailureException: Unable to commit transaction
at org.neo4j.kernel.TopLevelTransaction.close(TopLevelTransaction.java:140)
at com.bfd.finance.neo4j.dataio.Importer.processBatch(Importer.java:79)
at com.bfd.finance.neo4j.dataio.Importer.processFile(Importer.java:98)
at com.bfd.finance.neo4j.dataio.Importer.main(Importer.java:161)
Caused by: org.neo4j.graphdb.TransactionFailureException: commit threw exception
at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:498)
at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:397)
at org.neo4j.kernel.impl.transaction.TransactionImpl.commit(TransactionImpl.java:122)
at org.neo4j.kernel.TopLevelTransaction.close(TopLevelTransaction.java:124)
... 3 more
Caused by: javax.transaction.xa.XAException
at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:553)
at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:460)
... 6 more
Caused by: java.lang.OutOfMemoryError: Java heap space
at java.util.HashMap.createEntry(HashMap.java:901)
at java.util.HashMap.putForCreate(HashMap.java:554)
at java.util.HashMap.putAllForCreate(HashMap.java:559)
at java.util.HashMap.<init>(HashMap.java:298)
at org.neo4j.kernel.impl.nioneo.xa.WriteTransaction.applyCommit(WriteTransaction.java:817)
at org.neo4j.kernel.impl.nioneo.xa.WriteTransaction.doCommit(WriteTransaction.java:751)
at org.neo4j.kernel.impl.transaction.xaframework.XaTransaction.commit(XaTransaction.java:322)
at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.commitWriteTx(XaResourceManager.java:530)
at org.neo4j.kernel.impl.transaction.xaframework.XaResourceManager.commit(XaResourceManager.java:446)
at org.neo4j.kernel.impl.transaction.xaframework.XaResourceHelpImpl.commit(XaResourceHelpImpl.java:64)
at org.neo4j.kernel.impl.transaction.TransactionImpl.doCommit(TransactionImpl.java:545)
... 7 more

What we do is commit the transaction every 5000 nodes, which works very well. The obvious drawback is that if node 5001 fails, there is no way to roll back the first 5000 nodes.
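The commit-every-5000-nodes pattern above can be sketched as a plain-Java skeleton. The Neo4j-specific work (beginTx, per-line node creation, tx.success/close) is elided into the `commit` callback so the control flow stands on its own; `BatchCommitSketch` and its method names are illustrative, not part of any Neo4j API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Sketch of periodic commits: flush every `batchSize` lines, then flush
// whatever remains once at the end, so each transaction stays small enough
// that its in-heap state does not exhaust the heap.
public class BatchCommitSketch {

    public static int process(Iterable<String> lines, int batchSize,
                              Consumer<List<String>> commit) {
        List<String> batch = new ArrayList<>();
        int commits = 0;
        for (String line : lines) {
            batch.add(line);
            if (batch.size() == batchSize) {  // one transaction per full batch
                commit.accept(batch);
                commits++;
                batch = new ArrayList<>();    // drop references so the GC can reclaim them
            }
        }
        if (!batch.isEmpty()) {               // final partial batch
            commit.accept(batch);
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < 12345; i++) lines.add("row" + i);
        // In real code the callback would open a Transaction, call
        // processLine for each entry, mark success, and close it.
        int commits = process(lines, 5000, b -> { /* beginTx; ...; tx.close() */ });
        System.out.println("commits: " + commits); // 2 full batches + 1 remainder = 3
    }
}
```

Keeping `batchSize` modest matters because, as the stack trace shows, the commit itself allocates per-transaction state (the `HashMap` in `WriteTransaction.applyCommit`) proportional to the batch.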

As for the BatchInserter: you can use it if you are importing one-off data with a standalone program and do not need the database to be available to serve other requests at the same time. For all other large-import use cases, the BatchInserter will not help you.
