When I try to load a large amount of data from MySQL, I commit each record to JanusGraph (Cassandra backend, with Elasticsearch for indexing) using 8 threads.
At the start, the program loads at about 280 records per second;
but after it has been running for a while, the rate drops to 1~10 records per second.
I have tried changing the buffer size, page size, block size, update percentage, and other configuration options, but saw no noticeable improvement.
I just want to know whether I am missing something, and what is causing this behavior...
The code below is my commit procedure; dataMap is a fastjson JSONObject and g is a JanusGraph traversal source:
// Extract fields from the incoming record.
Long countryId = dataMap.getLong("countryId");
Long uid = dataMap.getLong("uid");
String phoneNum = dataMap.getString("phoneNumber");
String fbId = dataMap.getString("fbId");
Long createTime = dataMap.getLong("createTime");
String status = dataMap.getString("status"); // assumed: status is read from dataMap like the other fields
if (uid == null) {
    return;
}
// Create the user vertex and commit it on its own.
Vertex uidVertex = g.addV("uid").next();
uidVertex.property("uid_code", uid);
if (createTime != null)
    uidVertex.property("create_time", createTime);
if (status != null)
    uidVertex.property("status", status);
g.tx().commit();
// Link the user to a phone vertex (get-or-create), then commit again.
if (phoneNum != null) {
    Vertex phoneVertex = KfkMsgParser.createMerge(g, "phone", "phone_num", phoneNum);
    Edge selfPhone = uidVertex.addEdge("user_phone", phoneVertex);
    selfPhone.property("create_time", bind.of("create_time", createTime));
    selfPhone.property("uid_code", bind.of("uid_code", uid));
    selfPhone.property("phone_num", bind.of("phone_num", phoneNum));
    g.tx().commit();
}
// Link the user to a Facebook-account vertex, then commit a third time.
if (fbId != null) {
    Vertex fbVertex = KfkMsgParser.createMerge(g, "fb_id", "fb_account", fbId);
    Edge selfFb = uidVertex.addEdge("user_fb", fbVertex);
    if (createTime != null)
        selfFb.property("create_time", bind.of("create_time", createTime));
    g.tx().commit();
}
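Note that the code above calls g.tx().commit() up to three times per record; batching many records into one transaction is a common bulk-load optimization. A minimal, framework-free sketch of the batching pattern (BatchCommitter is a hypothetical helper, not part of JanusGraph or TinkerPop):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Accumulates items and invokes a flush action (e.g. "write the batch and
// g.tx().commit() once") every batchSize items instead of per item.
class BatchCommitter<T> {
    private final int batchSize;
    private final Consumer<List<T>> flushAction;
    private final List<T> buffer = new ArrayList<>();
    private int flushCount = 0;

    BatchCommitter(int batchSize, Consumer<List<T>> flushAction) {
        this.batchSize = batchSize;
        this.flushAction = flushAction;
    }

    void add(T item) {
        buffer.add(item);
        if (buffer.size() >= batchSize) {
            flush();
        }
    }

    // Call once more after the loop to write any remaining partial batch.
    void flush() {
        if (buffer.isEmpty()) {
            return;
        }
        flushAction.accept(new ArrayList<>(buffer));
        buffer.clear();
        flushCount++;
    }

    int getFlushCount() {
        return flushCount;
    }
}
```

With a batch size of a few hundred, the flush action would add all vertices/edges for the batch and commit the transaction once, rather than three times per record.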
Here is the createMerge function:
// Get-or-create: return the existing vertex with this label/property,
// or create a new one if none exists.
private static Vertex createMerge(GraphTraversalSource g, String label, String propertyKey, Object propertyValue) {
    Optional<Vertex> vertexOptional = g.V().hasLabel(label).has(propertyKey, propertyValue).tryNext();
    if (vertexOptional.isPresent()) {
        return vertexOptional.get();
    }
    Vertex vertex = g.addV(label).next();
    vertex.property(propertyKey, propertyValue);
    return vertex;
}
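A likely contributor to the slowdown: the lookup in createMerge, g.V().hasLabel(label).has(propertyKey, propertyValue), falls back to a full scan unless a composite index exists for that property key. As the graph grows, each lookup gets slower, which would match the decay from ~280 records/s to 1~10. A sketch of defining such an index with the JanusGraph management API, run once before bulk loading (the config path and index/key names here are illustrative, not from the original post):

```java
// Sketch: composite index on "phone_num" so has("phone_num", x) becomes an
// index lookup instead of a scan. Must be created before the key is used.
JanusGraph graph = JanusGraphFactory.open("conf/janusgraph-cassandra-es.properties");
JanusGraphManagement mgmt = graph.openManagement();
PropertyKey phoneNum = mgmt.makePropertyKey("phone_num").dataType(String.class).make();
mgmt.buildIndex("byPhoneNum", Vertex.class).addKey(phoneNum).buildCompositeIndex();
mgmt.commit();
```

The same would apply to "fb_account" and any other key used in a get-or-create lookup.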
1 Answer
I made a mistake when building my indexes.
I found a thread about this on Google Groups: https://groups.google.com/forum/#!msg/janusgraph-users/vpiudlc4wno/kihm-s2aawaj
and learned that 2000~3000 records per second should be achievable.