Titan + DynamoDB Local is very slow (8 s commit for 30 vertices)



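I am developing an application that uses Titan + DynamoDB Local as the storage backend. Previously I used Berkeley as the storage backend, and it performed very well. Since switching, I can't even use the application because everything takes so long.

A simple test case adds 30 vertices and commits the changes; this is what I get: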

//Berkeley Backend
Committing took 0.092 seconds.
//DynamoDB Local Backend    
Committing took 8.684 seconds.

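Subsequent commits take just as long. I know DynamoDB Local won't be as fast as Berkeley, since it is just a wrapped SQL database meant for development, but it has to be faster than this. I can only guess that something is wrong with my configuration; on the other hand, my configuration is taken directly from their repository. Running DynamoDB Local in memory improves performance only marginally.

I have already tried increasing the read/write capacity units, with no luck. Is anyone running a reasonably fast version of DynamoDB locally?

This is my current configuration: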

#general Titan configuration
gremlin.graph=com.thinkaurelius.titan.core.TitanFactory
storage.setup-wait=60000
storage.buffer-size=1024
# Metrics configuration - http://s3.thinkaurelius.com/docs/titan/1.0.0/titan-config-ref.html#_metrics
#metrics.enabled=true
#metrics.prefix=t
# Required; specify logging interval in milliseconds
#metrics.csv.interval=500
#metrics.csv.directory=metrics
# Turn off titan retries as we batch and have our own exponential backoff strategy.
storage.write-time=1 ms
storage.read-time=1 ms
storage.backend=com.amazon.titan.diskstorage.dynamodb.DynamoDBStoreManager
#Amazon DynamoDB Storage Backend for Titan configuration
storage.dynamodb.force-consistent-read=true
# should be the graph name rexster/graphs/graph/graph-name
storage.dynamodb.prefix=v100
storage.dynamodb.metrics-prefix=d
storage.dynamodb.enable-parallel-scans=false
storage.dynamodb.max-self-throttled-retries=60
storage.dynamodb.control-plane-rate=10
# DynamoDB client configuration: credentials
storage.dynamodb.client.credentials.class-name=com.amazonaws.auth.BasicAWSCredentials
storage.dynamodb.client.credentials.constructor-args=access,secret
# DynamoDB client configuration: endpoint (below, set to DynamoDB Local as invoked by mvn test -Pstart-dynamodb-local).
# You can change the endpoint to point to production DynamoDB regions.
storage.dynamodb.client.endpoint=http://localhost:1234
# max http connections - not recommended to use more than 250 connections in DynamoDB Local
storage.dynamodb.client.connection-max=250
# turn off sdk retries
storage.dynamodb.client.retry-error-max=0
# DynamoDB client configuration: thread pool
storage.dynamodb.client.executor.core-pool-size=25
# Do not need more threads in thread pool than the number of http connections
storage.dynamodb.client.executor.max-pool-size=250
storage.dynamodb.client.executor.keep-alive=60000
storage.dynamodb.client.executor.max-concurrent-operations=1
# should be at least as large as the storage.buffer-size
storage.dynamodb.client.executor.max-queue-length=1024
# 750 r/w CU result in provisioning the maximum equal numbers read and write Capacity Units that can
# be set on one table before it is split into two or more partitions for IOPS. If you will have more than one Rexster server
# accessing the same graph, you should set the read-rate and write-rate properties to values commensurately lower than the
# read and write capacity of the backend tables.
storage.dynamodb.stores.edgestore.capacity-read=100
storage.dynamodb.stores.edgestore.capacity-write=100
storage.dynamodb.stores.edgestore.read-rate=100
storage.dynamodb.stores.edgestore.write-rate=100
storage.dynamodb.stores.edgestore.scan-limit=10000
storage.dynamodb.stores.graphindex.capacity-read=100
storage.dynamodb.stores.graphindex.capacity-write=100
storage.dynamodb.stores.graphindex.read-rate=100
storage.dynamodb.stores.graphindex.write-rate=100
storage.dynamodb.stores.graphindex.scan-limit=10000
storage.dynamodb.stores.systemlog.capacity-read=10
storage.dynamodb.stores.systemlog.capacity-write=10
storage.dynamodb.stores.systemlog.read-rate=10
storage.dynamodb.stores.systemlog.write-rate=10
storage.dynamodb.stores.systemlog.scan-limit=10000
storage.dynamodb.stores.titan_ids.capacity-read=10
storage.dynamodb.stores.titan_ids.capacity-write=10
storage.dynamodb.stores.titan_ids.read-rate=10
storage.dynamodb.stores.titan_ids.write-rate=10
storage.dynamodb.stores.titan_ids.scan-limit=10000
storage.dynamodb.stores.system_properties.capacity-read=10
storage.dynamodb.stores.system_properties.capacity-write=10
storage.dynamodb.stores.system_properties.read-rate=10
storage.dynamodb.stores.system_properties.write-rate=10
storage.dynamodb.stores.system_properties.scan-limit=10000
storage.dynamodb.stores.txlog.capacity-read=10
storage.dynamodb.stores.txlog.capacity-write=10
storage.dynamodb.stores.txlog.read-rate=10
storage.dynamodb.stores.txlog.write-rate=10
storage.dynamodb.stores.txlog.scan-limit=10000
# elasticsearch config that is required to run GraphOfTheGods
index.search.backend=elasticsearch
index.search.directory=/tmp/searchindex
index.search.elasticsearch.client-only=false
index.search.elasticsearch.local-mode=true
index.search.elasticsearch.interface=NODE
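
The comments in the configuration above state two constraints: the executor queue length should be at least as large as storage.buffer-size, and the thread pool does not need more threads than there are HTTP connections. A minimal sketch of checking them programmatically follows. The class ConfigCheck and its method names are hypothetical helpers, not part of Titan or the DynamoDB storage backend; only the property keys come from the configuration itself.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper (an assumption for illustration, not a Titan API).
public class ConfigCheck {

    // Returns a list of human-readable problems; an empty list means both constraints hold.
    static List<String> check(Properties p) {
        List<String> problems = new ArrayList<>();
        int bufferSize = intOf(p, "storage.buffer-size");
        int queueLength = intOf(p, "storage.dynamodb.client.executor.max-queue-length");
        int maxPoolSize = intOf(p, "storage.dynamodb.client.executor.max-pool-size");
        int connectionMax = intOf(p, "storage.dynamodb.client.connection-max");

        if (queueLength < bufferSize) {
            problems.add("max-queue-length (" + queueLength + ") is smaller than buffer-size (" + bufferSize + ")");
        }
        if (maxPoolSize > connectionMax) {
            problems.add("max-pool-size (" + maxPoolSize + ") exceeds connection-max (" + connectionMax + ")");
        }
        return problems;
    }

    // Reads a required integer property.
    static int intOf(Properties p, String key) {
        return Integer.parseInt(p.getProperty(key).trim());
    }

    public static void main(String[] args) throws IOException {
        // The four values below are copied from the configuration in the question.
        String config =
              "storage.buffer-size=1024\n"
            + "storage.dynamodb.client.connection-max=250\n"
            + "storage.dynamodb.client.executor.max-pool-size=250\n"
            + "storage.dynamodb.client.executor.max-queue-length=1024\n";
        Properties p = new Properties();
        p.load(new StringReader(config));
        System.out.println("problems: " + check(p));
    }
}
```

With the values from the question, both constraints hold, so a check of this kind would not by itself explain the slow commits.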

Test code:

private long start;

@Test
public void performanceTest() {
    Graph graph = GraphFactory.getDefault();
    for (int i = 1; i <= 30; i++) {
        graph.addVertex("myLabel");
    }
    startStopwatch();
    graph.tx().commit();
    stopStopwatch("Committing");
}

private void startStopwatch() {
    start = System.currentTimeMillis();
}

private void stopStopwatch(String opName) {
    System.out.println(opName + " took " + (System.currentTimeMillis() - start) / 1000.0 + " seconds.");
}
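
Since the first commit against a fresh backend may include one-time setup cost, it can help to time several commits in a row and compare the first run with the later ones. The following is only a sketch: CommitTimer and timeRuns are hypothetical names, and the workload in main is a placeholder sleep that stands in for adding the vertices and calling graph.tx().commit().

```java
import java.util.concurrent.TimeUnit;

// Hypothetical timing helper (an assumption for illustration, not part of Titan).
public class CommitTimer {

    // Runs the action `runs` times and returns each run's elapsed time in milliseconds.
    static long[] timeRuns(Runnable action, int runs) {
        long[] elapsedMs = new long[runs];
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime(); // monotonic clock, better suited to intervals than currentTimeMillis
            action.run();
            elapsedMs[i] = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        }
        return elapsedMs;
    }

    public static void main(String[] args) {
        // Placeholder workload (assumption): in the real test this would add
        // 30 vertices and call graph.tx().commit().
        Runnable work = () -> {
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };

        long[] timings = timeRuns(work, 5);
        for (int i = 0; i < timings.length; i++) {
            System.out.println("run " + i + " took " + timings[i] + " ms");
        }
    }
}
```

If the later runs are as slow as the first, as reported above, one-time initialization is unlikely to be the cause.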

Versions

  • Titan 1.0.0
  • DynamoDB storage backend 1.0.2
  • DynamoDB Local 1.11.86
  • Berkeley 1.0.0

Related

  • DynamoDB Local poor performance
  • Poor write performance with DynamoDB Local

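I created an issue on GitHub to track reproducing the latency you are seeing. When I ran your code, I measured a commit time of only 84 ms on the command line, and even including all setup and static Java initialization code, the end-to-end test took only 5 seconds. Please try the master branch.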

final Graph graph = JanusGraphFactory.open(TestGraphUtil.instance.createTestGraphConfig(BackendDataModel.MULTI));
// IntStream.of(30) would yield the single value 30; range(0, 30) adds 30 vertices.
IntStream.range(0, 30).forEach(i -> graph.addVertex(LABEL));
Stopwatch watch = Stopwatch.createStarted();
graph.tx().commit();
System.out.println("Committing took " + watch.stop().elapsed(TimeUnit.MILLISECONDS) + " ms");
TestGraphUtil.instance.cleanUpTables();

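DynamoDB Local is a testing tool, and you should not expect to see the same low latencies as when using the DynamoDB service. That said, you will get the best performance out of DynamoDB Local when you use the inMemory option.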
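
As a rough sketch of what using the inMemory option could look like, the snippet below assembles the command line for starting DynamoDB Local with the -inMemory flag on port 1234, the port used by the endpoint in the question's configuration. The class DynamoLocalLauncher is a hypothetical helper, and the paths DynamoDBLocal.jar and ./DynamoDBLocal_lib assume the standard downloadable archive unpacked in the working directory. The sketch only prints the command and does not start a process.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (an assumption for illustration).
public class DynamoLocalLauncher {

    // Builds the argument list for launching DynamoDB Local.
    static List<String> command(int port, boolean inMemory) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-Djava.library.path=./DynamoDBLocal_lib"); // assumed location of the native libraries
        cmd.add("-jar");
        cmd.add("DynamoDBLocal.jar");                       // assumed location of the jar
        cmd.add("-port");
        cmd.add(Integer.toString(port));
        if (inMemory) {
            cmd.add("-inMemory"); // keep all data in memory instead of a database file
        }
        return cmd;
    }

    public static void main(String[] args) {
        // Prints: java -Djava.library.path=./DynamoDBLocal_lib -jar DynamoDBLocal.jar -port 1234 -inMemory
        System.out.println(String.join(" ", command(1234, true)));
    }
}
```

The resulting list could be passed to a ProcessBuilder, or the printed command could be run directly in a shell.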
