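I'm currently evaluating Kinetica DB on a 3-node cluster for an analytics project, with the goal of building a custom dashboard on top of it.
As far as I understand, there is a way to retrieve records directly from the data nodes, called "multi-head lookup".
When I use the Java API, it appears to connect to and request data from the first node only (I also verified this with Wireshark). Any idea what's wrong?
The table is sharded on vendor_id and rate_code_id, and when I press the "dist" button in the Admin-UI, it shows that everything is distributed correctly.
My DDL: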
CREATE TABLE "demo"."nyctaxi_sharded"
(
    "vendor_id" VARCHAR (4, shard_key) NOT NULL,
    "pickup_datetime" TIMESTAMP NOT NULL,
    "dropoff_datetime" TIMESTAMP NOT NULL,
    "passenger_count" TINYINT NOT NULL,
    "trip_distance" REAL NOT NULL,
    "pickup_longitude" REAL NOT NULL,
    "pickup_latitude" REAL NOT NULL,
    "rate_code_id" SMALLINT (shard_key) NOT NULL,
    "store_and_fwd_flag" VARCHAR (1) NOT NULL,
    "dropoff_longitude" REAL NOT NULL,
    "dropoff_latitude" REAL NOT NULL,
    "payment_type" VARCHAR (16) NOT NULL,
    "fare_amount" REAL NOT NULL,
    "surcharge" REAL NOT NULL,
    "mta_tax" REAL NOT NULL,
    "tip_amount" REAL NOT NULL,
    "tolls_amount" REAL NOT NULL,
    "total_amount" REAL NOT NULL,
    "cab_type" TINYINT NOT NULL
)
TIER STRATEGY (
    ( ( VRAM 1, RAM 5, PERSIST 5 ) )
);
Here is a snippet of the Java code:
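import com.gpudb.GPUdb;
import com.gpudb.GPUdbBase;
import com.gpudb.GPUdbException;
import com.gpudb.GenericRecord;
import com.gpudb.RecordRetriever;
import com.gpudb.Type;
import com.gpudb.protocol.GetRecordsResponse;
import java.util.Arrays;
import java.util.List;

// Connect to the head node and look up the table's type
GPUdb gpudb = new GPUdb("http://myserver:9191", new GPUdbBase.Options().setUsername("user").setPassword("pw"));
Type tableType = Type.fromTable(gpudb, "demo.nyctaxi_sharded");
RecordRetriever<GenericRecord> recordRetriever = new RecordRetriever<>(gpudb, "demo.nyctaxi_sharded", tableType);
// Each lookup is a full shard key: (vendor_id, rate_code_id)
List<List<Object>> lookupValues = Arrays.asList(
    Arrays.asList("VENDOR1", 0),
    Arrays.asList("VENDOR2", 0),
    Arrays.asList("VENDOR3", 0),
    Arrays.asList("VENDOR1", 1),
    Arrays.asList("VENDOR2", 1),
    Arrays.asList("VENDOR3", 1),
    Arrays.asList("VENDOR1", 2),
    Arrays.asList("VENDOR2", 2),
    Arrays.asList("VENDOR3", 2));
// One key lookup per shard key value, with no additional filter expression
lookupValues.forEach(lookup -> {
    try {
        GetRecordsResponse<GenericRecord> record = recordRetriever.getByKey(lookup, "");
        System.out.println(record);
    } catch (GPUdbException e) {
        e.printStackTrace();
    }
});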
Once this is working, I would like to use this lookup on a precomputed materialized view. Is that possible?
Confirm that the gpudb.conf file (on the server, usually in /opt/gpudb/core/etc) contains:
enable_worker_http_servers = true
It may be "false" by default. If it is "false", the Java RecordRetriever will only use one node, which is what you describe seeing.
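If you would rather check the setting programmatically than open the file by hand, something like the following could work. This is only a rough sketch: the class WorkerHttpCheck and its method are made-up names, not part of the Kinetica API, and it assumes the relevant part of gpudb.conf consists of plain "key = value" lines that java.util.Properties can parse.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical helper for illustration; not part of the Kinetica Java API. */
final class WorkerHttpCheck {

    /** The setting named in the answer above. */
    static final String KEY = "enable_worker_http_servers";

    /**
     * Returns true only if the given config text sets the key to "true".
     * A missing key is treated as false here (an assumption of this sketch).
     */
    static boolean isWorkerHttpEnabled(String confText) {
        Properties props = new Properties();
        try {
            // Properties accepts "key = value" lines and skips '#' comment lines.
            props.load(new StringReader(confText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Properties keeps trailing whitespace in values, so trim before comparing.
        return "true".equalsIgnoreCase(props.getProperty(KEY, "false").trim());
    }
}
```

You would pass it the contents of the file read on the server, for example via Files.readString on the gpudb.conf path. If it returns false, that would be consistent with the single-node behaviour described in the question.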