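I am running the following test against Ignite 2.10.0.
I created two tables with query_parallelism=1 and no affinity key. When I join the two tables as shown below, I get the expected result.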
0: jdbc:ignite:thin://localhost:10800> SELECT "id" AS "_A_id", "source_id" AS "_A_source_id" FROM PUBLIC."source_ml_blue";
+--------------------------------------+--------------------------------------+
| _A_id | _A_source_id |
+--------------------------------------+--------------------------------------+
| 86c068cd-da89-11eb-a185-3da86c6c6bb3 | 86c068cc-da89-11eb-a185-3da86c6c6bb3 |
+--------------------------------------+--------------------------------------+
1 row selected (0.004 seconds)
0: jdbc:ignite:thin://localhost:10800> SELECT "id" AS "_B_id", "flx_src_ip_text" AS "_B_src_ip" FROM PUBLIC."source_nprobe_tcp_blue";
+--------------------------------------+-----------+
| _B_id | _B_src_ip |
+--------------------------------------+-----------+
| 86c068cc-da89-11eb-a185-3da86c6c6bb3 | 1.1.1.1 |
+--------------------------------------+-----------+
1 row selected (0.003 seconds)
0: jdbc:ignite:thin://localhost:10800> SELECT _A."id" AS "_A_id", _A."source_id" AS "_A_source_id", _B."id" AS "_B_id", _B."flx_src_ip_text" AS "_B_src_ip" FROM PUBLIC."source_ml_blue" AS "_A" INNER JOIN PUBLIC."source_nprobe_tcp_blue" AS "_B" ON "_A"."source_id"="_B"."id";
+--------------------------------------+--------------------------------------+--------------------------------------+-----------+
| _A_id | _A_source_id | _B_id | _B_src_ip |
+--------------------------------------+--------------------------------------+--------------------------------------+-----------+
| 86c068cd-da89-11eb-a185-3da86c6c6bb3 | 86c068cc-da89-11eb-a185-3da86c6c6bb3 | 86c068cc-da89-11eb-a185-3da86c6c6bb3 | 1.1.1.1 |
+--------------------------------------+--------------------------------------+--------------------------------------+-----------+
1 row selected (0.005 seconds)
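If I drop the tables and recreate them with query_parallelism=8, there is no SQL error (both tables have the same parallelism), but the join returns an empty result.
Any idea why I get this behavior?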
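You are seeing this behavior because of an optimization for parallel query execution. Most likely your records ended up in different partitions, which are processed by different threads, so the matching rows never meet during the join. If you increase the number of records in both tables, you will see a subset of the join result. The most elegant option here is to make "_A"."source_id" and "_B"."id" affinity keys. Enabling ignite.jdbc.distributedJoins will most likely hurt performance in a cluster installation. Affinity colocation places entries with matching "_A"."source_id" and "_B"."id" in the same partition, which avoids cross-partition interaction (in a cluster environment, that interaction would cost extra network hops).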
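A minimal sketch of the affinity-key approach, assuming simplified versions of the two tables: the unquoted identifiers and the UUID/VARCHAR column types are illustrative assumptions, not the asker's actual DDL, and the parallelism setting is left out because its exact syntax is not shown in the question.

```sql
-- Hypothetical, simplified table definitions (names and types are assumptions).

-- With a single-column primary key, rows are distributed by that key,
-- so "id" already acts as the affinity key for this table.
CREATE TABLE source_nprobe_tcp_blue (
    id UUID PRIMARY KEY,
    flx_src_ip_text VARCHAR
);

-- Distribute rows by source_id instead of id.
-- Ignite requires the affinity column to be part of the primary key.
CREATE TABLE source_ml_blue (
    id UUID,
    source_id UUID,
    PRIMARY KEY (id, source_id)
) WITH "affinity_key=source_id";
```

With this layout, a row of source_ml_blue and the source_nprobe_tcp_blue row it references should map to the same partition, provided both tables keep the same affinity configuration and the same parallelism, so the join can be resolved locally.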
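The problem came from the SQL client: it has to be aware of the parallelism.
In DBeaver, I had to enable ignite.jdbc.distributedJoins in the connection properties to make the query work.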
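For the thin JDBC connection used in the question, the equivalent setting can probably be passed in the connection URL. The sketch below assumes the thin driver's distributedJoins URL parameter is the counterpart of the DBeaver property mentioned above; the query itself is a shortened form of the join from the question.

```sql
-- Assumed thin-driver connection URL with distributed joins enabled:
--   jdbc:ignite:thin://localhost:10800?distributedJoins=true
SELECT "_A"."id" AS "_A_id", "_B"."flx_src_ip_text" AS "_B_src_ip"
FROM PUBLIC."source_ml_blue" AS "_A"
INNER JOIN PUBLIC."source_nprobe_tcp_blue" AS "_B"
    ON "_A"."source_id" = "_B"."id";
```

This trades performance for correctness on non-colocated data, so colocating the tables by the join key remains the better long-term fix.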