有条件加入 - COALESCE vs OR - Oracle SQL



我正在使用如下所示的数据集(由于它太小而没有创建小提琴)

我有一个表tblReqs,其基本结构如下:

| Onum | Pnum | ReqNum |
|:----:|:----:|:------:|
| NULL | P427 | RN1148 |
| NULL | P324 | RN1725 |
| NULL | P229 | RN1242 |
| O396 | NULL | RN1457 |
| O380 | NULL | RN1205 |
| O258 | NULL | RN1482 |

然后我有一个直数表,称为tblnums,如下所示:

| nums |
|------|
| O258 |
| O370 |
| O490 |
| O314 |
| O379 |
| P341 |
| P230 |
| P280 |
| P324 |
| P395 |

我需要有条件地加入tblnumstblReqs.tblReqs中的OnumPnum等于tblnums中的nums字段。

目标将是一个如下所示的数据集:

| nums | ReqNum |
|:----:|:------:|
| O258 | RN1482 |
| O370 |        |
| O490 |        |
| O314 |        |
| O379 |        |
| P341 |        |
| P230 |        |
| P280 |        |
| P324 | RN1725 |
| P395 |        |

我知道我可以在连接中使用OR运算符,如下所示:

SELECT
tblnums.nums,
tblReqs.ReqNum
FROM
tblnums
LEFT JOIN tblReqs ON tblnums.nums = tblReqs.Onum OR tblnums.nums = tblReqs.Pnum

但我最近了解到COALESCE

SELECT
tblnums.nums,
tblReqs.ReqNum
FROM
tblnums
LEFT JOIN tblReqs ON tblnums.nums = COALESCE(tblReqs.Onum, tblReqs.Pnum)

是否有偏爱一个而不是另一个?我知道我可以在我的数据集上尝试两者,但测量时间不是一个很好的指标,因为它受到许多其他限制(数据库维护、网络带宽、CPU/RAM 功率等)的影响。此外,我无法通过 SQL 开发人员提取许多指标,因为我的 IT 团队已将其锁定。

条件连接首选这两种方法中的哪一种,ORCOALESCE?从运行时复杂性的角度来看,发生了什么,等等?

使用 coalesce(或 NVL)有一个好处,因为您可以利用基于函数的索引来提高查询性能。

设置:

FSITJA@db01 2019-07-18 09:21:33> create table tblreqs (onum, pnum, reqnum primary key) as
2  with t (onum, pnum, reqnum) as (
3  select NULL, 'P427', 'RN1148' from dual union all
4  select NULL, 'P324', 'RN1725' from dual union all
5  select NULL, 'P229', 'RN1242' from dual union all
6  select 'O396', NULL, 'RN1457' from dual union all
7  select 'O380', NULL, 'RN1205' from dual union all
8  select 'O258', NULL, 'RN1482' from dual
9  ) select * from t;
Table created.
FSITJA@db01 2019-07-18 09:21:33> create table tblnums (nums primary key) as
2  with t (nums)as (
3  select 'O258' from dual union all
4  select 'O370' from dual union all
5  select 'O490' from dual union all
6  select 'O314' from dual union all
7  select 'O379' from dual union all
8  select 'P341' from dual union all
9  select 'P230' from dual union all
10  select 'P280' from dual union all
11  select 'P324' from dual union all
12  select 'P395' from dual
13  ) select * from t;
Table created.

请注意,在下面的选项中,由于条件连接(在TBLREQS上,表访问存储已满),oracle无法利用索引:

FSITJA@db01 2019-07-18 09:21:33> explain plan for
2  SELECT n.nums,
3         r.ReqNum
4    FROM tblnums n
5    LEFT JOIN tblReqs r ON n.nums = r.Onum OR n.nums = r.pnum;
Explained.
FSITJA@db01 2019-07-18 09:21:33> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 1571794044
-----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                 |    10 |   130 |    31   (0)| 00:00:01 |
|   1 |  NESTED LOOPS OUTER         |                 |    10 |   130 |    31   (0)| 00:00:01 |
|   2 |   INDEX FULL SCAN           | SYS_C0047401    |    10 |    50 |     1   (0)| 00:00:01 |
|   3 |   VIEW                      | VW_LAT_EB747914 |     1 |     8 |     3   (0)| 00:00:01 |
|*  4 |    TABLE ACCESS STORAGE FULL| TBLREQS         |     1 |    13 |     3   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - filter("R"."ONUM" IS NOT NULL AND "N"."NUMS"="R"."ONUM" OR "R"."PNUM" IS NOT
NULL AND "N"."NUMS"="R"."PNUM")
17 rows selected.

现在,如果我们使用 coalesce 函数创建一个基于函数的索引,oracle 会检测到该索引存在,并将使用它来提高连接性能,而无需进行全表扫描(在新创建的IDX_COALESCE_ONUM_PNUM上进行索引范围扫描):

FSITJA@db01 2019-07-18 09:21:33> create index idx_coalesce_onum_pnum on tblreqs (coalesce(Onum, Pnum));
Index created.
FSITJA@db01 2019-07-18 09:21:33> explain plan for
2  SELECT n.nums,
3         r.ReqNum
4    FROM tblnums n
5    LEFT JOIN tblReqs r ON n.nums = COALESCE(r.Onum, r.Pnum);
Explained.
FSITJA@db01 2019-07-18 09:21:33> select * from table(dbms_xplan.display);
PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------
Plan hash value: 605824869
---------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name                   | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                     |                        |    10 |   160 |     2   (0)| 00:00:01 |
|   1 |  NESTED LOOPS OUTER                  |                        |    10 |   160 |     2   (0)| 00:00:01 |
|   2 |   INDEX FULL SCAN                    | SYS_C0047401           |    10 |    50 |     1   (0)| 00:00:01 |
|   3 |   TABLE ACCESS BY INDEX ROWID BATCHED| TBLREQS                |     1 |    11 |     1   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN                  | IDX_COALESCE_ONUM_PNUM |     1 |       |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("N"."NUMS"="R"."SYS_NC00004$"(+))
16 rows selected.

通常,函数和or会阻碍优化。两者应具有非常相似的性能。

我认为最好的方法可能是两个连接:

SELECT n.nums,
COALESCE(ro.ReqNum, rp.ReqNum) as reqNum
FROM tblnums n LEFT JOIN
tblReqs ro
ON n.nums = ro.Onum LEFT JOIN
tblReqs rp
ON n.nums = rp.Pnum AND
ro.Onum IS NULL;  -- no match the first time

最新更新