我有两个表,它们之间有一个公共键,以及相当多的其他重要信息;为了简单起见,我将使用组合A和组合b。当满足组合时,具有最大记录数的表应该是我收集信息的源;在本例中是id。当计数相同时,优先级为表1。
COMMONKEY列是我的表中的组合/连接条件。
(Table 1)
SELECT '123' table1_id,'Comb A' commonkey from dual UNION
SELECT '124' table1_id,'Comb A' commonkey from dual UNION
SELECT '125' table1_id,'Comb A' commonkey from dual UNION
SELECT '126' table1_id,'Comb A' commonkey from dual UNION
SELECT '215' table1_id,'Comb B' commonkey from dual UNION
SELECT '216' table1_id,'Comb B' commonkey from dual UNION
SELECT '559' table1_id,'Random Combination 1' commonkey from dual UNION
SELECT '560' table1_id,'Random Combination 2' commonkey from dual ;
( Table 2 )
SELECT 'abc1' table2_id,'Comb A' commonkey from dual UNION
SELECT 'abc2' table2_id,'Comb A' commonkey from dual UNION
SELECT 'abc3' table2_id,'Comb A' commonkey from dual UNION
SELECT 'abc4' table2_id,'Comb A' commonkey from dual UNION
SELECT 'xyz1' table2_id,'Comb B' commonkey from dual UNION
SELECT 'xyz2' table2_id,'Comb B' commonkey from dual UNION
SELECT 'xyz3' table2_id,'Comb B' commonkey from dual UNION
SELECT 'xyz2' table2_id,'Comb B' commonkey from dual UNION
SELECT '416abc1' table2_id,'Random Combination 91' commonkey from dual UNION
SELECT '416abc2' table2_id,'Random Combination 92' commonkey from dual;
Result Set Expected :
ID COMMONKEY
123 Comb A
124 Comb A
125 Comb A
126 Comb A
xyz1 Comb B
xyz2 Comb B
xyz3 Comb B
559 Random Combination 1
560 Random Combination 1
416abc1 Random Combination 91
416abc2 Random Combination 92
更新图片:
(该图像显示了excel中跟踪数据的屏幕截图;需求和策略是彩色映射的,以使其快速地被理解)
我需要使用SQL生成如下结果集:
当table1.commonkey = table2.commonkey命中时,我需要-
- 如果表1有10个id,表2有5个id ->从表1中选择10个id
- 如果表1有15个id,表2有30个id ->从表2中选择30个id
- 如果表1有4个id,表2也有4个id ->从表1中选择4个id。(当相等时,选择table1 id)
- 当没有与公共键匹配时,防止交叉连接并将行集线性添加到结果表中。
编辑:我最初走的路线与
a left join b where b.key IS null ;
a full outer join b where b.key IS NULL or a.key is NULL ;
来实现A-B或B-A结果集的变通方法,但这两种方法都是非常错误的。收集Delta集或Exclusion集不顺利。
这里有一个选择;参见代码中的注释
SQL> with
2 -- sample data
3 a (id, ckey) as
4 (select '123', 'ca' from dual union all
5 select '124', 'ca' from dual union all
6 select '125', 'ca' from dual union all
7 select '126', 'ca' from dual union all
8 select '215', 'cb' from dual union all
9 select '216', 'cb' from dual union all
10 select '551', 'r1' from dual union all
11 select '552', 'r2' from dual
12 ),
13 b (id, ckey) as
14 (select 'abc1', 'ca' from dual union all
15 select 'abc2', 'ca' from dual union all
16 select 'abc3', 'ca' from dual union all
17 select 'abc4', 'ca' from dual union all
18 select 'xyz1', 'cb' from dual union all
19 select 'xyz2', 'cb' from dual union all
20 select 'xyz3', 'cb' from dual union all
21 select '9991', 'r3' from dual union all
22 select '9992', 'r4' from dual
23 ),
24 -- count rows per each CKEY (common key)
25 tempa as
26 (select id, ckey, count(*) over (partition by ckey) cnt
27 from a
28 ),
29 tempb as
30 (select id, ckey, count(*) over (partition by ckey) cnt
31 from b
32 )
33 -- final query
34 select distinct
35 case when a.cnt >= b.cnt then a.id
36 else b.id
37 end id,
38 a.ckey
39 from tempa a join tempb b on b.ckey = a.ckey
40 union all
41 select ckey, id from a
42 where not exists (select null from b where a.ckey = b.ckey)
43 union all
44 select ckey, id from b
45 where not exists (select null from a where a.ckey = b.ckey)
46 order by 1, 2;
结果是
ID CKEY
---- -----
r1 551
r2 552
r3 9991
r4 9992
xyz1 cb
xyz2 cb
xyz3 cb
123 ca
124 ca
125 ca
126 ca
11 rows selected.
SQL>