根据连接后二级列的最大计数生成结果



我有两个表,它们之间有一个公共键,以及相当多的其他重要信息;为了简单起见,我将使用组合A和组合b。当满足组合时,具有最大记录数的表应该是我收集信息的源;在本例中是id。当计数相同时,优先级为表1。

COMMONKEY列是我的表中的组合/连接条件。

(Table 1)
SELECT '123' table1_id,'Comb A' commonkey from dual UNION
SELECT '124' table1_id,'Comb A' commonkey from dual UNION
SELECT '125' table1_id,'Comb A' commonkey from dual UNION
SELECT '126' table1_id,'Comb A' commonkey from dual UNION
SELECT '215' table1_id,'Comb B' commonkey from dual UNION
SELECT '216' table1_id,'Comb B' commonkey from dual UNION
SELECT '559' table1_id,'Random Combination 1' commonkey from dual UNION
SELECT '560' table1_id,'Random Combination 2' commonkey from dual ;   

( Table 2 )     

SELECT 'abc1' table2_id,'Comb A' commonkey from dual  UNION
SELECT 'abc2' table2_id,'Comb A' commonkey from dual  UNION
SELECT 'abc3' table2_id,'Comb A' commonkey from dual  UNION
SELECT 'abc4' table2_id,'Comb A' commonkey from dual  UNION
SELECT 'xyz1' table2_id,'Comb B' commonkey from dual  UNION
SELECT 'xyz2' table2_id,'Comb B' commonkey from dual  UNION
SELECT 'xyz3' table2_id,'Comb B' commonkey from dual  UNION
SELECT 'xyz2' table2_id,'Comb B' commonkey from dual  UNION 
SELECT '416abc1' table2_id,'Random Combination 91' commonkey from dual UNION
SELECT '416abc2' table2_id,'Random Combination 92' commonkey from dual;



Result Set Expected :
ID        COMMONKEY         
123       Comb A            
124       Comb A            
125       Comb A            
126       Comb A            
xyz1      Comb B            
xyz2      Comb B            
xyz3      Comb B            
559       Random Combination 1          
560       Random Combination 1          
416abc1   Random Combination 91         
416abc2   Random Combination 92 

更新图片:

(该图像显示了excel中跟踪数据的屏幕截图;需求和策略是彩色映射的,以使其快速地被理解)

我需要使用SQL生成如下结果集:

当table1.commonkey = table2.commonkey命中时,我需要-

  • 如果表1有10个id,表2有5个id ->从表1中选择10个id
  • 如果表1有15个id,表2有30个id ->从表2中选择30个id
  • 如果表1有4个id,表2也有4个id ->从表1中选择4个id。(当相等时,选择table1 id)
  • 当没有与公共键匹配时,防止交叉连接并将行集线性添加到结果表中。

编辑:我最初走的路线与

a left join b where b.key IS null ;
a full outer join b where b.key IS NULL or a.key is NULL ;

来实现A-B或B-A结果集的变通方法,但这两种方法都是非常错误的。收集Delta集或Exclusion集不顺利。

这里有一个选择;参见代码中的注释

SQL> with
2  -- sample data
3  a (id, ckey) as
4    (select '123', 'ca' from dual union all
5     select '124', 'ca' from dual union all
6     select '125', 'ca' from dual union all
7     select '126', 'ca' from dual union all
8     select '215', 'cb' from dual union all
9     select '216', 'cb' from dual union all
10     select '551', 'r1' from dual union all
11     select '552', 'r2' from dual
12    ),
13  b (id, ckey) as
14    (select 'abc1', 'ca' from dual union all
15     select 'abc2', 'ca' from dual union all
16     select 'abc3', 'ca' from dual union all
17     select 'abc4', 'ca' from dual union all
18     select 'xyz1', 'cb' from dual union all
19     select 'xyz2', 'cb' from dual union all
20     select 'xyz3', 'cb' from dual union all
21     select '9991', 'r3' from dual union all
22     select '9992', 'r4' from dual
23    ),

24  -- count rows per each CKEY (common key)
25  tempa as
26    (select id, ckey, count(*) over (partition by ckey) cnt
27     from a
28    ),
29  tempb as
30    (select id, ckey, count(*) over (partition by ckey) cnt
31     from b
32    )
33  -- final query
34  select distinct
35    case when a.cnt >= b.cnt then a.id
36         else b.id
37    end id,
38    a.ckey
39  from tempa a join tempb b on b.ckey = a.ckey
40  union all
41  select ckey, id from a
42    where not exists (select null from b where a.ckey = b.ckey)
43  union all
44  select ckey, id from b
45    where not exists (select null from a where a.ckey = b.ckey)
46  order by 1, 2;

结果是

ID   CKEY
---- -----
r1   551
r2   552
r3   9991
r4   9992
xyz1 cb
xyz2 cb
xyz3 cb
123  ca
124  ca
125  ca
126  ca
11 rows selected.
SQL>

最新更新