基于多个字段的相似数据的不重叠



我想从原始表中创建一个可能重复记录的表,但数据基于两个不同的属性,它们只能加入同一个分组id。此外,在某些情况下,数据看起来并不完全相同(但有相似之处(。以下是原始表格的样子:

group_id| House_num | Apt    | code
----------------------------------
45   | 1000      |  1     |  M
45   |    1      |        |  D
45   | 1000      |  2     |  M
45   |    2      |        |  D
87   | 2300      | 310    |  M
87   | 2310      |        |  D
87   | 2400      | 470    |  M
87   | 2470      |        |  D

我希望返回的是这些相似的数字都在同一行。所以类似于:

new_id  |group_id|a.house_num|a.apt|a.code|b.house_num|b.apt| b.code
-------------------------------------------------------------------------
1    |   45   |   1000   |  1   |  M   |    1      |     |  D
2    |   45   |   1000   |  2   |  M   |    2      |     |  D  
3    |   87   |   2300   | 310  |  M   | 2310      |     |  D
4    |   87   |   2400   | 470  |  M   | 2470      |     |  D

我不确定在这里使用哪种联接;此外,我不知道如何避开a.house_num是基数,a.apt是后缀数,b.housenumber是两者的组合。如有任何帮助,我们将不胜感激,谢谢。

create table t (group_id, House_num , Apt    , code) as
select   45   , 1000      ,  1     ,  'M' from dual union all
select   45   ,    1      , null   ,  'D' from dual union all
select   45   , 1000      ,  2     ,  'M' from dual union all
select   45   ,    2      , null   ,  'D' from dual union all
select   87   , 2300      , 310    ,  'M' from dual union all
select   87   , 2310      , null   ,  'D' from dual union all
select   87   , 2400      , 470    ,  'M' from dual union all
select   87   , 2470      , null   ,  'D' from dual;
select rownum new_id,
a.GROUP_ID, a.HOUSE_NUM ahn, a.APT aapt, a.CODE acode,
b.HOUSE_NUM bhn, b.APT bapt, b.CODE bcode
from t a
join t b
on a.group_id = b.group_id
and a.code = 'M'
and b.code = 'D'
and (
b.house_num = a.apt
or b.house_num like '%'||a.apt
);
NEW_ID GROUP_ID AHN  AAPT ACODE BHN BAPT BCODE
1      45       1000 1    M     1        D
2      45       1000 2    M     2        D
3      87       2300 310  M     2310     D
4      87       2400 470  M     2470     D

您可以使用以下内容来获得答案这里的DB小提琴

WITH data
AS (SELECT * FROM YOUR_TABLE),
data2
AS (SELECT row_number()over 
(order by thing1,thing2)    rw,
d1.group_id,
d1.thing1 a_thing1,
d1.thing2 a_thing2,
d1.thing3 a_thing3
FROM   data d1
WHERE  d1.thing3 = 'M'),
data3
AS (SELECT row_number()over 
(order by thing1,thing2)    rw,
d1.group_id,
d1.thing1 b_thing1,
d1.thing2 b_thing2,
d1.thing3 b_thing3
FROM   data d1
WHERE  d1.thing3 = 'D')
SELECT d2.rw New_id,
d2.group_id,
d2.a_thing1,
d2.a_thing2,
d2.a_thing3,
d3.b_thing1,
d3.b_thing2,
d3.b_thing3
FROM   data2 d2,
data3 d3
WHERE  d2.group_id = d3.group_id
AND d2.rw = d3.rw 

最新更新