How do I join tables on multiple conditions?



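I'm new to writing complex SQL queries (let me know if this isn't actually complex), and I'm struggling to create a query that joins 2 tables on several different combinations of columns. Below is my query for the first combination.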

SELECT DISTINCT
A.Id, A.Ref1, A.Ref2, A.Ref3, B.Ref1, B.Ref2, B.Ref3
FROM
A
JOIN
B ON TRIM(UPPER(B.Ref1)) = TRIM(UPPER(A.Ref1)) 
AND TRIM(UPPER(B.Ref2)) = TRIM(UPPER(A.Ref2)) 
AND TRIM(UPPER(B.Ref3)) = TRIM(UPPER(A.Ref3))

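Ref1 through Ref3 are the columns required for the first combination, which is classified as an "Exact Match".

However, if only Ref1 and Ref2 happen to match, the row is classified as a "Partial Match" instead. Below is the list of possible combinations: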

  • A.Ref1 = B.Ref1 and A.Ref2 = B.Ref2 and A.Ref3 = B.Ref3 => Exact Match
  • A.Ref1 = B.Ref1 and A.Ref2 = B.Ref2 => Partial Match 1
  • A.Ref2 = B.Ref2 and A.Ref3 = B.Ref3 => Partial Match 1
  • A.Ref1 = B.Ref1 and A.Ref3 = B.Ref3 => Partial Match 1
  • A.Ref1 = B.Ref1 => Partial Match 2
  • A.Ref2 = B.Ref2 => Partial Match 2
  • A.Ref3 = B.Ref3 => Partial Match 2

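My initial attempt seemed to work fine, but it broke as soon as I added values that made the matches between the result sets less clear-cut.

Anyway, as for the actual goal: when searching for Partial Match 1 rows, I want to exclude Exact Match rows, and for Partial Match 2, neither Partial Match 1 nor Exact Match rows should be included (I'm not sure I'm making enough sense at this point).

My initial attempt kept duplicating records, which is really just redundant data, because a record that already matches exactly shouldn't also show up as a partial match. Here is a sample of the results I get: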

Id     A.Ref1     A.Ref2     A.Ref3     B.Ref1     B.Ref2     B.Ref3     Match Type
1     Val1       Val1       Val1       Val1       Val1       Val1       Exact Match
1     Val1       Val1       NULL       Val1       Val1       NULL       Partial Match 1
1     Val1       NULL       Val1       Val1       NULL       Val1       Partial Match 1
1     Val1       NULL       NULL       Val1       NULL       NULL       Partial Match 2

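In the scenario above, since Id 1 is already an "Exact Match", I don't want it to appear again under "Partial Match", unless the record has other field values that differ.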

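I tried to do all of this in one query, but I don't think there should be any restriction against splitting it into separate queries. That would probably be easier, but please let me know whether it can be done in one go, since this will be executed several times.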

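Any help with this would be greatly appreciated. Thank you.

Thanks to @Thorsten Kettner's answer below, I was able to create a query I'm happy with using the WITH approach. The code below is what it looks like now. partial_matches2 is somewhat unreliable for single matches, but that doesn't need to be dealt with for now. Overall, I'm satisfied with the results I'm getting.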

-- Get exact matches
with exact_matches as
(
Select Distinct
A.Id,
A.Ref1 as A_Ref1, B.Ref1 as B_Ref1,
A.Ref2 as A_Ref2, B.Ref2 as B_Ref2,
A.Ref3 as A_Ref3, B.Ref3 as B_Ref3,
'Exact Match' as Match_Type
from A
Join B
On  TRIM(UPPER(B.Ref1)) = TRIM(UPPER(A.Ref1)) And
TRIM(UPPER(B.Ref2)) = TRIM(UPPER(A.Ref2)) And
TRIM(UPPER(B.Ref3)) = TRIM(UPPER(A.Ref3))
),
-- Get partial matches 1 (two of the three columns match)
partial_matches1 as
(
Select Distinct
P.*,
-- Name the column that did not match
Case
When P.A_Ref1 is NULL Or P.B_Ref1 is NULL Or TRIM(UPPER(P.A_Ref1)) <> TRIM(UPPER(P.B_Ref1))
Then 'No match by Ref 1'
When P.A_Ref2 is NULL Or P.B_Ref2 is NULL Or TRIM(UPPER(P.A_Ref2)) <> TRIM(UPPER(P.B_Ref2))
Then 'No match by Ref 2'
When P.A_Ref3 is NULL Or P.B_Ref3 is NULL Or TRIM(UPPER(P.A_Ref3)) <> TRIM(UPPER(P.B_Ref3))
Then 'No match by Ref 3'
End as Match_Type
from (
-- Ref1 and Ref2 match
Select
A.Id,
A.Ref1 as A_Ref1, B.Ref1 as B_Ref1,
A.Ref2 as A_Ref2, B.Ref2 as B_Ref2,
A.Ref3 as A_Ref3, B.Ref3 as B_Ref3
from A
Join B
On  TRIM(UPPER(B.Ref1)) = TRIM(UPPER(A.Ref1)) And
TRIM(UPPER(B.Ref2)) = TRIM(UPPER(A.Ref2))
Union
-- Ref2 and Ref3 match
Select
A.Id,
A.Ref1 as A_Ref1, B.Ref1 as B_Ref1,
A.Ref2 as A_Ref2, B.Ref2 as B_Ref2,
A.Ref3 as A_Ref3, B.Ref3 as B_Ref3
from A
Join B
On  TRIM(UPPER(B.Ref2)) = TRIM(UPPER(A.Ref2)) And
TRIM(UPPER(B.Ref3)) = TRIM(UPPER(A.Ref3))
Union
-- Ref1 and Ref3 match
Select
A.Id,
A.Ref1 as A_Ref1, B.Ref1 as B_Ref1,
A.Ref2 as A_Ref2, B.Ref2 as B_Ref2,
A.Ref3 as A_Ref3, B.Ref3 as B_Ref3
from A
Join B
On  TRIM(UPPER(B.Ref1)) = TRIM(UPPER(A.Ref1)) And
TRIM(UPPER(B.Ref3)) = TRIM(UPPER(A.Ref3))
) As P
-- Drop every Id that already has an exact match
Left Join exact_matches As E
On E.Id = P.Id
Where E.Id is NULL
),
-- Get partial matches 2 (one of the three columns matches)
partial_matches2 as
(
Select Distinct
P.*,
-- Name the first column that matched
Case
When TRIM(UPPER(P.A_Ref1)) = TRIM(UPPER(P.B_Ref1))
Then 'Match by Ref1'
When TRIM(UPPER(P.A_Ref2)) = TRIM(UPPER(P.B_Ref2))
Then 'Match by Ref2'
When TRIM(UPPER(P.A_Ref3)) = TRIM(UPPER(P.B_Ref3))
Then 'Match by Ref3'
End as Match_Type
from (
Select
A.Id,
A.Ref1 as A_Ref1, B.Ref1 as B_Ref1,
A.Ref2 as A_Ref2, B.Ref2 as B_Ref2,
A.Ref3 as A_Ref3, B.Ref3 as B_Ref3
from A
Join B
On  TRIM(UPPER(B.Ref1)) = TRIM(UPPER(A.Ref1))
Union
Select
A.Id,
A.Ref1 as A_Ref1, B.Ref1 as B_Ref1,
A.Ref2 as A_Ref2, B.Ref2 as B_Ref2,
A.Ref3 as A_Ref3, B.Ref3 as B_Ref3
from A
Join B
On  TRIM(UPPER(B.Ref2)) = TRIM(UPPER(A.Ref2))
Union
Select
A.Id,
A.Ref1 as A_Ref1, B.Ref1 as B_Ref1,
A.Ref2 as A_Ref2, B.Ref2 as B_Ref2,
A.Ref3 as A_Ref3, B.Ref3 as B_Ref3
from A
Join B
On  TRIM(UPPER(B.Ref3)) = TRIM(UPPER(A.Ref3))
) As P
-- Drop every Id that already has an exact match or a partial match 1
Left Join exact_matches As E
On E.Id = P.Id
Left Join partial_matches1 As P1
On P1.Id = P.Id
Where E.Id is NULL and P1.Id is NULL
)
-- Main Query
select
Distinct
A.Id, M.B_Ref1, M.B_Ref2, M.B_Ref3,
(
Case
When M.Match_Type Is Not NULL
Then M.Match_Type
Else
'No Match'
End
) As Match_Type
from  A
/* Left Join on Matched Entries */
Left Join (
select * from exact_matches
Union
select * from partial_matches1
Union
select * from partial_matches2
) M
On M.Id = A.Id
Order By A.Id

You can cross join the two tables and then count the number of matching columns:

select
id, a_ref1, a_ref2, a_ref3, b_ref1, b_ref2, b_ref3,
case match_count 
when 3 then 'Exact Match'
when 2 then 'Partial Match 1'
when 1 then 'Partial Match 2'
else 'No Match'
end as match_type
from
(
select 
a.id, 
a.ref1 as a_ref1, a.ref2 as a_ref2, a.ref3 as a_ref3,
b.ref1 as b_ref1, b.ref2 as b_ref2, b.ref3 as b_ref3,
case when trim(upper(b.ref1)) = trim(upper(a.ref1)) then 1 else 0 end +
case when trim(upper(b.ref2)) = trim(upper(a.ref2)) then 1 else 0 end +
case when trim(upper(b.ref3)) = trim(upper(a.ref3)) then 1 else 0 end
as match_count
from a cross join b
) match_counted
-- where match_count = 3 /* only Exact Matches */
-- where match_count = 2 /* only Partial Matches Type 1 */
-- where match_count = 1 /* only Partial Matches Type 2 */
-- where match_count = 0 /* only Non-Matches */
;

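You can use any of the suggested WHERE clauses to get only one particular match type. If you want to select several, a variation works as well, e.g. where match_count > 0 or where match_count in (1,2).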
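If you want to try the WHERE variants on some data, here is a minimal sketch that runs the same match-count query through Python's sqlite3 module. The sample rows and the match_counts helper are made up for illustration, and SQLite only stands in for whatever database you actually use.

```python
import sqlite3

# Hypothetical sample rows; tables a and b reuse the column names from the query above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table a (id integer, ref1 text, ref2 text, ref3 text);
    create table b (ref1 text, ref2 text, ref3 text);
    insert into a values (1, 'x1', 'y1', 'z1');
    insert into b values
        ('X1 ', 'Y1', 'Z1'),   -- three columns match after trim/upper
        ('x1', 'y1', NULL),    -- two columns match
        ('x1', NULL, NULL),    -- one column matches
        ('q',  'q',  'q');     -- nothing matches
""")

# Same counting expression as in the answer: one point per matching column.
MATCH_COUNTED = """
    select a.id, b.ref1 as b_ref1, b.ref2 as b_ref2, b.ref3 as b_ref3,
           case when trim(upper(b.ref1)) = trim(upper(a.ref1)) then 1 else 0 end +
           case when trim(upper(b.ref2)) = trim(upper(a.ref2)) then 1 else 0 end +
           case when trim(upper(b.ref3)) = trim(upper(a.ref3)) then 1 else 0 end
           as match_count
    from a cross join b
"""

def match_counts(where_clause=""):
    """Return the sorted match_count values that survive the given WHERE clause."""
    sql = f"select match_count from ({MATCH_COUNTED}) match_counted {where_clause}"
    return sorted(row[0] for row in conn.execute(sql))

all_counts = match_counts()
partial_counts = match_counts("where match_count in (1, 2)")
print(all_counts, partial_counts)
```

Note that a NULL on either side makes the comparison NULL, so the CASE falls through to else 0 and a NULL never counts as a match.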

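If you want to exclude non-matches, you can replace the cross join with an inner join on trim(upper(b.ref1)) = trim(upper(a.ref1)) or trim(upper(b.ref2)) = trim(upper(a.ref2)) or trim(upper(b.ref3)) = trim(upper(a.ref3)). This may make the query faster (though there is no guarantee that it will). However, it also makes the query more error-prone and slightly less maintainable, because you have to state the conditions twice.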
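As a rough check that the two formulations agree, the sketch below compares the cross join filtered with match_count > 0 against the inner join with the OR condition. It uses SQLite through Python's sqlite3 module; the sample rows and the COUNT_EXPR / ANY_MATCH names are assumptions of this example only.

```python
import sqlite3

# Hypothetical sample rows, including NULLs on both sides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table a (id integer, ref1 text, ref2 text, ref3 text);
    create table b (ref1 text, ref2 text, ref3 text);
    insert into a values (1, 'x1', 'y1', 'z1'), (2, 'x2', NULL, 'z2');
    insert into b values ('X1', 'y1', 'z1'), ('x2', 'y2', NULL), ('q', 'q', 'q');
""")

# The conditions have to be written twice: once for counting, once for joining.
COUNT_EXPR = """
    case when trim(upper(b.ref1)) = trim(upper(a.ref1)) then 1 else 0 end +
    case when trim(upper(b.ref2)) = trim(upper(a.ref2)) then 1 else 0 end +
    case when trim(upper(b.ref3)) = trim(upper(a.ref3)) then 1 else 0 end
"""
ANY_MATCH = """
    trim(upper(b.ref1)) = trim(upper(a.ref1)) or
    trim(upper(b.ref2)) = trim(upper(a.ref2)) or
    trim(upper(b.ref3)) = trim(upper(a.ref3))
"""

# Variant 1: cross join, then filter out the non-matches.
cross_join_rows = conn.execute(f"""
    select * from (
        select a.id, b.ref1, {COUNT_EXPR} as match_count
        from a cross join b
    ) match_counted
    where match_count > 0
    order by id, ref1
""").fetchall()

# Variant 2: inner join that only pairs rows with at least one matching column.
inner_join_rows = conn.execute(f"""
    select a.id, b.ref1, {COUNT_EXPR} as match_count
    from a join b on {ANY_MATCH}
    order by a.id, b.ref1
""").fetchall()

print(cross_join_rows == inner_join_rows)
```

Whether the inner join is actually faster depends on the optimizer and the available indexes, so it is worth measuring on your own data.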

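Update: It turned out that you had trouble describing the task correctly, and that The Impaler is a mind reader :-) What you want is to join the best match type to each A row. If for an A row we find an exact match in B (all three conditions match), then we join that row. If there is no such match, we look for "Partial Match 1" rows (i.e. rows where only two conditions match) and join those. If there are no such rows either, we join the "Partial Match 2" rows.

What I do here is start with the same join, then rank the joined rows, and then keep only the best-ranked rows.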

with match_counted as
(
select 
a.id, 
a.ref1 as a_ref1, a.ref2 as a_ref2, a.ref3 as a_ref3,
b.ref1 as b_ref1, b.ref2 as b_ref2, b.ref3 as b_ref3,
case when trim(upper(b.ref1)) = trim(upper(a.ref1)) then 1 else 0 end +
case when trim(upper(b.ref2)) = trim(upper(a.ref2)) then 1 else 0 end +
case when trim(upper(b.ref3)) = trim(upper(a.ref3)) then 1 else 0 end
as match_count
from a cross join b
)
, ranked as
(
select
match_counted.*,
rank() over(partition by id order by match_count desc) as rnk
from match_counted
where match_count > 0
)
select
id, a_ref1, a_ref2, a_ref3, b_ref1, b_ref2, b_ref3,
case match_count
when 3 then 'Exact Match'
when 2 then 'Partial Match 1'
when 1 then 'Partial Match 2'
else 'No Match'
end as match_type
from ranked
where rnk = 1
order by id;
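To see the ranking at work, here is a small sketch that runs a trimmed-down version of the query above in SQLite via Python's sqlite3 module. The sample data is invented, and the sketch assumes an SQLite build with window function support (3.25 or later).

```python
import sqlite3

# Hypothetical sample rows: id 1 has an exact match plus weaker candidates,
# id 2 only has a two-column and a one-column candidate.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table a (id integer, ref1 text, ref2 text, ref3 text);
    create table b (ref1 text, ref2 text, ref3 text);
    insert into a values (1, 'v1', 'v1', 'v1'), (2, 'w1', 'w2', 'w3');
    insert into b values
        ('v1', 'v1', 'v1'),
        ('v1', 'v1', NULL),
        ('v1', NULL, NULL),
        ('w1', 'w2', NULL),
        ('w1', NULL, NULL);
""")

best_matches = conn.execute("""
    with match_counted as (
        select a.id, b.ref1 as b_ref1, b.ref2 as b_ref2, b.ref3 as b_ref3,
               case when trim(upper(b.ref1)) = trim(upper(a.ref1)) then 1 else 0 end +
               case when trim(upper(b.ref2)) = trim(upper(a.ref2)) then 1 else 0 end +
               case when trim(upper(b.ref3)) = trim(upper(a.ref3)) then 1 else 0 end
               as match_count
        from a cross join b
    ),
    ranked as (
        -- rank 1 goes to the highest match_count within each a.id
        select match_counted.*,
               rank() over (partition by id order by match_count desc) as rnk
        from match_counted
        where match_count > 0
    )
    select id, b_ref1, b_ref2, b_ref3, match_count
    from ranked
    where rnk = 1
    order by id
""").fetchall()

for row in best_matches:
    print(row)
```

Because rank() gives tied rows the same rank, an A row with several equally good B rows keeps all of them; row_number() would keep one of them arbitrarily instead.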

You can use CASE WHEN … THEN in your SELECT clause to implement your matching rules.

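SELECT *
FROM (
Select Distinct
A.Id,
A.Ref1 AS A_Ref1, A.Ref2 AS A_Ref2, A.Ref3 AS A_Ref3,
B.Ref1 AS B_Ref1, B.Ref2 AS B_Ref2, B.Ref3 AS B_Ref3,
-- Grade each joined pair; the first WHEN that is true wins
CASE
WHEN TRIM(UPPER(A.Ref1)) = TRIM(UPPER(B.Ref1)) AND TRIM(UPPER(A.Ref2)) = TRIM(UPPER(B.Ref2)) AND TRIM(UPPER(A.Ref3)) = TRIM(UPPER(B.Ref3))
THEN 'Exact Match'
WHEN TRIM(UPPER(A.Ref1)) = TRIM(UPPER(B.Ref1)) AND TRIM(UPPER(A.Ref2)) = TRIM(UPPER(B.Ref2))
THEN 'Partial Match 1'
WHEN TRIM(UPPER(A.Ref2)) = TRIM(UPPER(B.Ref2)) AND TRIM(UPPER(A.Ref3)) = TRIM(UPPER(B.Ref3))
THEN 'Partial Match 1'
WHEN TRIM(UPPER(A.Ref1)) = TRIM(UPPER(B.Ref1)) AND TRIM(UPPER(A.Ref3)) = TRIM(UPPER(B.Ref3))
THEN 'Partial Match 1'
ELSE 'Partial Match 2'
END AS Match_Type
from A
Join B
-- Join on any single matching column (OR, not AND), so partial matches survive the join
On  TRIM(UPPER(B.Ref1)) = TRIM(UPPER(A.Ref1)) Or
TRIM(UPPER(B.Ref2)) = TRIM(UPPER(A.Ref2)) Or
TRIM(UPPER(B.Ref3)) = TRIM(UPPER(A.Ref3))
) subquery
WHERE Match_Type = 'Partial Match 1'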

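This picks the rows you want from the result set and leaves out the ones you don't.

SQL is all about sets and subqueries.

It's a bit verbose, but so are your matching rules.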
