SQL:如何生成不存在的组合



我想通过两个单词的组合来创建一个巨大的列表。 字典中的每个单词都应与所有其他单词组合。 换句话说,我将有Total^2组合。 每个单词都有一个不同的 ID,我只在组合表中生成两个 ID 的组合。

我想定期检查是否遗漏了任何组合,以便我可以生成并将其添加到数据库中。我发现这个问答可以生成所有可能的组合,但我不知道如何使用SQL查询找到不存在的组合,可能是这样的:

select * from words a ... join words b 
where (a.id, b.id) not in (select * from combinaions) 

如果SQL没有直接的解决方案,请您建议一种以编程方式执行此操作的算法。请注意,可能会缺少一些IDs,因为我删除了一些单词,所以我不能对整数使用线性循环。

组合表有两列(第一个 id,第二个 id(,两个 id 都来自表

您可以使用交叉连接来获得所有可能的组合,然后根据条件,您可以删除已经存在的组合。

Select * from words a cross join words b 
where not exists (select * from combinations c where c.first_id = a.id and c.second_id = b.id) 

正如@VahiD所指出的,CROSS JOIN是拼图的核心部分。您也可以将现有组合表LEFT JOIN到单词CROSS JOIN,并检查NULL(这意味着给定的笛卡尔产品组合在现有组合表中不存在(,而不是使用子查询。

例如:

WITH 
-- sample data (notice that there's no word with ID of 3)
words(word_id, word) AS
(
SELECT 1, 'apple'   UNION ALL
SELECT 2, 'pear'    UNION ALL
SELECT 4, 'orange'  UNION ALL
SELECT 5, 'banana'
)
-- existing combinations
,combinations(first_id, second_id) AS
(
SELECT 1, 2 UNION ALL
SELECT 1, 5 UNION ALL
SELECT 2, 4 UNION ALL
SELECT 2, 5 UNION ALL
SELECT 4, 5
)
-- this is the CTE you'll use to create the cartesian product
-- of all words in your words table. You can also put this as a 
-- sub-query, but I'd argue that a CTE makes it clearer.
,cartesian(w1_id, w1_word, w2_id, w2_word) AS
(
SELECT *
FROM words w1, words w2
)
-- the actual query
SELECT * 
FROM cartesian
LEFT JOIN combinations ON
combinations.first_id = cartesian.w1_id
AND combinations.second_id = cartesian.w2_id
WHERE combinations.first_id IS NULL

现在,一个重要的警告是,当切换word1word2时,此查询不会将组合视为相同。也就是说,(1,2)(2,1)不同。但是,解决此问题就像调整联接一样简单:

SELECT * 
FROM cartesian
LEFT JOIN combinations ON
(combinations.first_id = cartesian.w1_id OR combinations.first_id = cartesian.w2_id)
AND
(combinations.second_id = cartesian.w1_id OR combinations.second_id = cartesian.w2_id)
WHERE combinations.first_id IS NULL

这是另一个选项。 在子查询中构建完整列表,并在组合表的外部左侧查找缺少的内容。

DECLARE @Words TABLE
(
[Id] INT
, [Word] NVARCHAR(200)
);
DECLARE @WordCombo TABLE
(
[Id1] INT
, [Id2] INT
);
INSERT INTO @Words (
[Id]
, [Word]
)
VALUES ( 1, N'Cat' )
, ( 2, N'Taco' )
, ( 3, N'Test' )
, ( 4, N'Cake' )
, ( 5, N'Apple' )
, ( 6, N'Pear' );
INSERT INTO @WordCombo (
[Id1]
, [Id2]
)
VALUES ( 1, 2 )
, ( 2, 6 )
, ( 5, 3 )
, ( 5, 1 );
--select from a sub query that builds out all combinations and then left outer to find what's missing in @WordCombo
SELECT          [fulllist].[Id1]
, [fulllist].[Id2]
FROM            (
--Rebuild full list
SELECT     [a].[Id] AS [Id1]
, [b].[Id] AS [Id2]
FROM       @Words [a]
INNER JOIN @Words [b]
ON 1 = 1
WHERE      [a].[Id] <> [b].[Id] --Would a word be combined with itself?
) AS [fulllist]
LEFT OUTER JOIN @WordCombo [wc]
ON [wc].[Id1] = [fulllist].[Id1]
AND [wc].[Id2] = [fulllist].[Id2]
WHERE           [wc].[Id1] IS NULL;

最新更新