Efficient way in SQL to match ANY in a large table

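I'm joining a small table to a very large one, and I want to return a distinct item if any row matches. The table is so large that some queries take hours when I think they should take seconds.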

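The problem is that I'm "iterating" over every entry in the second table. I'd like to be able to "break" as soon as a condition is met and return that value, rather than continuing through every account.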

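In the code below, I look up every row for each name I join on, even though I only return the DISTINCT name and don't care about the individual rows. After performing the INNER JOIN, how can I return the DISTINCT name as soon as the first instance of new_ex.data = ... is found?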

SELECT DISTINCT example_users.name
FROM (
    SELECT DISTINCT ex.user AS name
    FROM exampleTable ex
    WHERE ex.timestamp >= '2022-01-01'
    AND ex.group = 'test'
    -- assumed to filter ex: new_ex is not in scope inside this subquery
    AND ex.data = '123'
) AS example_users
INNER JOIN exampleTable new_ex ON example_users.name = new_ex.user
AND new_ex.timestamp >= '2022-01-01'
AND (
    new_ex.data = 'abc'
    OR new_ex.data = 'def'
    OR new_ex.data = 'ghi'
    -- ~10 more of these OR statements
)

Without seeing the data it's hard to be sure this can't be simplified further, but I think you can at least boil it down to:

select distinct ex.user as name
from exampleTable ex
where ex.timestamp >= '2022-01-01'
and ex.group = 'test'
-- assumed to filter ex: new_ex is only visible inside the subquery
and ex.data = '123'
and exists (
    -- a semi-join: the engine only needs one matching row per outer row
    select 1
    from exampleTable new_ex
    where new_ex.user = ex.user
    and new_ex.timestamp >= '2022-01-01'
    and new_ex.data in ('abc','def','ghi'...)
)
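
To try the EXISTS version on something small, here is a minimal sketch on toy data. It assumes PostgreSQL or SQLite, where double quotes delimit identifiers; "user", "group" and "timestamp" are quoted because they are reserved words in several dialects. The sample rows and the index name idx_example_user_data_ts are hypothetical, and whether such an index helps depends on your engine and data.

```sql
-- Hypothetical toy table mirroring the column names from the question.
CREATE TABLE exampleTable (
    "user"      VARCHAR(50),
    "group"     VARCHAR(50),
    "timestamp" DATE,
    data        VARCHAR(50)
);

INSERT INTO exampleTable ("user", "group", "timestamp", data) VALUES
    ('alice', 'test',  '2022-02-01', '123'),
    ('alice', 'test',  '2022-02-02', 'abc'),
    ('alice', 'test',  '2022-02-03', 'def'),
    ('bob',   'test',  '2022-02-01', '123'),
    ('bob',   'test',  '2022-02-02', 'zzz'),
    ('carol', 'other', '2022-02-01', '123'),
    ('carol', 'other', '2022-02-02', 'abc');

-- Assumption: an index leading with the correlated column lets the
-- EXISTS probe look up one user at a time instead of scanning the table.
CREATE INDEX idx_example_user_data_ts
    ON exampleTable ("user", data, "timestamp");

-- Semi-join: for each candidate row in ex, the subquery only has to
-- find one matching row in new_ex, not every matching row.
SELECT DISTINCT ex."user" AS name
FROM exampleTable ex
WHERE ex."timestamp" >= '2022-01-01'
  AND ex."group" = 'test'
  AND ex.data = '123'
  AND EXISTS (
      SELECT 1
      FROM exampleTable new_ex
      WHERE new_ex."user" = ex."user"
        AND new_ex."timestamp" >= '2022-01-01'
        AND new_ex.data IN ('abc', 'def', 'ghi')
  );
```

With these rows only alice comes back: bob has no row whose data is in the list, and carol is not in group 'test'. Unlike the INNER JOIN, alice's two matching rows ('abc' and 'def') never produce duplicate join rows that DISTINCT then has to collapse.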

Use the query below. A long chain of OR conditions can hurt performance; use IN instead.

select DISTINCT ex.user from exampleTable ex
INNER JOIN exampleTable new_ex on ex.user = new_ex.user
where ex.timestamp >= '2022-01-01'
AND ex.group = 'test'
AND new_ex.timestamp >= '2022-01-01'
AND new_ex.data in ('abc', 'def', 'ghi'); -- include all your values
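
As a quick sanity check that swapping the OR chain for IN does not change which rows match, here is a small sketch. It assumes PostgreSQL or SQLite (a CTE built from VALUES); the CTE name sample and its rows are made up for illustration.

```sql
-- Hypothetical sample values: two that are in the list, one that is not.
WITH sample(data) AS (
    VALUES ('abc'), ('def'), ('xyz')
)
SELECT data,
       (data = 'abc' OR data = 'def' OR data = 'ghi') AS via_or,
       (data IN ('abc', 'def', 'ghi'))                AS via_in
FROM sample;
```

The two computed columns agree on every row, so the rewrite is safe in terms of results. Whether the optimizer actually plans the two forms differently depends on the engine, so compare execution plans on your own data before relying on it for speed.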

You can also use the following query:

select DISTINCT ex.user from exampleTable ex
INNER JOIN (select distinct user, timestamp, data from exampleTable) new_ex on ex.user = new_ex.user
where ex.timestamp >= '2022-01-01'
AND ex.group = 'test'
AND new_ex.timestamp >= '2022-01-01'
AND new_ex.data in ('abc', 'def', 'ghi'); -- include all your values
