Filtering and joining tables with 1 billion rows

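I have two tables, each with roughly 1B rows. I'm trying to query them so I can process the data and insert it into other tables. I don't need to process all ~1B rows at once, though; 10 or 100 rows at a time is enough to work with.

However, the join and the filter take a very long time to return data.

Neither table has a nonclustered index, but each has a clustered index on its primary key.

Example query: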

select Col1, Col2, Col3 
from 1b_table_1 t1
inner join (
select * 
from 1b_table_2 
where expression=condition
) t2
on t1.join_col = t2.join_col
where CAST(t2.timestamp as date) >= date_var1
and CAST(t2.timestamp as date) <= date_var2

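UPDATE: I tried adding a nonclustered index on 1b_table_1, but the problem I'm running into is that a script running elsewhere continuously inserts data into both tables. I can't create a new index, because building it would lock the table, the writes would start to fail, and data would be lost.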

Another update:

SELECT count(*) from 1b_table_1

~1.2B

SELECT count(*) from 1b_table_2

~22M

SELECT count(*) from 1b_table_2 where col like condition_string

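This ran for more than 5 minutes without returning a result. The column here is nvarchar(max)!!

Also, I can't change the structure of the tables or their indexes.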

First, the subquery is unnecessary. You can write the query as:

select Col1, Col2, Col3 
from 1b_table_1 t1 join
1b_table_2 t2
on t1.join_col = t2.join_col
where t2.expression = t2.condition and
t2.where_col1 = where_exp1 and
t2.where_col2 = where_exp2;
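The date filter from the question can be kept in this flattened form without wrapping the column in CAST. The following is a minimal sketch, not a tested solution: it assumes SQL Server, assumes the timestamp column is a datetime type, and models date_var1 and date_var2 as variables with made-up values. The table names are bracketed because an identifier that starts with a digit has to be delimited in T-SQL.

```sql
-- Sketch only: the variable values are hypothetical.
DECLARE @date_var1 date = '2020-01-01';
DECLARE @date_var2 date = '2020-01-31';

SELECT TOP (100)                 -- one small batch instead of the whole join
       Col1, Col2, Col3
FROM [1b_table_1] AS t1
JOIN [1b_table_2] AS t2
  ON t1.join_col = t2.join_col
WHERE t2.[timestamp] >= @date_var1                   -- from the start of the first day
  AND t2.[timestamp] <  DATEADD(DAY, 1, @date_var2)  -- up to, not including, the day after the last
ORDER BY t2.[timestamp];         -- makes "the first 100" well defined
```

The half-open range selects the same rows as the two CAST comparisons while leaving the column bare, so an index that includes it can be used for a range seek. Without such an index the ORDER BY still has to sort the matching rows, so the TOP (100) alone will not make the query fast.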

Then, for this query, you want indexes on:

1b_table_2(where_col1, where_col2, + columns in the "expression", join_col)
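As a rough illustration of the index suggestion, the statements below sketch how it might be written in T-SQL. The index names are hypothetical, the columns of the "expression" are left out because they are not known, and the second index on 1b_table_1(join_col) is an assumption added to support the join. WITH (ONLINE = ON) is shown because of the concern about blocking the insert script; it is only available in editions that support online index operations.

```sql
-- Sketch only: index names are made up; column names are the answer's placeholders.
CREATE NONCLUSTERED INDEX IX_1b_table_2_filter
    ON [1b_table_2] (where_col1, where_col2, join_col)
    WITH (ONLINE = ON);   -- build while the table stays available for writes

-- Assumption: an index on the join column of the other table.
CREATE NONCLUSTERED INDEX IX_1b_table_1_join
    ON [1b_table_1] (join_col)
    WITH (ONLINE = ON);
```

An online build still takes short locks at the start and end of the operation, and on tables of this size it can run for a long time, so it would be worth trying on a copy first. Note also that an nvarchar(max) column cannot be an index key column, so an index like this cannot help the LIKE filter on such a column.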
