sqlite中内部联接的查询优化



我有一个包含3个表的数据库,masterInfo、primDescT和secDescT。

CREATE TABLE masterInfo (id INTEGER PRIMARY KEY AUTOINCREMENT,
primDescId INTEGER,
secDescId INTEGER,
category INTEGER,
UNIQUE(primDescId, secDescId, category));
CREATE TABLE primDescT (id INTEGER PRIMARY KEY,
primDesc nvarchar(512));
CREATE TABLE secDescT (id INTEGER PRIMARY KEY,
secDesc nvarchar(512));
INSERT INTO primDescT VALUES(1,'XXXX');
INSERT INTO primDescT VALUES(2,'YYYY');
INSERT INTO primDescT VALUES(3,'ZZZZ');
INSERT INTO primDescT VALUES(4,'SSSS');
INSERT INTO secDescT VALUES(1,'AAA');
INSERT INTO secDescT VALUES(2,'BBB');
INSERT INTO secDescT VALUES(3,'CCC');
INSERT INTO masterInfo VALUES(1,1,1,1);
INSERT INTO masterInfo VALUES(2,2,2,2);
INSERT INTO masterInfo VALUES(3,3,1,1);
INSERT INTO masterInfo VALUES(4,4,3,2);

表,masterInfo有1765137行,primDescT中有312210行,secDescT有105458行。

我已经使用下面的查询来获取结果。

SELECT m.id AS pId, 
primDesc AS pDescr, secDesc AS sDescr, category   AS category 
FROM masterInfo m
INNER JOIN primDescT ON primDescT.id = m.primDescId
INNER JOIN secDescT ON secDescT.id = m.secDescId
WHERE m.category IN ('1','2') ORDER BY pDescr ASC  LIMIT 100 OFFSET 0

以上查询需要8秒才能得到响应。

但如果我将偏移量设置为1756300,那么它将需要53秒。

SELECT m.id AS pId, 
primDesc AS pDescr, secDesc AS sDescr, category   AS category 
FROM masterInfo m
INNER JOIN primDescT ON primDescT.id = m.primDescId
INNER JOIN secDescT ON secDescT.id = m.secDescId
WHERE m.category IN ('1','2') ORDER BY pDescr ASC  LIMIT 100 OFFSET 1756300

如何优化以上查询以在3秒内提取?

这些查询的问题在于ORDER BY:在数据库确定100或1756400中最小的一个之前,必须计算所有结果。解释查询计划输出:

0,0,0,扫描表主信息AS m0,1,1,使用整数主键(rowid=?)搜索表primDescT0,2,2,使用整数主键(rowid=?)搜索表secDescT0,0,0,USE TEMP B-TREE FOR ORDER BY

要删除显式排序步骤,必须对该列进行索引:

CREATE INDEX pd ON primDescT(primDesc);

您必须强制数据库使用此索引(默认情况下,SQLite在估计查询成本时会忽略LIMIT,如果您想要所有结果,不使用pd索引会更快):

SELECT ...
FROM masterInfo m
INNER JOIN primDescT INDEXED BY pd ON primDescT.id = m.primDescId
--                   ^^^^^^^^^^^^^
INNER JOIN secDescT ON secDescT.id = m.secDescId
WHERE ...
ORDER BY pDescr ASC
LIMIT 100 OFFSET ...;
0,0,1,使用覆盖索引pd扫描表primDescT0,1,0,使用COVERING INDEX sqlite_autoindex_masterInfo_1(primDescId=?)搜索表主信息0,2,2,使用整数主键(rowid=?)搜索表secDescT

较大的OFFSET值总是很慢;数据库必须计算并丢弃所有这些行。

如果使用分页,则可以将OFFSET替换为排序列上的查找;这需要保存上一页的最后一个值:

SELECT ...
FROM masterInfo m
INNER JOIN primDescT INDEXED BY pd ON primDescT.id = m.primDescId
INNER JOIN secDescT ON secDescT.id = m.secDescId
WHERE primDesc > :LastValue
--    ^^^^^^^^^^^^^^^^^^^^^
AND ...
ORDER BY pDescr ASC
LIMIT 100 /* no offset */;

最新更新