I have written the following SQL query:
select `b`.`id` as `id`, `b`.`name`, count(a.institution_id) as total
from `premises` as `a`
left join `institutions` as `b` on `b`.`id` = `a`.`institution_id`
where exists (select id from rental_schedules as c where a.id = c.premises_id and public = 1 and public_enterprise = 0 and rental_schedule_status = 1 and date >= CURDATE())
group by `a`.`institution_id`
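My tables hold a lot of data (more than 1,000,000 rows), and this query takes 8-10 seconds. Is there any way to optimize it further?
Thanks in advance for your answers!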
The join to the institutions table could benefit from the following index:
CREATE INDEX inst_idx ON institutions (id, name);
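This index covers both the join and the SELECT clause for this table. The biggest improvement would come from the following index on the rental_schedules table: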
CREATE INDEX rental_idx ON rental_schedules (premises_id, public, public_enterprise, rental_schedule_status, date);
This index lets the EXISTS clause be evaluated quickly for each row produced by joining the first two tables.
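To check whether the optimizer actually uses the new index, the execution plan can be inspected with EXPLAIN. The statement below is only a sketch: it assumes rental_idx has been created on rental_schedules and keeps just the EXISTS part of the query so the plan is easier to read.

```sql
-- Sketch: inspect the plan for the EXISTS subquery on its own.
-- Assumes the rental_idx index has already been created on rental_schedules.
EXPLAIN
SELECT a.id
FROM premises AS a
WHERE EXISTS (SELECT 1
              FROM rental_schedules AS c
              WHERE c.premises_id = a.id
                AND c.public = 1
                AND c.public_enterprise = 0
                AND c.rental_schedule_status = 1
                AND c.date >= CURDATE());
```

If the index is picked up, the row for rental_schedules should list rental_idx in the key column, and Extra will typically contain "Using index". The exact plan depends on the MySQL version, since newer releases may rewrite the EXISTS subquery as a semijoin.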
In addition, I would rewrite your query to make it ANSI compliant, so that the column in the GROUP BY clause matches the SELECT clause:
SELECT
    b.id AS id,
    b.name,  -- allowed, assuming id is the primary key column of institutions
    COUNT(a.institution_id) AS total
FROM premises AS a
LEFT JOIN institutions AS b ON b.id = a.institution_id
WHERE EXISTS (SELECT 1 FROM rental_schedules AS c
              WHERE a.id = c.premises_id AND public = 1 AND
                    public_enterprise = 0 AND rental_schedule_status = 1 AND
                    date >= CURDATE())
GROUP BY
    b.id;
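Try to have the subquery processed in memory as far as possible. When there is not enough memory, the temporary table is created on disk instead, which costs a lot of time. The MySQL documentation describes it as follows: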
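The optimizer uses materialization to enable more efficient subquery processing. Materialization speeds up query execution by generating a subquery result as a temporary table, normally in memory. The first time MySQL needs the subquery result, it materializes that result into a temporary table. Any subsequent time the result is needed, MySQL refers again to the temporary table. The optimizer may index the table with a hash index to make lookups fast and inexpensive. The index contains unique values to eliminate duplicates and make the table smaller.
Subquery materialization uses an in-memory temporary table when possible, falling back to on-disk storage if the table becomes too large.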
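A rough way to see whether this applies to your server is to look at the relevant settings and status counters. The statements below are a sketch rather than a tuning recommendation: the 256 MB value is an arbitrary example, and on MySQL 8.0 the default TempTable engine is also governed by temptable_max_ram, so the two variables shown may not be the only limit.

```sql
-- Is subquery materialization enabled? Look for materialization=on in the output.
SELECT @@optimizer_switch;

-- Current limits for in-memory temporary tables, in bytes.
SELECT @@tmp_table_size, @@max_heap_table_size;

-- Example only: raise both limits to 256 MB for the current session.
SET SESSION tmp_table_size = 268435456;
SET SESSION max_heap_table_size = 268435456;

-- Run the query, then check how many temporary tables were created
-- and how many of them spilled to disk (Created_tmp_disk_tables).
SHOW SESSION STATUS LIKE 'Created_tmp%tables';
```

If Created_tmp_disk_tables grows each time the query runs, the temporary table does not fit in memory and the limits above are worth a closer look.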