我对以下查询有问题…载入要花很长时间……有时是10,有时是70,这对我来说简直太疯狂了。
我正在做的是抓取各种活动/表的计数。我已经读到,我不应该把条件放在ON子句,因为这些是外部连接,但是,如果我不把它们放在哪里,那么我最终没有结果……这是有意义的,因为WHERE是整个整体,我需要每个连接的结果。
话虽如此,为什么这个查询花了这么长时间?这个计划似乎运转正常。整个数据库只有700KiB,但这个查询总是写入磁盘,有时tmp文件的大小超过10gig…这怎么可能呢?根据计划,我得到了我所需要的,几乎没有任何行/信息。
使用innodb for db.
任何帮助都将是非常感激的,因为目前这个问题对我来说根本没有意义。
SELECT
users.user_id as DT_RowId,
users.username as username,
computers.computer_name as computer_name,
count(distinct log1.activity_id) as log1s,
count(distinct log2.activity_id) as log2s,
count(distinct log3.activity_id) as log3s,
count(distinct log4.activity_id) as log4s,
count(distinct log5.activity_id) as log5s,
count(distinct log6.activity_id) as log6s,
count(distinct log7.activity_id) as log7s,
count(distinct log8.activity_id) as log8s,
count(distinct log9.activity_id) as log9s,
count(distinct log10.activity_id) as log10s,
count(distinct log11.activity_id) as log11s
FROM computers
INNER JOIN users
on users.computer_id = computers.computer_id
LEFT JOIN log1
on log1.user_id = users.user_id
LEFT JOIN log2
on log2.user_id = users.user_id
LEFT JOIN log3
on log3.user_id = users.user_id
LEFT JOIN log4
on log4.user_id = users.user_id
LEFT JOIN log5
on log5.user_id = users.user_id
LEFT JOIN realtime_logs AS log6
on log6.user_id = users.user_id AND log6.event_title = 'test1'
LEFT JOIN realtime_logs AS log7
on log7.user_id = users.user_id AND log7.event_title = 'test2'
LEFT JOIN realtime_logs AS log8
on log8.user_id = users.user_id AND log8.event_title = 'test3'
LEFT JOIN realtime_logs AS log9
on log9.user_id = users.user_id AND log9.event_title = 'test4'
LEFT JOIN realtime_logs AS log10
on log10.user_id = users.user_id AND log10.event_title = 'test5'
LEFT JOIN realtime_logs AS log11
on log11.user_id = users.user_id AND log11.event_title = 'test6'
WHERE computers.account_id = :cw_account_id AND computers.status = :cw_status
GROUP BY users.user_id
计划:
computers 1 SIMPLE ref PRIMARY,unique_filter,status unique_filter 4 const 5 Using where; Using temporary; Using filesort
users 1 SIMPLE ref unique_filter unique_filter 4 stephen_inno.computers.computer_id 1 Using index
log1 1 SIMPLE ref user_id user_id 4 stephen_inno.users.user_id 1 Using index
log2 1 SIMPLE ref user_id user_id 4 stephen_inno.users.user_id 1 Using index
log3 1 SIMPLE ref user_id user_id 4 stephen_inno.users.user_id 1 Using index
log4 1 SIMPLE ref user_id user_id 4 stephen_inno.users.user_id 1 Using index
log5 1 SIMPLE ref user_id user_id 4 stephen_inno.users.user_id 1 Using index
log6 1 SIMPLE ref user_id user_id 771 stephen_inno.users.user_id,const 3 Using index
log7 1 SIMPLE ref user_id user_id 771 stephen_inno.users.user_id,const 3 Using index
log8 1 SIMPLE ref user_id user_id 771 stephen_inno.users.user_id,const 3 Using index
log9 1 SIMPLE ref user_id user_id 771 stephen_inno.users.user_id,const 3 Using index
log10 1 SIMPLE ref user_id user_id 771 stephen_inno.users.user_id,const 3 Using index
log11 1 SIMPLE ref user_id user_id 771 stephen_inno.users.user_id,const 3 Using index
编辑:使用下面Gordon的解决方案。我必须说,它现在运行得非常快,然而,我在计划中确实看到了一些令人惊讶的东西。下面是其中一个左连接表的计划:
<derived2> 1 PRIMARY ALL 7
log5 2 DERIVED index user_id 775 1304 Using index
派生类型为'all',读取7行。子查询具有索引类型,读取1304行。1304是这个数据库的总行数…不管它是否属于这个查询- WHERE computers.account_id = :cw_account_id AND computers.status = :cw_status
。所以,除非我读错了,否则这似乎是行不通的……我当然不想为每一个都读取整个数据库。
想法?
编辑2:
关于为这些连接扫描整个表的问题,我认为我可以在我的from中使用子查询,然后连接到它。我的想法是,这个子查询的结果将只包含我想要的用户在我的结果中…因此,连接将只连接这组用户。不幸的是,在查看计划之后,优化器仍然在连接上读取表中的每一行,而不是仅在active_users集上进行连接。
想法?
FROM
(SELECT
user_id,
computer_name,
username
FROM computers
INNER JOIN users
on users.computer_id = computers.computer_id
WHERE computers.account_id = :cw_account_id AND computers.status = :cw_status
) AS active_users
... the left join subqueries (Gordon's answer) follow joining on active_users.user_id
您正在连接多个维度,这是为每个user_id
创建笛卡尔积。编写此查询的更好方法是:
SELECT u.user_id as DT_RowId, u.username as username, c.computer_name as computer_name,
l1.cnt as log1s,
l2.cnt as log2s,
. . .
FROM computers c INNER JOIN
users u
on u.computer_id = c.computer_id LEFT JOIN
(select user_id, count(*) as cnt
from log1
group by user_id
) l1
on l1.user_id = u.user_id LEFT JOIN
(select user_id, count(*) as cnt
from log2
group by user_id
) l2
on l2.user_id = u.user_id
. . .
继续使用left join
来引入其余日志文件的聚合