我有这个查询,它需要很长时间(大约10分钟(才能完成。
SELECT COUNT(DISTINCT(column)) var,
column2 var2
FROM table
WHERE column<>'X' and
column2 between 'Y' and 'Z'
GROUP BY var2
ORDER BY var DESC
有关于如何优化速度的想法吗?我尝试了索引,但仍然很慢。也许它们设置不正确。如果重要的话,Y和Z是时间戳,X是这个查询根本不需要的东西,但在表中,因为它是同一应用程序的其他查询所需要的。这个表很大,有数百万行,而且还没有增长。
编辑:以下是示例的EXPLAIN结果:
mysql> EXPLAIN SELECT COUNT(DISTINCT(ip)) v, geo n from idevaff_iptracking where geo<>'XX' and stamp between '1525122000' and '1543615199' group by n order by v desc;
+------+-------------+--------------------+-------+------------------------+--------------+---------+------+---------+-----------------------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+------+-------------+--------------------+-------+------------------------+--------------+---------+------+---------+-----------------------------------------------------------+
| 1 | SIMPLE | idevaff_iptracking | range | stamp,geo,geo_stamp_ip | geo_stamp_ip | 9 | NULL | 3469323 | Using where; Using index; Using temporary; Using filesort |
+------+-------------+--------------------+-------+------------------------+--------------+---------+------+---------+-----------------------------------------------------------+
1 row in set (0.00 sec)
表中的位置如下:
id、acct_id、ip、refer、stamp、hit_time、hit_date、src1、src2、split、sub_id、tid1、tid2、tid3、tid4、target_url、geo。
索引如下:
mysql> SHOW INDEX FROM idevaff_iptracking
-> ;
+--------------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+--------------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| idevaff_iptracking | 0 | PRIMARY | 1 | id | A | 6775984 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_ip | 1 | acct_id | A | 2 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_ip | 2 | ip | A | 6775984 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | ip | 1 | ip | A | 6775984 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | stamp | 1 | stamp | A | 6775984 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id | 1 | acct_id | A | 4 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | geo | 1 | geo | A | 440 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | tid1 | 1 | tid1 | A | 276 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | tid2 | 1 | tid2 | A | 514 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | tid3 | 1 | tid3 | A | 34 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | tid4 | 1 | tid4 | A | 5623 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | acct_id_stamp_ip | 1 | acct_id | A | 744 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_stamp_ip | 2 | stamp | A | 6775984 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_stamp_ip | 3 | ip | A | 6775984 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | geo_stamp_ip | 1 | geo | A | 22362 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | geo_stamp_ip | 2 | stamp | A | 6775984 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | geo_stamp_ip | 3 | ip | A | 6775984 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid1_stamp | 1 | acct_id | A | 658 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid1_stamp | 2 | tid1 | A | 11866 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid1_stamp | 3 | stamp | A | 6775984 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid2_stamp | 1 | acct_id | A | 2 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid2_stamp | 2 | tid2 | A | 18666 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid2_stamp | 3 | stamp | A | 6775984 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid3_stamp | 1 | acct_id | A | 2 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid3_stamp | 2 | tid3 | A | 1832 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid3_stamp | 3 | stamp | A | 6775984 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid4_stamp | 1 | acct_id | A | 2 | NULL | NULL | | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid4_stamp | 2 | tid4 | A | 5060 | NULL | NULL | YES | BTREE | | |
| idevaff_iptracking | 1 | acct_id_tid4_stamp | 3 | stamp | A | 6775984 | NULL | NULL | | BTREE | | |
+--------------------+------------+--------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
29 rows in set (0.00 sec)
添加此复合索引:
INDEX(column2, column)
如果这还不够,我们需要查看SHOW CREATE TABLE
以便进一步讨论。(geo_stamp_ip
没有那么好。(
在列之间展开数组(tid's
(通常是错误的。
EXPLAIN FORMAT=JSON
SELECT COUNT(DISTINCT ip) v, geo n
from idevaff_iptracking
where geo<>'XX'
and stamp between '1525122000' AND '1543615199'
group by n
order by v desc;
有些索引是多余的。通常,如果您有INDEX(a,b)
,则可以删除INDEX(a)
。(例如:acct_id_ip
(