我有一个MySQL表,我正在上面执行查询。查询在某些实例中需要很长时间 ~15 分钟才能返回结果,但在其他实例中,它会在毫秒内返回结果。 这两个查询仅在 where 子句中的列值上有所不同。
表语法
CREATE TABLE `tests` (
`id` varchar(36) NOT NULL,
`some_other_id` varchar(36) NOT NULL,
`col_1` varchar(64) NOT NULL,
`col_2` varchar(128) DEFAULT NULL,
`col_3` varchar(64) DEFAULT NULL,
`status` varchar(32) NOT NULL,
`created_at_epoch` bigint(20) NOT NULL,
`updated_at_epoch` bigint(20) NOT NULL,
`updated_by` varchar(64) NOT NULL,
`version` int(11) NOT NULL,
`col_4` text,
`col_5` varchar(64) DEFAULT NULL,
PRIMARY KEY (`id`),
KEY `some_other_id_col_1_col_2_idx` (`some_other_id`,`col_1`,`col_2`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
id和 some_other_id 是使用时间戳和随机字符创建的,之后,some_other_id的示例是"15632901521370150qGUCAQpVuUWK-bJg">
该表有 ~6000 万条记录,包含 ~56 GB 的数据。
请注意以下查询中some_other_id的值。
select test.id, test.col_3, test.col_5, test.created_at_epoch, test.col_2, test.col_1, test.col_4, test.status, test.some_other_id, test.updated_at_epoch, test.updated_by, test.version from tests test where test.some_other_id='**VAL_1**' and (test.status in ('activated')) and test.id>='' order by test.id limit 2;
--Executes within milliseconds.
--Explain plan gives key as "some_other_id_col_1_col_2_idx".
select test.id, test.col_3, test.col_5, test.created_at_epoch, test.col_2, test.col_1, test.col_4, test.status, test.some_other_id, test.updated_at_epoch, test.updated_by, test.version from tests test where test.some_other_id='**VAL_1**' and (test.status in ('activated')) and test.id>='' order by test.id limit 1;
--Takes ~14-15 minutes.
--Explain plan gives key as "PRIMARY".
select test.id, test.col_3, test.col_5, test.created_at_epoch, test.col_2, test.col_1, test.col_4, test.status, test.some_other_id, test.updated_at_epoch, test.updated_by, test.version from tests test where test.some_other_id='**VAL_1**' and (test.status in ('activated')) and test.id>='' order by test.id limit 3;
--Executes within milliseconds.
--Explain plan gives key as "some_other_id_col_1_col_2_idx".
select test.id, test.col_3, test.col_5, test.created_at_epoch, test.col_2, test.col_1, test.col_4, test.status, test.some_other_id, test.updated_at_epoch, test.updated_by, test.version from tests test where test.some_other_id='**VAL_2**' and (test.status in ('activated')) and test.id>='' order by test.id limit 2;
--Takes ~14-15 minutes.
--Explain plan gives key as "PRIMARY".
select test.id, test.col_3, test.col_5, test.created_at_epoch, test.col_2, test.col_1, test.col_4, test.status, test.some_other_id, test.updated_at_epoch, test.updated_by, test.version from tests test where test.some_other_id='**VAL_2**' and (test.status in ('activated')) order by test.id limit 2;
--Takes ~14-15 minutes.
--Explain plan gives key as "PRIMARY".
select test.id, test.col_3, test.col_5, test.created_at_epoch, test.col_2, test.col_1, test.col_4, test.status, test.some_other_id, test.updated_at_epoch, test.updated_by, test.version from tests test where test.some_other_id='**VAL_2**' and (test.status in ('activated')) and test.id>='' limit 2;
--Executes within milliseconds.
--Explain plan gives key as "some_other_id_col_1_col_2_idx".
我无法理解这里的行为,正在寻找一些关于这种情况如何发生的解释。 我正在使用 MySQL 5.6
添加此复合索引:
INDEX(status, some_other_id, id) -- in this order
对于 56GB 的数据,您应该认真考虑规范化和其他缩小表大小的技术。status
是这方面的主要候选人。TINYINT UNSIGNED
仅占用 1 个字节,并提供 256 个值。ENUM
可能是一个可行的替代方案。
updated_by
是另一个可能缩小的事情。
如果这些epochs
只是秒,请不要使用 8 字节BIGINT
。
要进一步调查性能异常,请提供每个性能异常EXPLAIN FORMAT=JSON SELECT ...
以及"优化器跟踪"。