mysql ORDER BY与CASE-太慢，更快的方式

看看像这样的ORDER BY和CASE的各种答案，我发现我在这个遗留应用程序中被迫做的可能是一种专家方法；然而，当行不太重要时(100000或更多的行会导致10秒的页面加载)，速度太慢。

请注意，原始查询试图解决一个明显常见的问题，即查询分析师需要与正常排序方式相反的空排序日期。在这种情况下，datefirstprinted是递减的，但所有未打印的记录都应填充到列表的顶部。

OriginalQuery解决了这个问题，但问题的关键是避免派生列notprintedyet带来的filesort性能问题。

原始查询

SELECT SQL_NO_CACHE
id, daterun, datefirstprinted,
case datefirstprinted when "0000-00-00 00:00:00" then 1 else 0 end as notprintedyet
FROM
patientrecords
WHERE
dateuploaded <> '0000-00-00 00:00:00'
ORDER BY
notprintedyet desc,                                 /* ordered via alias */
datefirstprinted desc
LIMIT 10;

时间1.52s

我发现不在别名notprintedyet上排序会节省一点：

查询稍快

SELECT SQL_NO_CACHE
id, daterun, datefirstprinted,
case datefirstprinted when "0000-00-00 00:00:00" then 1 else 0 end as notprintedyet
FROM
patientrecords
WHERE
dateuploaded <> '0000-00-00 00:00:00'
ORDER BY
datefirstprinted = "0000-00-00 00:00:00" desc,      /* directly ordered */
datefirstprinted
LIMIT 10;

时间1.37s

最佳速度，但缺少所需的空日期排序第一

SELECT SQL_NO_CACHE
id, daterun, datefirstprinted,
case datefirstprinted when "0000-00-00 00:00:00" then 1 else 0 end as notprintedyet
FROM
patientrecords
WHERE
dateuploaded <> '0000-00-00 00:00:00'
ORDER BY                        
datefirstprinted                                     /* not ordered properly */
LIMIT 10;

时间0.48s

我尝试使用视图

create view notprinted_patientrecords as (
SELECT id, daterun, datefirstprinted, case datefirstprinted when "0000-00-00 00:00:00" then 1 else 0 end notprintedyet
FROM patientrecords
WHERE dateuploaded <> '0000-00-00 00:00:00'
);

不幸的是，当我运行explain时

explain select * from notprinted_patientrecords order by notprintedyet desc limit 10;

这表明我仍在使用filesort，并且需要1.51s，也就是说根本没有节省

如果datefirstprinted默认值为NULL会更快吗？

可能，但在这个遗留应用程序中，这可能比页面加载时间增加5秒造成的伤害更大

我们还能尝试什么？存储过程？功能？

更新

如建议@草莓-按案例订购

...
ORDER BY                        
case datefirstprinted when "0000-00-00 00:00:00" then 1 else 0 end, datefirstprinted
LIMIT 10;

时间1.52s

根据@e4c5的请求，explain输出：

*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: patientrecords
type: range
possible_keys: dateuploaded,uploads_report
key: dateuploaded
key_len: 5
ref: NULL
rows: 299095
Extra: Using index condition; Using filesort

除了没有正确排序，其具有以下差异

rows: 10
Extra: Using where

创建表语句

*************************** 1. row ***************************
Table: patientrecords
Create Table: CREATE TABLE `patientrecords` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`datecreated` datetime NOT NULL,
`dateuploaded` datetime NOT NULL,
`daterun` datetime NOT NULL,
`datebilled` datetime NOT NULL,
`datefirstprinted` datetime NOT NULL,
`datelastprinted` datetime NOT NULL,
`client` varchar(5) NOT NULL,
PRIMARY KEY (`id`),
KEY `dateuploaded` (`dateuploaded`),
KEY `daterun` (`daterun`),
KEY `uploads_report` (`dateuploaded`,`client`),
KEY `datefirstprinted` (`datefirstprinted`),
KEY `datelastprinted` (`datelastprinted`)
)

查看您的表，首先要注意的是以下索引是多余的

KEY `dateuploaded` (`dateuploaded`),

它的角色可以由这个来完成

KEY `uploads_report` (`dateuploaded`,`client`),

所以让我们放下dateuploaded键。目前还不清楚您是否真的在任何查询中使用了client列。如果你不这样做，我相信如下改变你的指数会大大加快

KEY `uploads_report` (`dateuploaded`,`datefirstprinted`,`client`),

这是因为mysql每个表只能使用一个索引。由于where子句中使用了dateuploaded列上的索引，因此无法使用datefirstprinted的索引。但是，如果将这两列合并到同一索引中，则它可以用于排序和where。

在你做了上面的索引之后，这个可能会被删除：

KEY `datefirstprinted` (`datefirstprinted`),

索引越少，插入和更新就越快。

根据@e4c5在级联索引方面学到的想法，我尝试在两列(基于where的列和基于case的order子句中使用的列)上添加一个键：

alter table
patientrecords
add index
printedvsuploaded (datefirstprinted, dateuploaded);

由于mysql继续使用索引dateuploaded，这最初没有任何效果。

然而，添加force index减少了查询时间：

SELECT SQL_NO_CACHE
id, daterun, datefirstprinted
FROM
patientrecords
FORCE INDEX (printedvsuploaded)
WHERE
dateuploaded <> '0000-00-00 00:00:00'
ORDER BY
case when datefirstprinted = "0000-00-00 00:00:00" then 1 else 0 end desc,
datefirstprinted
LIMIT 10;

时间0.64秒

值得注意的是，我同意@e4c5的观点，即额外的索引最终会导致写操作的性能受到影响；我指望其他路线图的开发来帮助减少索引数量。目前，实现这一点将把较大结果集的10秒页面负载减少到可管理的3秒范围，然后就是将要实现的解决方案

相关内容

最新更新

热门标签：