试图完成一个web进程,我得到了错误canceling statement due to statement timeout
。调试代码库时发现,由于数据集太大,下面的查询超时了。对于如何提高以下查询性能的任何建议,我将不胜感激。
select userid, max(recent_activity_date) recent_activity_date
from (
SELECT id AS userid,
recent_logged_in AS recent_activity_date
FROM user
WHERE recent_logged_in > now() - cast('10 days' AS INTERVAL)
UNION
SELECT userid AS userid, max(recentaccessed) AS recent_activity_date
FROM tokencreds
WHERE recentaccessed > now() - cast('10 days' AS INTERVAL)
GROUP BY userid
) recent_activity
WHERE EXISTS(select 1 from user where id = userid and not deleted)
group by userid
order by userid;
每个表的索引:
- 表用户:
user_recent_logged_in on user (recent_logged_in)
- 表标记:
tokencreds_userid_token on tokencreds (userid, token)
tokencreds_userid_token是唯一的
很大程度上取决于数据的"布局"。有很多唱片"热门"吗?用户中是否有大量记录?在象征意义上?等等。
就我个人而言,我会选择这个:
SELECT userid, max(recent_activity_date) recent_activity_date _user
FROM (
SELECT id AS userid, MAX(recent_logged_in) AS recent_activity_date
FROM user
WHERE recent_logged_in > now() - cast('10 days' AS INTERVAL)
AND NOT deleted
UNION ALL
SELECT userid AS userid, MAX(recentaccessed) AS recent_activity_date
FROM tokencreds
WHERE recentaccessed > now() - cast('10 days' AS INTERVAL)
AND EXISTS(SELECT * FROM user WHERE id = userid AND NOT deleted)
) recent_activity
GROUP BY userid
ORDER BY userid;
-- indexes 'needed' on :
CREATE INDEX idx_userid_not_deleted ON user (userid) WHERE NOT deleted;
CREATE INDEX idx_recent_logged_in_user_id_not_deleted ON user (recent_logged_in, userid) WHERE not deleted;
CREATE INDEX idx_recentaccessed_user_id ON tokencreds (recentaccessed, userid);
而是YMMV为了得到更好的想法,您真的应该提供完整的EXPLAIN ANALYZE结果,否则我们只是盲目猜测。可能是系统会拒绝使用任何建议的索引,在这种情况下,你最好先删除,然后再偏离正轨。
推理:
UNION
将在子选择上引起一个隐式的不同,而您并不真正需要它,因为稍后的MAX()
和GROUP BY
会做同样的事情,为什么要做两次呢- 最好"尽快"过滤,而不是最终过滤IMHO(**(
**:请注意,这里的结果会有所不同!(但我认为我的"更好"(。例如,假设user_id 5 有3条记录
- user_id=5,deleted=true,recent_activity_date=12月10日
- user_id=5,deleted=false,recent_activity_date=12月8日
- user_id=5,deleted=false,recent_activity_date=12月5日
忽略tokencreds
表,user_id 5的结果在您的版本中为12月10日,而在我的版本中则为12月8日。检查您想要的要求!
编辑:建议索引中的错误
去掉union
和exists()
,并将它们组合成一个直的join
:
SELECT x.userid
, GREATEST(x.recent_logged_in, x.recent_activity_date ) AS recent_activity_date
FROM (
SELECT u.id AS userid
, u.recent_logged_in
, MAX(t.recentaccessed) AS recent_activity_date
FROM users u
LEFT JOIN tokencreds AS t ON t.userid = u.id
WHERE NOT u.deleted
AND (
u.recent_logged_in > now() - cast('10 days' AS INTERVAL)
OR t.recentaccessed > now() - cast('10 days' AS INTERVAL)
)
GROUP BY u.id
) x
ORDER by userid;