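I have run into a big data problem on MySQL.
I have:
- a users table with 59033 rows, and
- a user_notes table with 8753 rows.
The problem appears when I search for users who have user notes within certain dates.
My query looks like this: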
SELECT u.*, rep.name as rep_name FROM users as u
LEFT JOIN users as rep on rep.id = u.add_user
LEFT JOIN authorization on authorization.id = u.authorization
LEFT JOIN user_situation_list on user_situation_list.user_situation_id = u.user_situation
WHERE
EXISTS(
select * from user_notes
where user_notes.note_user_id = u.id AND user_notes.create_date
BETWEEN "2017-10-20" AND "2017-10-22"
)
ORDER BY u.lp_modify_date DESC, u.id DESC
Turn it around: find the ids first, and deal with the joins afterwards.
SELECT u.*,
( SELECT rep.name
FROM users AS rep
WHERE rep.id = u.add_user ) AS rep_name
FROM (
SELECT DISTINCT note_user_id
FROM user_notes
WHERE create_date >= "2017-10-20"
AND create_date < "2017-10-20" + INTERVAL 3 DAY
) AS un
JOIN users AS u ON u.id = un.note_user_id
ORDER BY lp_modify_date DESC, id DESC
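Notes
- No need for GROUP BY;
- 2 of the tables did not seem to be used, so I removed them;
- I changed the date range to a half-open one (>= the start, < the start + 3 days); if create_date is a DATETIME, BETWEEN ... AND "2017-10-22" stops at midnight at the start of the 22nd;
- user_notes needs INDEX(create_date, note_user_id);
- Notice how I turned the LEFT JOIN into a subquery in the SELECT list.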
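As a rough sketch of the index note above: the index name idx_notes_date_user is made up for illustration, and create_date is assumed to be a DATETIME column. The EXPLAIN is only a way to check whether the derived table is resolved from the index; the actual plan depends on your data and MySQL version.

```sql
-- Hypothetical index name; the column order is the one suggested in the notes.
ALTER TABLE user_notes
    ADD INDEX idx_notes_date_user (create_date, note_user_id);

-- Inspect the plan for the inner query; ideally the Extra column
-- reports "Using index" (the query is answered from the index alone).
EXPLAIN
SELECT DISTINCT note_user_id
FROM user_notes
WHERE create_date >= '2017-10-20'
  AND create_date <  '2017-10-20' + INTERVAL 3 DAY;
```

With create_date as the leading column, the range condition can narrow the scan, and note_user_id is then available from the same index entry.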
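If there can be multiple rep_names, then the original query is "wrong", because GROUP BY will pick a random name. My answer can be "fixed" by changing rep.name to one of the following: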
MAX(rep.name) -- deliver only one; arbitrarily the max
GROUP_CONCAT(rep.name) -- deliver a commalist of names
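For illustration, this is roughly how the GROUP_CONCAT variant would slot into the rewritten query. It is a sketch against the same tables and columns as above, not a tested drop-in.

```sql
SELECT u.*,
       ( SELECT GROUP_CONCAT(rep.name)   -- or MAX(rep.name) to keep a single name
           FROM users AS rep
          WHERE rep.id = u.add_user ) AS rep_name
FROM (
    -- Find the ids of users with notes in the date range first.
    SELECT DISTINCT note_user_id
    FROM user_notes
    WHERE create_date >= '2017-10-20'
      AND create_date <  '2017-10-20' + INTERVAL 3 DAY
) AS un
JOIN users AS u ON u.id = un.note_user_id
ORDER BY u.lp_modify_date DESC, u.id DESC;
```

Because the subquery uses an aggregate without GROUP BY, it always yields exactly one row per user, so it stays a valid scalar subquery even if several rows match.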
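Rewriting the query to use a JOIN instead of the EXISTS check should speed it up. If you then group the results by the user id (u.id), you should get the same result: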
SELECT u.*, rep.name as rep_name FROM users as u
LEFT JOIN users as rep on rep.id = u.add_user
LEFT JOIN authorization on authorization.id = u.authorization
LEFT JOIN user_situation_list on user_situation_list.user_situation_id = u.user_situation
JOIN user_notes AS un
ON un.note_user_id = u.id
AND un.create_date BETWEEN "2017-10-20" AND "2017-10-22"
GROUP BY u.id
ORDER BY u.lp_modify_date DESC, u.id DESC
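One way to sanity-check that a rewrite returns the same set of users is to compare its row count with the number of distinct note owners in the range. This sketch assumes every note_user_id refers to an existing row in users; if orphaned notes exist, the count below will be higher.

```sql
-- Number of distinct users with at least one note in the range;
-- the grouped JOIN query should return this many rows.
SELECT COUNT(DISTINCT note_user_id) AS users_with_notes
FROM user_notes
WHERE create_date BETWEEN '2017-10-20' AND '2017-10-22';
```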