MySQL: slow join of a small (20k-row) table to a large (5-million-row) table via an index



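I have two tables, devices (20k rows) and device_logins (5 million rows). Each device_logins row has a device_id column with a foreign key and an index linking it to devices.

I'm trying to use MySQL to build a week-by-week list of which devices were used and from which applications. The query takes about 3 seconds to execute, which adds up quickly in the application. From what I've read, this is not enough data to justify it taking that long.

The schema is: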

[devices]
id int unsigned
user_id int unsigned; foreign to users; index
hardware_type varchar
os_type varchar
os_version varchar
first_use datetime
last_use datetime
deleted_at datetime null
[device_logins]
id int unsigned
user_id int unsigned; foreign to users; index
device_id int unsigned; foreign to devices; index 
application string
login_date datetime

The query is:


    SELECT GROUP_CONCAT(DISTINCT appLoginToDeviceInRange.application SEPARATOR ', ') AS dataSource,
           weekList.weekStartDate AS date,
           MIN(devicesInRange.id) AS eventId
    FROM (
        SELECT DATE_FORMAT(date, '%Y-%u') AS week,
               DATE_FORMAT(date - INTERVAL WEEKDAY(date) DAY, '%Y-%m-%d') AS weekStartDate,
               DATE_FORMAT((date - INTERVAL WEEKDAY(date) DAY) + INTERVAL 6 DAY, '%Y-%m-%d') AS weekEndDate
        FROM (
            SELECT '2021-07-11' - INTERVAL (a.a + (10 * b.a) + (100 * c.a) + (1000 * d.a)) DAY AS DATE
            FROM (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS a
            CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS b
            CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS c
            CROSS JOIN (SELECT 0 AS a UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) AS d
            ORDER BY DATE DESC
        ) AS `dateList`
        GROUP BY `week`
        ORDER BY `weekStartDate` DESC
    ) AS `weekList`
    INNER JOIN (
        SELECT `devices`.*
        FROM devices
        INNER JOIN `users`
            ON `users`.`id` = `devices`.`user_id`
            AND `users`.`id` IN (13368)
        WHERE `hardware_type` = 'MOBILE'
    ) AS `devicesInRange`
        ON `devicesInRange`.`first_use` <= `weekList`.`weekEndDate`
        AND `devicesInRange`.`last_use` >= `weekList`.`weekStartDate`
        AND (`devicesInRange`.`deleted_at` IS NULL OR `devicesInRange`.`deleted_at` >= `weekList`.`weekStartDate`)
    INNER JOIN (
        SELECT DISTINCT `device_id`, `application`, YEARWEEK(login_date) AS loginWeek
        FROM `device_logins`
        WHERE `device_id` IS NOT NULL
    ) AS `appLoginToDeviceInRange`
        ON `appLoginToDeviceInRange`.`device_id` = `devicesInRange`.`id`
        AND `appLoginToDeviceInRange`.`loginWeek` >= YEARWEEK(weekList.weekStartDate)
        AND `appLoginToDeviceInRange`.`loginWeek` <= YEARWEEK(weekList.weekEndDate)
    WHERE `weekList`.`weekStartDate` < '2021-07-16 15:02:09.176280'
    GROUP BY `os_type`, `week`
    ORDER BY `weekList`.`weekStartDate` DESC, `os_type` DESC
    LIMIT 20

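Removing the join to the 5-million-row table brings the query down to 80 ms, as you'd expect. I have already run mysqltuner and configured the server accordingly.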

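  • Pre-build dateList, extending well into the future, with a PRIMARY KEY on its date column.

  • Build and maintain a "summary" table containing counts for each day. Then use it to compute the desired day (or week). More: http://mysql.rjweb.org/doc.php/summarytables

  • If you might ever want "days", then use "DAY" instead of "week" in the suggestions above. Even reading daily summaries to get weekly totals will be quite fast. (And much faster than reading from the raw data as you are doing now.)

  • If '2021-07-16 15:02:09.176280' comes from NOW(6), you could simply say NOW(6), or perhaps CURDATE(). (No speed gain, just clarity.)

  • […] is usually inefficient; think about whether some other formulation would work.

  • If a week has no data, perhaps you want FROM weekList LEFT JOIN ... to provide zeros? (I think what you have simply leaves those weeks out.) (This is not for performance, but for "correctness".)

  • Please qualify hardware_type, os_type, etc. with the table they are in. That makes it easier to see what is going on, especially when trying to decide on the optimal indexes. (An INDEX cannot span two tables.)

  • Please provide SHOW CREATE TABLE so that I can help with indexing.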

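  • When x and y are different, GROUP BY x ORDER BY y needs an extra temp table and sort. Your

    GROUP BY  `os_type`, `week`
    ORDER BY  `weekList`.`weekStartDate` DESC, `os_type` DESC

could probably be changed to

    GROUP BY  `weekList`.`weekStartDate`,      `os_type`
    ORDER BY  `weekList`.`weekStartDate` DESC, `os_type` DESC

  • Fix those things; then come back for more advice if it is still "too slow". (There will be a lot of changes, so it would be best to start a new question.)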
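
The summary-table suggestion, together with the advice on qualifying columns and matching GROUP BY to ORDER BY, could be sketched roughly as follows. This is a minimal sketch, not a drop-in replacement for the query above: the table name device_login_daily, its column names, the VARCHAR length, and the refresh strategy are all assumptions made for illustration. It also ignores the first_use / last_use / deleted_at filters and does not produce rows for weeks without logins.

```sql
-- Hypothetical daily summary table: one row per device, application and day.
-- Table name, column names and the VARCHAR length are assumptions.
CREATE TABLE device_login_daily (
    login_day   DATE         NOT NULL,
    device_id   INT UNSIGNED NOT NULL,
    application VARCHAR(100) NOT NULL,
    login_count INT UNSIGNED NOT NULL,
    PRIMARY KEY (device_id, login_day, application)
);

-- Initial load from the raw table. Re-running it refreshes the counts.
-- In practice one would restrict this to recent days (assumed refresh strategy).
-- VALUES() here still works in MySQL 8.0 but is reported as deprecated there.
INSERT INTO device_login_daily (login_day, device_id, application, login_count)
SELECT DATE(login_date), device_id, application, COUNT(*)
FROM device_logins
WHERE device_id IS NOT NULL
  AND application IS NOT NULL
GROUP BY DATE(login_date), device_id, application
ON DUPLICATE KEY UPDATE login_count = VALUES(login_count);

-- Weekly list read from the summary instead of the 5-million-row table.
-- Every column is qualified with its table alias, and GROUP BY lists the
-- same columns as ORDER BY so a single sort can serve both.
SELECT d.os_type,
       s.login_day - INTERVAL WEEKDAY(s.login_day) DAY     AS week_start,
       GROUP_CONCAT(DISTINCT s.application SEPARATOR ', ') AS data_source
FROM devices AS d
JOIN device_login_daily AS s
    ON s.device_id = d.id
WHERE d.user_id = 13368
  AND d.hardware_type = 'MOBILE'
GROUP BY week_start, d.os_type
ORDER BY week_start DESC, d.os_type DESC
LIMIT 20;
```

The primary key starts with device_id because the final query reaches the summary through the join on that column. Whether this ordering is the best choice depends on the real SHOW CREATE TABLE output and on the other queries that would read the summary.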
