MYSQL:使用group-by和内部联接优化MYSQL查询



我有一张表,里面有19489578条帖子。当我用select语句查询表时,按postings表的lockroomId字段分组,它的查询时间非常慢(>3秒)。这是故事表的模式:

CREATE TABLE `Story` (
`storyId` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`storyType` int(11) DEFAULT '0',
`parentStoryId` bigint(20) unsigned DEFAULT NULL,
`ownerId` bigint(20) unsigned NOT NULL,
`lockroomId` bigint(20) unsigned DEFAULT NULL,
`isHost` tinyint(1) DEFAULT '0',
`updatedAt` timestamp(6) NULL DEFAULT CURRENT_TIMESTAMP(6),
`createdAt` timestamp(6) NULL DEFAULT CURRENT_TIMESTAMP(6),
`message` text CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci,
`imageUrl` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`amount` int(11) DEFAULT NULL,
`location` varchar(45) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`privacy` tinyint(1) NOT NULL DEFAULT '0',
`minutes` int(11) DEFAULT '0',
`hasCoinBadge` tinyint(1) DEFAULT '0',
`hasFriendBadge` tinyint(1) DEFAULT '0',
`localDBId` text CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci,
`tagName` varchar(63) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT '',
`tagColor` varchar(15) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT '',
`tagId` bigint(20) DEFAULT NULL,
`numLikes` int(11) DEFAULT '0',
`numComments` int(11) DEFAULT '0',
`lastCommenterId` bigint(20) unsigned DEFAULT NULL,
`lastCommentTimestamp` timestamp(6) NULL DEFAULT CURRENT_TIMESTAMP(6),
`lastComment` varchar(256) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`secondLastCommenterId` bigint(20) unsigned DEFAULT NULL,
`secondLastCommentTimestamp` timestamp(6) NULL DEFAULT CURRENT_TIMESTAMP(6),
`secondLastComment` varchar(256) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT NULL,
`lastLikerId` bigint(20) unsigned DEFAULT NULL,
PRIMARY KEY (`storyId`),
UNIQUE KEY `StoryId_UNIQUE` (`storyId`),
KEY `ownerId` (`ownerId`),
KEY `parentStoryId` (`parentStoryId`),
KEY `updatedAt` (`updatedAt`),
KEY `ownerId_updatedAt` (`ownerId`,`updatedAt`),
KEY `updatedAt_ownerId_descending_index` (`updatedAt` DESC,`ownerId`),
KEY `lockroomId` (`lockroomId`),
CONSTRAINT `ownerId` FOREIGN KEY (`ownerId`) REFERENCES `User` (`userId`),
CONSTRAINT `parentStoryId` FOREIGN KEY (`parentStoryId`) REFERENCES `Story` (`storyId`)
) ENGINE=InnoDB AUTO_INCREMENT=19503337 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci
+----------------------------+---------------------+------+-----+----------------------+-------------------+
| Field                      | Type                | Null | Key | Default              | Extra             |
+----------------------------+---------------------+------+-----+----------------------+-------------------+
| storyId                    | bigint(20) unsigned | NO   | PRI | NULL                 | auto_increment    |
| storyType                  | int(11)             | YES  |     | 0                    |                   |
| parentStoryId              | bigint(20) unsigned | YES  | MUL | NULL                 |                   |
| ownerId                    | bigint(20) unsigned | NO   | MUL | NULL                 |                   |
| lockroomId                 | bigint(20) unsigned | YES  | MUL | NULL                 |                   |
| isHost                     | tinyint(1)          | YES  |     | 0                    |                   |
| updatedAt                  | timestamp(6)        | YES  | MUL | CURRENT_TIMESTAMP(6) | DEFAULT_GENERATED |
| createdAt                  | timestamp(6)        | YES  |     | CURRENT_TIMESTAMP(6) | DEFAULT_GENERATED |
| message                    | text                | YES  |     | NULL                 |                   |
| imageUrl                   | varchar(255)        | YES  |     | NULL                 |                   |
| amount                     | int(11)             | YES  |     | NULL                 |                   |
| location                   | varchar(45)         | YES  |     | NULL                 |                   |
| privacy                    | tinyint(1)          | NO   |     | 0                    |                   |
| minutes                    | int(11)             | YES  |     | 0                    |                   |
| hasCoinBadge               | tinyint(1)          | YES  |     | 0                    |                   |
| hasFriendBadge             | tinyint(1)          | YES  |     | 0                    |                   |
| localDBId                  | text                | YES  |     | NULL                 |                   |
| tagName                    | varchar(63)         | YES  |     |                      |                   |
| tagColor                   | varchar(15)         | YES  |     |                      |                   |
| tagId                      | bigint(20)          | YES  |     | NULL                 |                   |
| numLikes                   | int(11)             | YES  |     | 0                    |                   |
| numComments                | int(11)             | YES  |     | 0                    |                   |
| lastCommenterId            | bigint(20) unsigned | YES  |     | NULL                 |                   |
| lastCommentTimestamp       | timestamp(6)        | YES  |     | CURRENT_TIMESTAMP(6) | DEFAULT_GENERATED |
| lastComment                | varchar(256)        | YES  |     | NULL                 |                   |
| secondLastCommenterId      | bigint(20) unsigned | YES  |     | NULL                 |                   |
| secondLastCommentTimestamp | timestamp(6)        | YES  |     | CURRENT_TIMESTAMP(6) | DEFAULT_GENERATED |
| secondLastComment          | varchar(256)        | YES  |     | NULL                 |                   |
| lastLikerId                | bigint(20) unsigned | YES  |     | NULL                 |                   |
+----------------------------+---------------------+------+-----+----------------------+-------------------+

我还添加了lockroomId作为索引。这是故事表的索引:

+-------+------------+------------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name                           | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+-------+------------+------------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Story |          0 | PRIMARY                            |            1 | storyId       | A         |    18951234 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| Story |          0 | StoryId_UNIQUE                     |            1 | storyId       | A         |    19116608 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| Story |          1 | ownerId                            |            1 | ownerId       | A         |     1212686 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| Story |          1 | parentStoryId                      |            1 | parentStoryId | A         |           1 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| Story |          1 | updatedAt                          |            1 | updatedAt     | A         |    19022172 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| Story |          1 | ownerId_updatedAt                  |            1 | ownerId       | A         |      945172 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| Story |          1 | ownerId_updatedAt                  |            2 | updatedAt     | A         |    19112336 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| Story |          1 | updatedAt_ownerId_descending_index |            1 | updatedAt     | D         |    19091560 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| Story |          1 | updatedAt_ownerId_descending_index |            2 | ownerId       | A         |    18649716 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| Story |          1 | lockroomId                         |            1 | lockroomId    | A         |           1 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
+-------+------------+------------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+

这是一个需要>3秒执行。这个查询的作用是获取用户创建的具有相同lockroomId但不同storyId的jointStory。因此,我将故事按lockroomId分组。

SELECT 
*
FROM
((SELECT 
Story.*,
jointStory.guestIds,
jointStory.guestStoryIds,
jointStory.guestTrees,
jointStory.guestDurations,
jointStory.guestTagIds
FROM
Story
INNER JOIN (SELECT 
lockroomId,
GROUP_CONCAT(Story.ownerId) AS guestIds,
GROUP_CONCAT(Story.storyId) AS guestStoryIds,
GROUP_CONCAT(Story.imageUrl) AS guestTrees,
GROUP_CONCAT(Story.minutes) AS guestDurations,
GROUP_CONCAT(Story.tagId) AS guestTagIds
FROM
Story
WHERE
Story.storyType = 1
AND lockroomId IS NOT NULL
AND createdAt < FROM_UNIXTIME(1600444809)
GROUP BY Story.lockroomId
HAVING FIND_IN_SET(1349147, guestIds) ORDER BY createdAt DESC
LIMIT 10) jointStory ON jointStory.lockroomId = Story.lockroomId
WHERE
Story.storyType = 2
AND Story.lockroomId IS NOT NULL
AND Story.createdAt < FROM_UNIXTIME(1600444809)
ORDER BY createdAt DESC
LIMIT 10) UNION ALL SELECT 
Story.*, NULL, NULL, NULL, NULL, NULL
FROM
Story
WHERE
Story.storyType = 0
AND ownerId = 1349147
AND createdAt < FROM_UNIXTIME(1600444809)
ORDER BY createdAt DESC
LIMIT 10) stories
LEFT JOIN
StoryEngagement ON StoryEngagement.storyId = stories.storyId
AND StoryEngagement.userId = 1349147
ORDER BY createdAt DESC
LIMIT 10;

这个查询的explain扩展命令的结果显示MySQL正在使用filesort和索引条件:

+----+-------------+------------+------------+-------+---------------+-------------+---------+-----------------------------+--------+----------+---------------------------------------------------------------+
| id | select_type | table      | partitions | type  | possible_keys | key         | key_len | ref                         | rows   | filtered | Extra                                                         |
+----+-------------+------------+------------+-------+---------------+-------------+---------+-----------------------------+--------+----------+---------------------------------------------------------------+
|  1 | PRIMARY     | Story      | NULL       | range | lockroomId    | lockroomId  | 9       | NULL                        | 287540 |     3.33 | Using index condition; Using where; Using MRR; Using filesort |
|  1 | PRIMARY     | <derived2> | NULL       | ref   | <auto_key0>   | <auto_key0> | 9       | flora_test.Story.lockroomId |     10 |   100.00 | NULL                                                          |
|  2 | DERIVED     | Story      | NULL       | range | lockroomId    | lockroomId  | 9       | NULL                        | 287540 |     3.33 | Using index condition; Using where                            |
+----+-------------+------------+------------+-------+---------------+-------------+---------+-----------------------------+--------+----------+---------------------------------------------------------------+

如果您对如何优化此查询有任何想法,我们将不胜感激。如果有一种方法不需要创建索引并使此查询运行得更快,那就太好了,因为这是一个巨大的生产表,添加索引需要很长时间,而且添加一些新索引可能会导致其他查询运行得更慢。我已经尝试了一些事情,比如在groupbyfields(index(createdAt,lockroomId))上创建一个组合索引,并按照一些博客文章的建议使用FORCE INDEX(createdAt,lockroomId),但运行速度更慢。。。

更新

应用该建议后,以下是EXPLAIN SELECE的结果

+----+--------------+-----------------+------------+--------+------------------------------------------------------------------------+--------------------------+---------+-----------------------------+-------+----------+---------------------------------------------------------------------+
| id | select_type  | table           | partitions | type   | possible_keys                                                          | key                      | key_len | ref                         | rows  | filtered | Extra                                                               |
+----+--------------+-----------------+------------+--------+------------------------------------------------------------------------+--------------------------+---------+-----------------------------+-------+----------+---------------------------------------------------------------------+
|  1 | PRIMARY      | <derived2>      | NULL       | ALL    | NULL                                                                   | NULL                     | NULL    | NULL                        |    12 |   100.00 | Using filesort                                                      |
|  1 | PRIMARY      | StoryEngagement | NULL       | eq_ref | PRIMARY                                                                | PRIMARY                  | 16      | const,stories.storyId       |     1 |   100.00 | Using where                                                         |
|  2 | DERIVED      | <derived3>      | NULL       | ALL    | NULL                                                                   | NULL                     | NULL    | NULL                        |    10 |   100.00 | Using where; Using temporary; Using filesort                        |
|  2 | DERIVED      | Story           | NULL       | ref    | lockroomId,combine_lockroomId_Index,combine_story_index                | combine_story_index      | 14      | const,jointStory.lockroomId |    29 |    33.33 | Using index condition                                               |
|  3 | DERIVED      | Story           | NULL       | range  | lockroomId,combine_lockroomId_Index,combine_story_index                | combine_lockroomId_Index | 14      | NULL                        | 36332 |    33.33 | Using index condition; Using where; Using temporary; Using filesort |
|  4 | UNION        | Story           | NULL       | ref    | ownerId,ownerId_updatedAt,combine_lockroomId_Index,combine_story_index | ownerId                  | 8       | const                       |    15 |    16.66 | Using where                                                         |
| NULL | UNION RESULT | <union2,4>      | NULL       | ALL    | NULL                                                                   | NULL                     | NULL    | NULL                        |  NULL |     NULL | Using temporary; Using filesort                                     |
+----+--------------+-----------------+------------+--------+------------------------------------------------------------------------+--------------------------+---------+-----------------------------+-------+----------+---------------------------------------------------------------------+

以下是SHOW TABLE STATUS的结果

+-------+--------+---------+------------+----------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+--------------------+----------+----------------+---------+
| Name  | Engine | Version | Row_format | Rows     | Avg_row_length | Data_length | Max_data_length | Index_length | Data_free | Auto_increment | Create_time         | Update_time | Check_time | Collation          | Checksum | Create_options | Comment |
+-------+--------+---------+------------+----------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+--------------------+----------+----------------+---------+
| Story | InnoDB |      10 | Dynamic    | 19350090 |            201 |  3895443456 |               0 |   4337303552 |   4194304 |       20043428 | 2020-09-18 17:03:14 | NULL        | NULL       | utf8mb4_unicode_ci |     NULL |                |         |
+-------+--------+---------+------------+----------+----------------+-------------+-----------------+--------------+-----------+----------------+---------------------+-------------+------------+--------------------+----------+----------------+---------+

RAM的大小为2 GB。这是innodb_buffer_pool_size

+------------------------------------------+
| @@innodb_buffer_pool_size/1024/1024/1024 |
+------------------------------------------+
|                           1.000000000000 |
+------------------------------------------+

更改列的类型可能是一个好主意,但据我所知,在更改列类型时,表将被锁定。因此,我可能需要等到下一次数据库更新。

这是当前的索引集:

+-------+------------+------------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name                           | Seq_in_index | Column_name   | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+-------+------------+------------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Story |          0 | PRIMARY                            |            1 | storyId       | A         |    19350090 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| Story |          0 | StoryId_UNIQUE                     |            1 | storyId       | A         |    19350090 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| Story |          1 | ownerId                            |            1 | ownerId       | A         |     1255716 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| Story |          1 | parentStoryId                      |            1 | parentStoryId | A         |           1 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| Story |          1 | updatedAt                          |            1 | updatedAt     | A         |    19350090 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| Story |          1 | ownerId_updatedAt                  |            1 | ownerId       | A         |     1743714 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| Story |          1 | ownerId_updatedAt                  |            2 | updatedAt     | A         |    19350090 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| Story |          1 | updatedAt_ownerId_descending_index |            1 | updatedAt     | D         |    19139564 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| Story |          1 | updatedAt_ownerId_descending_index |            2 | ownerId       | A         |    18636190 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
| Story |          1 | lockroomId                         |            1 | lockroomId    | A         |      328860 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| Story |          1 | combine_lockroomId_Index           |            1 | storyType     | A         |         685 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| Story |          1 | combine_lockroomId_Index           |            2 | lockroomId    | A         |      221470 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| Story |          1 | combine_story_index                |            1 | storyType     | A         |         429 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| Story |          1 | combine_story_index                |            2 | lockroomId    | A         |      665208 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| Story |          1 | combine_story_index                |            3 | createdAt     | A         |    19350090 |     NULL |   NULL | YES  | BTREE      |         |               | YES     | NULL       |
| Story |          1 | combine_story_index                |            4 | ownerId       | A         |    19350090 |     NULL |   NULL |      | BTREE      |         |               | YES     | NULL       |
+-------+------------+------------------------------------+--------------+---------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+

Number of Like只是存储在列中的一个数字(Int)。以及只存储lastComment和secondLastComment的注释列。我们创建了另一个表Story来存储所有的注释。

CCD_ 7和CCD_。

非常感谢您的回复。

我认为lockroomId IS NOT NULL是不必要的,因为lockroomId > 0

索引:

(storyType, lockroomId)
(storyType, createdAt)

冗余(因为PRIMARY KEYUNIQUE KEY):

UNIQUE KEY `StoryId_UNIQUE` (`storyId`),

是否每个lockroomId都有一个storyType=1和=2?如果是这样的话,我想分3个步骤(而不是2个)进行这个查询:

  1. 做最少的工作来找到10个lookroomID
  2. JOIN故事获取Story.*
  3. 再次获得JOIN以获得其余的GROUP_CONCATs

以下是基本原理。

首先,您要获取整个表,构建GROUP_CONCATs,但除10之外的所有表最终都将被丢弃。相反,我希望只获取访客ID并过滤。第二,把它削成10排。

然后我会回去拿剩下的东西。

这有正确的洗手间ID列表吗?

SELECT  lockroomId, GROUP_CONCAT(Story.ownerId) AS guestIds
FROM  Story
WHERE  storyType = 1
AND  lockroomId > 0
AND  createdAt < FROM_UNIXTIME(1598882406)
GROUP BY  lockroomId
HAVING  FIND_IN_SET(<userId>, guestIds)
ORDER BY  createdAt DESC
LIMIT  10;

有了这个覆盖指数:

INDEX(storyType, lookroomId, createdAt, ownerId)

在我继续之前,请看看它运行的速度有多快,以及它是否得到了一个有效的lockroomId列表。

改进

你说这有帮助。请在索引更改后提供EXPLAIN SELECT

次要建议

你的桌子相当大。请提供SHOW TABLE STATUS、RAM大小和innodb_buffer_pool_size。如果可能,请检查MySQL是否正在交换。这将提供一个线索,说明您是否受I/O限制,从而导致焦点的转移。

我看到很多BIGINTs。CCD_ 24占用8个字节;CCD_ 25取4。如果涉及I/O,缩小数据大小有助于提高性能。

您是否总是需要显示";评论";?也许这6列可以移动到一个单独的表中?

移动";numlikes";到另一个表——这可能是有益的,尤其是当您在进行有问题的大选择时经常更新该列时。

对于minutes之类的东西,可以考虑更小的数据类型。SMALLINT UNSIGNED能撑2个月吗?(只有2个字节。)

你真的使用created_atupdated_at吗?(`TIMESTAMP(6)每个占用8个字节。)

当前的索引集是什么?让我们看看第二差的查询。(我不想在改进当前查询的同时破坏它。)

更多

只有2GB内存的innodb_buffer_pool_size = 1G——听起来你可能在交换。交换对性能非常不利。我建议将设定值降低到500米。

60-80%的建议适用于较大的服务器。操作系统占用一些空间。MySQL代码占用一些空间。其他数据结构占用一些空间。还有网络服务器。以及您的应用程序代码。这些加起来可能超过1GB。(我不太清楚。)这只剩下不到1GB的缓冲池。将其设置为500M,运行几天,然后如果有一些空闲RAM,您可以增加一些buffer_pool。

交换比使用较小的buffer_pool更不利于性能。

由于该表大约为8GB,因此缩小该表非常重要。请参阅上面的建议。

由于combine_lockroomId_Indexcombine_story_index的第一部分,因此DROPcombine_lockroomId_Index

最新更新