Joining two MySQL temporary tables is 50x slower than joining one temporary table to a regular table with a WHERE clause

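I have two sets and I'm trying to find their intersection (two columns must match). I found that joining two temporary tables performs about 50 times slower than joining a single temporary table against the original table. This makes no sense to me; maybe someone can enlighten me?

Here is how I wrote it with two temporary tables: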

CREATE TEMPORARY TABLE attendees (
    event_id SMALLINT(5) UNSIGNED,
    person_id INT(10) UNSIGNED NOT NULL,
    KEY(event_id),
    KEY(person_id)
);
INSERT INTO attendees (event_id, person_id) 
    SELECT event_id, person_id
    FROM attendance WHERE year=2013
    GROUP BY event_id, person_id;
CREATE TEMPORARY TABLE invitees (
    event_id SMALLINT(5) UNSIGNED,
    person_id INT(10) UNSIGNED NOT NULL,
    KEY(event_id),
    KEY(person_id)
);
INSERT INTO invitees (event_id, person_id)
    SELECT event_id, person_id
    FROM invitations WHERE year=2013
    GROUP BY event_id, person_id;
SELECT i.event_id, COUNT(DISTINCT i.person_id)
    FROM attendees AS a
    INNER JOIN invitees AS i
        ON a.person_id = i.person_id AND a.event_id = i.event_id
    GROUP BY i.event_id;

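Each of these two temporary tables has fewer than 2,000 rows, yet the final query takes 2.5 seconds on my laptop. I don't understand how that is possible.

On the other hand, with the implementation below, the final query takes only 0.05 seconds, even though it accesses the full invitations table (~100,000 rows):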

CREATE TEMPORARY TABLE attendees (
    event_id SMALLINT(5) UNSIGNED,
    person_id INT(10) UNSIGNED NOT NULL,
    KEY(event_id),
    KEY(person_id)
);
INSERT INTO attendees (event_id, person_id) 
    SELECT event_id, person_id
    FROM attendance WHERE year=2013
    GROUP BY event_id, person_id;
SELECT i.event_id, COUNT(DISTINCT i.person_id)
    FROM attendees AS a
    INNER JOIN invitations AS i
        ON a.person_id = i.person_id AND a.event_id = i.event_id
    WHERE i.year=2013
    GROUP BY i.event_id;

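It's worth noting that the original tables (attendance and invitations) both have indexes on event_id, person_id, and year. The reason I'm writing such convoluted code in the first place is that some people attended events they weren't invited to, and I have to compute counts for every part of the Venn diagram of these people (attended and invited, attended but not invited, invited but did not attend, and neither).

I guess my question is: what is going on here that makes the second version so much faster?

My server version is 5.5.36 MySQL Community Server (5.6 has a few odd behaviors that break my site).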

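MySQL can generally use only one index per table in a join. With separate indexes on person_id and event_id, it can look rows up by one of the two columns and then has to check the other column row by row. Instead of separate indexes on person_id and event_id, create a composite index on both temporary tables: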

CREATE TEMPORARY TABLE attendees (
    event_id SMALLINT(5) UNSIGNED,
    person_id INT(10) UNSIGNED NOT NULL,
    KEY(event_id, person_id)
);
CREATE TEMPORARY TABLE invitees (
    event_id SMALLINT(5) UNSIGNED,
    person_id INT(10) UNSIGNED NOT NULL,
    KEY(event_id, person_id)
);

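I suspect the original invitations table (the one joined in the fast query) has an index like this, which would make the join against that table much faster.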
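If the temporary tables already exist in the session, a rough way to try this without recreating them is to add the composite key in place and then look at the query plan. The following is only a sketch: the index name idx_event_person is made up for illustration, and the plan you actually get depends on your data and server version.

```sql
-- Assumes the attendees and invitees temporary tables from the question
-- already exist in the current session.
-- idx_event_person is a hypothetical index name, not one from the question.
ALTER TABLE attendees ADD KEY idx_event_person (event_id, person_id);
ALTER TABLE invitees  ADD KEY idx_event_person (event_id, person_id);

-- Show which index MySQL chooses for each table in the slow query.
EXPLAIN
SELECT i.event_id, COUNT(DISTINCT i.person_id)
    FROM attendees AS a
    INNER JOIN invitees AS i
        ON a.person_id = i.person_id AND a.event_id = i.event_id
    GROUP BY i.event_id;

-- List the indexes on the permanent table to check the guess above.
SHOW INDEX FROM invitations;
```

If the composite index is being used, I would expect the key column of the EXPLAIN output to name it for the joined table, with both join columns listed under ref. In the SHOW INDEX output, a composite index appears as several rows sharing one Key_name with increasing Seq_in_index values.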
