首先,表
调查
CREATE TABLE `surveys` (
`survey_id` int(11) NOT NULL AUTO_INCREMENT,
`survey_name` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`survey_id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
INSERT INTO `surveys` (`survey_id`, `survey_name`) VALUES
(1, 's1'),
(2, 's2');
调查响应
CREATE TABLE `survey_responses` (
`sr_id` int(10) NOT NULL AUTO_INCREMENT,
`sr_text` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`sr_tq_id` int(10) NOT NULL,
`sr_st_id` int(10) NOT NULL,
`sr_su_uid` int(10) NOT NULL,
PRIMARY KEY (`sr_id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
INSERT INTO `survey_responses` (`sr_id`, `sr_text`, `sr_tq_id`, `sr_st_id`, `sr_su_uid`) VALUES
(1, 'a', 3, 2, 3),
(2, 'b', 4, 2, 3);
Survey_topics
CREATE TABLE `survey_topics` (
`st_id` int(10) NOT NULL AUTO_INCREMENT,
`st_name` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`st_survey_id` int(10) NOT NULL,
PRIMARY KEY (`st_id`)
) ENGINE=InnoDB AUTO_INCREMENT=7 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
INSERT INTO `survey_topics` (`st_id`, `st_name`, `st_survey_id`) VALUES
(1, 't1', 1),
(2, 't2', 1),
(3, 't3', 1),
(4, 't4', 2),
(5, 't5', 2),
(6, 't6', 2);
Survey_users
CREATE TABLE `survey_users` (
`su_id` int(10) NOT NULL AUTO_INCREMENT,
`su_s_id` int(10) NOT NULL,
`su_uid` int(10) NOT NULL,
PRIMARY KEY (`su_id`)
) ENGINE=InnoDB AUTO_INCREMENT=4 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
INSERT INTO `survey_users` (`su_id`, `su_s_id`, `su_uid`) VALUES
(1, 1, 1),
(2, 1, 2),
(3, 2, 2);
主题问题
CREATE TABLE `topic_questions` (
`tq_id` int(11) NOT NULL AUTO_INCREMENT,
`tq_text` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
`tq_st_id` int(10) NOT NULL,
PRIMARY KEY (`tq_id`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
INSERT INTO `topic_questions` (`tq_id`, `tq_text`, `tq_st_id`) VALUES
(1, 'q1', 1),
(2, 'q2', 1),
(3, 'q3', 2),
(4, 'q4', 2);
用户
CREATE TABLE `users` (
`u_id` int(10) NOT NULL AUTO_INCREMENT,
`uname` varchar(100) COLLATE utf8_unicode_ci NOT NULL,
PRIMARY KEY (`u_id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci
INSERT INTO `users` (`u_id`, `uname`) VALUES
(1, 'Bob'),
(2, 'Gary');
现在的查询,
SELECT * FROM `surveys`
INNER JOIN survey_users ON survey_users.su_s_id = surveys.survey_id
INNER JOIN survey_topics ON survey_topics.st_survey_id = surveys.survey_id
INNER JOIN topic_questions ON survey_topics.st_id = topic_questions.tq_st_id
WHERE NOT EXISTS (
SELECT * FROM survey_responses a WHERE a.sr_tq_id = topic_questions.tq_id AND a.sr_st_id = survey_topics.st_id AND a.sr_su_uid = survey_users.su_uid
)
基本上,我试图获得一份调查列表,即使有一个用户错过了一个调查回复,即使这只是针对一个主题的一个问题。
该查询运行良好,但在包含数千个调查、主题、用户、问题和回复的数据库中运行速度极慢。
请有人提供一个更快的查询,我读到使用左联接而不是WHERE NOT EXISTS更有效吗?
感谢大家的帮助,我在实际数据库中有索引,这个例子没有任何索引。
谢谢。
考虑NOT IN与NOT EXISTS与LEFT JOIN/IS NULL的其他变体,其中性能可能有所不同:
LEFT JOIN/NULL
...
LEFT JOIN survey_responses sr
ON sr.sr_tq_id = topic_questions.tq_id
AND sr.sr_st_id = survey_topics.st_id
AND sr.sr_su_uid = survey_users.su_uid
WHERE sr.sr_tq_id IS NULL
AND sr.sr_st_id IS NULL
AND sr.sr_su_uid IS NULL
NOT IN(MySQL似乎支持多列IN(
...
WHERE (topic_questions.tq_id, survey_topics.st_id, survey_users.su_uid)
NOT IN (
SELECT a.sr_tq_id, a.sr_st_id, a.sr_su_uid
FROM survey_responses
)
NOT IN(由于多列而使用CTE(
WITH sub AS (
SELECT a.sr_tq_id, a.sr_st_id, a.sr_su_uid
FROM survey_responses
)
SELECT
...
WHERE topic_questions.tq_id NOT IN (SELECT a.str_tq_id FROM sub)
AND survey_topics.st_id NOT IN (SELECT a.sr_st_id FROM sub)
AND survey_users.su_uid NOT IN (SELECT a.sr_su_uid FROM sub)
也许可以尝试对topic_tquestions表进行LEFT OUTER JOIN,并在该表中包含问题的列。然后,您可以检查这些列中的NULL,以确定用户尚未回答的问题。
此外,您可能需要尝试将INNER联接切换为LEFT。它应该处理得更快。如果您需要筛选出未回答的项目、空用户等,请尝试在WHERE子句中进行筛选。
需要的综合指数:
survey_users: (su_s_id, su_uid)
survey_topics: (st_survey_id, st_id)
topic_questions: (tq_st_id, tq_id)
a: (sr_tq_id, sr_su_uid, sr_st_id)