我目前正在使用CodeIgniter框架开发一个PHP/MySQL应用程序。
我有一个相当长的查询,导致一些问题。当将日期范围更改为更长的时间段(例如30天)而不是默认的7天时,就会出现问题。查询时间大幅增加:1/2秒到90秒,但我只能假设这是由于数据大小的增加。
在我粘贴查询之前,下面是对这些表的快速解释:
- flagged_cases:唯一案例列表(主表)- 352行
- data_sources:数据源列表,每种情况下使用外键引用该表- 20行
- matches:一个case的文本匹配行(一对多关系,即一个case,多个match) - 22000行
- flagged_cases_keywords_hits:大小写id到关键字(和命中数)的映射- 2500行
- keywords:关键字列表- 121行
- reviewed_state: id/description for 3个状态,只检查这个查询reviewed_state = 1 - 3行
以下是查询,我意识到它相当大,但我认为一定有索引的潜在问题,不幸的是,我只是没有知识来完全排除故障,所以任何帮助都是感激的。
SELECT flagged_cases.id,
data_source_id,
title,
fetch_date,
publish_date,
case_id,
case_title,
case_link,
relevance_score,
(
SELECT group_concat(match_string_highlighted ORDER BY matches.id SEPARATOR "")
FROM matches
WHERE flagged_case_id=flagged_cases.id) AS all_matches,
reviewed_state_id,
(
SELECT group_concat(concat(k.keyword, " ", "x", cast(kh.hits AS CHAR), "") SEPARATOR "")
FROM flagged_cases_keywords_hits kh
JOIN keywords k
ON kh.keyword_id = k.id
WHERE kh.flagged_case_id = flagged_cases.id
ORDER BY k.weighting DESC) AS hitcount
FROM flagged_cases
JOIN data_sources
ON flagged_cases.data_source_id = data_sources.id
JOIN reviewed_state
ON flagged_cases.reviewed_state_id = reviewed_state.id
LEFT JOIN matches
ON flagged_cases.id = matches.flagged_case_id
WHERE reviewed_state_id = 1
AND data_source_id IN('1',
'3',
'4',
'5',
'6',
'7',
'8',
'9',
'10',
'11',
'12',
'13',
'14',
'15',
'16',
'17',
'18',
'19',
'20')
AND fetch_date >= '2015-05-10 00:00:00'
AND fetch_date <= '2015-05-17 23:59:59'
GROUP BY flagged_cases.id
ORDER BY title DESC
LIMIT 10;
作为做SHOW FULL PROCESSLIST的结果,我可以看到查询停留在"发送数据"状态,从一些研究中我可以看到基本上是MySQL获取和选择数据,所以我只能假设一定有一个缺失的索引或导致这个变慢的东西。
我也得到了查询的EXPLAIN,如下所示:
+----+--------------------+----------------+--------+----------------------------------+-----------------+---------+----------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+--------------------+----------------+--------+----------------------------------+-----------------+---------+----------------------------+------+----------------------------------------------+
| 1 | PRIMARY | reviewed_state | const | PRIMARY | PRIMARY | 4 | const | 1 | Using index; Using temporary; Using filesort |
| 1 | PRIMARY | data_sources | range | PRIMARY | PRIMARY | 4 | NULL | 19 | Using where |
| 1 | PRIMARY | flagged_cases | ref | data_source_id,reviewed_state_id | data_source_id | 4 | proactive.data_sources.id | 14 | Using where |
| 1 | PRIMARY | matches | ref | flagged_case_id | flagged_case_id | 4 | proactive.flagged_cases.id | 32 | Using index |
| 3 | DEPENDENT SUBQUERY | kh | ref | flagged_case_id,keyword_id | flagged_case_id | 5 | func | 3 | Using where; Using temporary |
| 3 | DEPENDENT SUBQUERY | k | eq_ref | PRIMARY | PRIMARY | 4 | proactive.kh.keyword_id | 1 | Using where |
| 2 | DEPENDENT SUBQUERY | matches | ref | flagged_case_id | flagged_case_id | 4 | func | 32 | |
+----+--------------------+----------------+--------+----------------------------------+-----------------+---------+----------------------------+------+----------------------------------------------+
任何帮助/建议/提示非常感谢!:)
您能尝试一下,看看是否有任何好处吗?
选择列表中的子查询将被替换为内联视图,这些视图根据连接到其他表的值分组。
select flagged_cases.id,
data_source_id,
title,
fetch_date,
publish_date,
case_id,
case_title,
case_link,
relevance_score,
v1.all_matches,
reviewed_state_id,
v2.hitcount
from flagged_cases
join data_sources
on flagged_cases.data_source_id = data_sources.id
join reviewed_state
on flagged_cases.reviewed_state_id = reviewed_state.id
join (
select group_concat(match_string_highlighted order by matches.id separator "") as all_matches
from matches
group by flagged_case_id
) v1
on v1.flagged_case_id = flagged_cases.id
join (
select group_concat(concat(k.keyword, " ", "x", cast(kh.hits as char), "") order by k.weighting desc separator "")
from flagged_cases_keywords_hits kh
join keywords k
on kh.keyword_id = k.id
group by kh.flagged_case_id
) v2
on v2.flagged_case_id = flagged_cases.id
left join matches
on flagged_cases.id = matches.flagged_case_id
where reviewed_state_id = 1
and data_source_id in('1','3','4','5', '6', '7', '8', '9', '10', '11', '12', '13', '14', '15', '16', '17', '18', '19', '20')
and fetch_date >= '2015-05-10 00:00:00'
and fetch_date <= '2015-05-17 23:59:59'
group by flagged_cases.id
order by title desc
limit 10;