如何使用内部联接优化查询



我的mysql查询太慢,我不知道如何优化它。我的webapp无法加载这个查询,因为运行时间太长,而web服务器获取结果的时间有限。

SELECT rc.trial_id,
rc.created,
rc.date_registration,
rc.agemin_value,
rc.agemin_unit,
rc.agemax_value,
rc.agemax_unit,
rc.exclusion_criteria,
rc.study_design,
rc.expanded_access_program,
rc.number_of_arms,
rc.enrollment_start_actual,
rc.target_sample_size,
(select name from repository_institution where id = rc.primary_sponsor_id) as 
primary_sponsor,
(select label from vocabulary_studytype where id = rc.study_type_id) as study_type,
(select label from vocabulary_interventionassigment where id = 
rc.intervention_assignment_id) as intervention_assignment,
(select label from vocabulary_studypurpose where id = rc.purpose_id) as study_purpose,  
(select label from vocabulary_studymasking where id = rc.masking_id) as study_mask,
(select label from vocabulary_studyallocation where id = rc.allocation_id) as 
study_allocation,        
(select label from vocabulary_studyphase where id = rc.phase_id) as phase,
(select label from vocabulary_recruitmentstatus where id = rc.recruitment_status_id) as 
recruitment_status,
GROUP_CONCAT(vi.label) 
FROM
repository_clinicaltrial rc 
inner JOIN repository_clinicaltrial_i_code rcic ON rcic.clinicaltrial_id = rc.id JOIN 
vocabulary_interventioncode vi ON vi.id = rcic.interventioncode_id 
GROUP BY rc.id;

使用内部联接代替联接可能是一个解决方案吗?

将JOIN改为每行连续选择肯定会有所改进。此外,由于您使用的是MySQL,因此使用关键字"STRIGHT_JOIN";告诉MySQL按照我提供的顺序进行查询。自从你的";rc";表是主要的,所有其他的都是查找,这将使MySQL在该上下文中使用它,而不是希望其他一些查找表作为其余联接的基础。

SELECT STRAIGHT_JOIN
rc.trial_id,
rc.created,
rc.date_registration,
rc.agemin_value,
rc.agemin_unit,
rc.agemax_value,
rc.agemax_unit,
rc.exclusion_criteria,
rc.study_design,
rc.expanded_access_program,
rc.number_of_arms,
rc.enrollment_start_actual,
rc.target_sample_size,
ri.name primary_sponsor,
st.label study_type,
via.label intervention_assignment,
vsp.label study_purpose,
vsm.label study_mask,
vsa.label study_allocation,
vsph.label phase,
vrs.label recruitment_status,
GROUP_CONCAT(vi.label) 
FROM
repository_clinicaltrial rc 
JOIN repository_clinicaltrial_i_code rcic 
ON rc.id = rcic.clinicaltrial_id
JOIN vocabulary_interventioncode vi 
ON rcic.interventioncode_id = vi.id
JOIN repository_institution ri
on rc.primary_sponsor_id = ri.id
JOIN vocabulary_studytype st
on rc.study_type_id = st.id
JOIN vocabulary_interventionassigment via 
on rc.intervention_assignment_id = via.id
JOIN vocabulary_studypurpose vsp 
ON rc.purpose_id = vsp.id
JOIN vocabulary_studymasking vsm 
ON rc.masking_id = vsm.id
JOIN vocabulary_studyallocation vsa 
ON rc.allocation_id = vsa.id
JOIN vocabulary_studyphase vsph
ON rc.phase_id = vsph.id
JOIN vocabulary_recruitmentstatus vrs 
ON rc.recruitment_status_id = vrs.id 
GROUP BY 
rc.id;

最后一点。您正在使用GROUP BY并应用于GROUP_CONCT((,这是可以的。但是,正确的groupby表示您需要按所有非聚合列进行分组,在本例中,这是列表中的其他列。您可能知道这一点,并且基于";rc";关联列,但这样做不是很好的做法。

您的联接和子查询可能不是问题所在。假设表上有正确的索引,那么这些索引很快"正确的索引";意味着id列就是primary key——这是一个非常合理的假设。

我的猜测是GROUP BY是性能问题。因此,我建议在构建查询时不要使用`GROUP BY:

select . . .
(select group_concat(vi.label)
from repository_clinicaltrial_i_code rcic 
vocabulary_interventioncode vi 
on vi.id = rcic.interventioncode_id 
where rcic.clinicaltrial_id = rc.id
)
from repository_clinicaltrial rc ;

为此,您需要在以下位置建立索引:

  • repository_clinicaltrial_i_code(clinicaltrial_id, interventioncode_id)
  • vocabulary_interventioncode(id, label)

最新更新