为什么在子查询中使用OR会使查询变慢?



我正在使用MySQL,并有以下查询,我试图改进:

SELECT
*
FROM
overpayments AS op
JOIN payment_allocations AS overpayment_pa ON overpayment_pa.allocatable_id = op.id
AND overpayment_pa.allocatable_type = 'Overpayment'
JOIN (
SELECT
pa.payment_source_type,
pa.payment_source_id,
ft.conversion_rate
FROM
payment_allocations AS pa
LEFT JOIN line_items AS li ON pa.payment_source_id = li.id
LEFT JOIN credit_notes AS cn ON li.parent_document_id = cn.id
LEFT JOIN financial_transactions AS ft ON (
ft.commercial_document_id = pa.payment_source_id
AND ft.commercial_document_type = pa.payment_source_type
)
OR (
ft.commercial_document_id = cn.id
AND ft.commercial_document_type = 'CreditNote'
)
WHERE
pa.allocatable_type = 'Overpayment'
AND pa.company_id = 14792
AND ft.company_id = 14792
) AS op_bank_transaction_ft ON op_bank_transaction_ft.payment_source_id = overpayment_pa.payment_source_id
AND op_bank_transaction_ft.payment_source_type = overpayment_pa.payment_source_type;

运行时间为10s。通过删除子查询中的OR语句并使用COALESCE来获得结果,我能够将其改进到0.047s:

SELECT
*
FROM
overpayments AS op
JOIN payment_allocations AS overpayment_pa ON overpayment_pa.allocatable_id = op.id
AND overpayment_pa.allocatable_type = 'Overpayment'
JOIN (
SELECT
pa.payment_source_type,
pa.payment_source_id,
coalesce(ft_one.conversion_rate, ft_two.conversion_rate)
FROM
payment_allocations AS pa
LEFT JOIN line_items AS li ON pa.payment_source_id = li.id
LEFT JOIN credit_notes AS cn ON li.parent_document_id = cn.id
LEFT JOIN financial_transactions AS ft_one ON (
ft_one.commercial_document_id = pa.payment_source_id
AND ft_one.commercial_document_type = pa.payment_source_type
AND ft_one.company_id = 14792
)
LEFT JOIN financial_transactions AS ft_two ON (
ft_two.commercial_document_id = cn.id
AND ft_two.commercial_document_type = 'CreditNote'
AND ft_two.company_id = 14792
)
WHERE
pa.allocatable_type = 'Overpayment'
AND pa.company_id = 14792

) AS op_bank_transaction_ft ON op_bank_transaction_ft.payment_source_id = overpayment_pa.payment_source_id
AND op_bank_transaction_ft.payment_source_type = overpayment_pa.payment_source_type;

然而,我真的不明白为什么这有效?最初的子查询运行得非常快,只返回2个结果,那么为什么它会使查询速度降低这么多呢?解释第一个查询返回以下内容:

<表类># idselect_type表分区类型possible_keys键key_lenref行过滤额外FIELD13tbody><<tr>1简单parefindex_payment_allocations_on_payment_source_idindex_payment_allocations_on_company_idindex_payment_allocations_on_company_id5const19110.00使用在1简单overpayment_parefindex_payment_allocations_on_payment_source_idindex_payment_allocations_on_allocatable_idindex_payment_allocations_on_payment_source_id5rails.pa.payment_source_id13.42使用在1简单opeq_ref主要主要4rails.overpayment_pa.allocatable_id1100.001简单李eq_ref主要主要4rails.pa.payment_source_id1100.001简单cneq_ref主要主要8rails.li.parent_document_id1100.00使用的地方;使用指数1简单《金融时报》所有transactions_unique_by_commercial_doc125878780.00范围检查每条记录(指数映射:0 x2)

查看第一个EXPLAIN的最后一行的右侧。它没有使用索引,而且它必须扫描兆字节。这是缓慢的。第二个查询对查询的每一步都使用了索引,所以它要快得多。

如果你的第二个查询产生正确的结果,使用它,不要回头看。恭喜你!你已经优化了一个查询。

OR操作,特别是ON子句中的

OR操作,对于查询规划器模块来说比通常更难满足,因为它们通常意味着它必须取两个独立子查询的并集。看来在你的案子里,策划者选择了暴力手段。(蛮力===扫描多行)

如果不了解索引,就很难进一步帮助您。

阅读本文了解更多信息。https://use-the-index-luke.com

这些可能会进一步加速第二个公式:

overpayment_pa:
INDEX(payment_source_id, payment_source_type, allocatable_type, allocatable_id)
pa: INDEX(allocatable_type, company_id, payment_source_id,  payment_source_type)
financial_transactions:
INDEX(commercial_document_id, commercial_document_type, company_id, conversion_rate)

最新更新