I need to compute statistics for a table. I wrote the following SQL:
SELECT
DISTINCT ("region"),
COUNT(*) as total,
COUNT(*) FILTER(WHERE t1.insert_status = 'success' AND t1.insert_status = 'success') as completed,
COUNT(*) FILTER(WHERE t1.insert_status IS NULL AND t1.insert_status IS NULL ) as waiting,
COUNT(*) FILTER(WHERE t1.insert_status = 'failed' ) as insert_failed
FROM xml_files t1
WHERE t1.section_name='payments'
AND processed_date >='2010-07-28' AND processed_date <='2021-08-28'
group by region
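My table is very large (50 million rows), and the query takes too long to return results. Is there any way to optimize this query?
Which optimization approaches are possible?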
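First, you almost never need SELECT DISTINCT together with GROUP BY, because GROUP BY already returns one row per region. Second, your query has a syntax error.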
Presumably you want:
SELECT region,
       COUNT(*) AS total,
       COUNT(*) FILTER (WHERE t1.insert_status = 'success') AS completed,
       COUNT(*) FILTER (WHERE t1.insert_status IS NULL) AS waiting,
       COUNT(*) FILTER (WHERE t1.insert_status = 'failed') AS insert_failed
FROM xml_files t1
WHERE t1.section_name = 'payments' AND
      t1.processed_date >= '2010-07-28' AND t1.processed_date <= '2021-08-28'
GROUP BY region;
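Then you want an index on xml_files(section_name, processed_date, region, insert_status). The equality column section_name comes first, followed by the range column processed_date; region and insert_status are included so that the query can potentially be answered from the index alone.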
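As a rough sketch of that suggestion, assuming the database is PostgreSQL (the FILTER syntax suggests it, but the question does not say so) and using a made-up index name, idx_xml_files_stats, the index could be created and the plan checked like this:

```sql
-- Hypothetical index name; the column order follows the index suggested above.
-- CONCURRENTLY avoids blocking writes during the build, but it cannot be run
-- inside a transaction block.
CREATE INDEX CONCURRENTLY idx_xml_files_stats
    ON xml_files (section_name, processed_date, region, insert_status);

-- Refresh planner statistics and the visibility map, which an index-only scan
-- depends on.
VACUUM (ANALYZE) xml_files;

-- Inspect the plan: ideally it reports an index-only scan on the new index
-- instead of a sequential scan over the whole table.
EXPLAIN (ANALYZE, BUFFERS)
SELECT region,
       COUNT(*) AS total,
       COUNT(*) FILTER (WHERE t1.insert_status = 'success') AS completed,
       COUNT(*) FILTER (WHERE t1.insert_status IS NULL) AS waiting,
       COUNT(*) FILTER (WHERE t1.insert_status = 'failed') AS insert_failed
FROM xml_files t1
WHERE t1.section_name = 'payments' AND
      t1.processed_date >= '2010-07-28' AND t1.processed_date <= '2021-08-28'
GROUP BY region;
```

If the plan still shows a sequential scan, the filter may simply match most of the table (the date range here spans about eleven years), in which case the gain would have to come from reading the narrower index rather than from skipping rows.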