为什么Postgres的执行计划会根据where条件发生很大的变化

我试图执行相同的SQL，但对where子句具有不同的值。一个查询的处理时间明显长于另一个查询。我还观察到这两个查询的执行计划也不同，

Query1 and Execution Plan:

explain analyze
select t."postal_code"
from dev."postal_master" t
left join dev."premise_master" f
on t."primary_code" = f."primary_code"
and t."name" = f."name"
and t."final_code" = f."final_code"
where 1 = 1  and t."region" = 'US' 
and t."name" = 'UBQ' 
and t."accountModCode" = 'LTI' 
and t."modularity_code" = 'PHA'
group by t."postal_code", t."modularity_code", t."region",
t."feature", t."granularity"
Group  (cost=4.19..4.19 rows=1 width=38) (actual time=76411.456..76414.348 rows=11871 loops=1)
Group Key: t."postal_code", t."modularity_code", t."region", t."feature", t.granularity
->  Sort  (cost=4.19..4.19 rows=1 width=38) (actual time=76411.452..76412.045 rows=11879 loops=1)
Sort Key: t."postal_code", t."feature", t.granularity
Sort Method: quicksort  Memory: 2055kB
->  Nested Loop Left Join  (cost=0.17..4.19 rows=1 width=38) (actual time=45.373..76362.219 rows=11879 loops=1)
Join Filter: (((t."name")::text = (f."name")::text) AND ((t."primary_code")::text = (f."primary_code")::text) AND ((t."final_code")::text = (f."final_code")::text))
Rows Removed by Join Filter: 150642887
->  Index Scan using idx_postal_code_source on postal_master t  (cost=0.09..2.09 rows=1 width=72) (actual time=36.652..154.339 rows=11871 loops=1)
Index Cond: (("name")::text = 'UBQ'::text)
Filter: ((("region")::text = 'US'::text) AND (("accountModCode")::text = 'LTI'::text) AND (("modularity_code")::text = 'PHA'::text))
Rows Removed by Filter: 550164
->  Index Scan using idx_postal_master_source on premise_master f  (cost=0.08..2.09 rows=1 width=35) (actual time=0.016..3.720 rows=12690 loops=11871)
Index Cond: (("name")::text = 'UBQ'::text)
Planning Time: 1.196 ms
Execution Time: 76415.004 ms

Query2和执行计划:

explain analyze
select t."postal_code"
from dev."postal_master" t
left join dev."premise_master" f
on t."primary_code" = f."primary_code"
and t."name" = f."name"
and t."final_code" = f."final_code"
where 1 = 1  and t."region" = 'DE' 
and t."name" = 'EME' 
and t."accountModCode" = 'QEW' 
and t."modularity_code" = 'NFX'
group by t."postal_code", t."modularity_code", t."region",
t."feature", t."granularity"
Group  (cost=50302.96..50426.04 rows=1330 width=38) (actual time=170.687..184.772 rows=8230 loops=1)
Group Key: t."postal_code", t."modularity_code", t."region", t."feature", t.granularity
->  Gather Merge  (cost=50302.96..50423.27 rows=1108 width=38) (actual time=170.684..182.965 rows=8230 loops=1)
Workers Planned: 2
Workers Launched: 2
->  Group  (cost=49302.95..49304.62 rows=554 width=38) (actual time=164.446..165.613 rows=2743 loops=3)
Group Key: t."postal_code", t."modularity_code", t."region", t."feature", t.granularity
->  Sort  (cost=49302.95..49303.23 rows=554 width=38) (actual time=164.444..164.645 rows=3432 loops=3)
Sort Key: t."postal_code", t."feature", t.granularity
Sort Method: quicksort  Memory: 550kB
Worker 0:  Sort Method: quicksort  Memory: 318kB
Worker 1:  Sort Method: quicksort  Memory: 322kB
->  Nested Loop Left Join  (cost=1036.17..49297.90 rows=554 width=38) (actual time=2.143..148.372 rows=3432 loops=3)
->  Parallel Bitmap Heap Scan on territory_postal_mapping t  (cost=1018.37..38323.78 rows=554 width=72) (actual time=1.898..11.849 rows=2743 loops=3)
Recheck Cond: ((("accountModCode")::text = 'QEW'::text) AND (("region")::text = 'DE'::text) AND (("name")::text = 'EME'::text))
Filter: (("modularity_code")::text = 'NFX'::text)
Rows Removed by Filter: 5914
Heap Blocks: exact=2346
->  Bitmap Index Scan on territorypostal__source_region_mod  (cost=0.00..1018.31 rows=48088 width=0) (actual time=4.783..4.783 rows=25973 loops=1)
Index Cond: ((("accountModCode")::text = 'QEW'::text) AND (("region")::text = 'DE'::text) AND (("name")::text = 'EME'::text))
->  Bitmap Heap Scan on premise_master f  (cost=17.80..19.81 rows=1 width=35) (actual time=0.047..0.048 rows=1 loops=8230)
Recheck Cond: (((t."primary_code")::text = ("primary_code")::text) AND ((t."final_code")::text = ("final_code")::text))
Filter: ((("name")::text = 'EME'::text) AND ((t."name")::text = ("name")::text))
Heap Blocks: exact=1955
->  BitmapAnd  (cost=17.80..17.80 rows=1 width=0) (actual time=0.046..0.046 rows=0 loops=8230)
->  Bitmap Index Scan on premise_master__accountprimarypostal  (cost=0.00..1.95 rows=105 width=0) (actual time=0.008..0.008 rows=24 loops=8230)
Index Cond: ((t."primary_code")::text = ("primary_code")::text)
->  Bitmap Index Scan on premise_master__accountfinalterritorycode  (cost=0.00..15.80 rows=1403 width=0) (actual time=0.065..0.065 rows=559 loops=4568)
Index Cond: ((t."final_code")::text = ("final_code")::text)
Planning Time: 1.198 ms
Execution Time: 185.197 ms

我知道根据where条件会有不同的行数，但这是不同执行计划的唯一原因。还有，我怎样才能提高第一个查询的性能。

对于第一个查询的估计是完全错误的，所以PostgreSQL选择一个糟糕的计划也就不足为奇了。一个接一个地尝试这些方法，看看它们是否有用:

收集统计信息:
```
ANALYZE premise_master, postal_master;
```

计算更精确的统计数据:

ALTER TABLE premise_master ALTER name SET statistics 1000;
ALTER TABLE postal_master ALTER name SET statistics 1000;
ANALYZE premise_master, postal_master;

第一个查询中的估计值以如此糟糕的方式偏离，我怀疑存在异常问题，例如升级pg_upgrade之后忘记运行ANALYZE，或者您正在使用pg_stat_reset()擦除数据库统计信息。

如果不是这种情况，并且表的一个简单的ANALYZE就可以解决问题，那么问题的原因一定是自动分析在这些表上运行得不够频繁。您可以通过以下语句来调整autovacuum，使其更频繁地执行此操作:

ALTER TABLE premise_master SET (autovacuum_analyze_scale_factor = 0.01);

这将使PostgreSQL收集统计每当1%的表被改变。

每个EXPLAIN ANALYZE输出的第一行表明，计划器只期望第一个查询得到1行，而期望第二个查询得到1130行，因此这可能是它选择效率较低的查询计划的原因。这通常意味着表统计信息不是最新的，并且当它们最后一次运行时，没有多少行与第一次查询匹配(也许数据是按字母顺序加载的?)在这种情况下，修复方法是执行ANALYZE dev."postal_master"查询来刷新统计信息。

您也可以尝试完全删除GROUP BY子句(如果您的工具允许)。我可能读错了，但看起来它对输出影响不大。如果这导致不必要的重复，您可以使用select distinct t.postal_code来代替组by。

相关内容

最新更新

热门标签：