在Postgresql查询的过滤条件中,对列的字符串操作如何选择计划



我正在优化查询,与愚蠢的运气,我尝试了一些东西,它改善了查询,但我无法解释为什么。下面是性能较差的查询

with ctedata1 as(
select
sum(total_visit_count) as total_visit_count,
sum(sh_visit_count) as sh_visit_count,
sum(ec_visit_count) as ec_visit_count,
sum(total_like_count) as total_like_count,
sum(sh_like_count) as sh_like_count,
sum(ec_like_count) as ec_like_count,
sum(total_order_count) as total_order_count,
sum(sh_order_count) as sh_order_count,
sum(ec_order_count) as ec_order_count,
sum(total_sales_amount) as total_sales_amount,
sum(sh_sales_amount) as sh_sales_amount,
sum(ec_sales_amount) as ec_sales_amount,
sum(ec_order_online_count) as ec_order_online_count,
sum(ec_sales_online_amount) as ec_sales_online_amount,
sum(ec_order_in_store_count) as ec_order_in_store_count,
sum(ec_sales_in_store_amount) as ec_sales_in_store_amount,
table2.im_name,
table2.brand as kpibrand,
table2.id_region as kpiregion
from
table2
where
deleted_at is null
and id_region = any('{1}')
group by
im_name,
kpiregion,
kpibrand ),
ctedata2 as (
select
ctedata1.*,
rank() over (partition by (kpiregion,
kpibrand)
order by
coalesce(ctedata1.total_sales_amount, 0) desc) rank,
count(*) over (partition by (kpiregion,
kpibrand)) as total_count
from
ctedata1 )
select
table1.id_pf_item,
table1.product_id,
table1.color_code,
table1.l1_code,
table1.local_title as product_name,
table1.id_region,
table1.gender,
case
when table1.created_at is null then '1970/01/01 00:00:00'
else table1.created_at
end as created_at,
(
select
count(distinct id_outfit)
from
table3
left join table4 on
table3.id_item = table4.id_item
and table4.deleted_at is null
where
table3.deleted_at is null
and table3.id_pf_item = table1.id_pf_item) as outfit_count,
count(*) over() as total_matched,
case
when table1.v8_im_name = '' then table1.im_name
else table1.v8_im_name
end as im_name,
case
when table1.id_region != 1 then null
else
case
when table1.sales_start_at is null then '1970/01/01 00:00:00'
else table1.sales_start_at
end
end as sales_start_date,
table1.category_ids,
array_to_string(table1.intermediate_category_ids, ','),
table1.image_url,
table1.brand,
table1.pdp_url,
coalesce(ctedata2.total_visit_count, 0) as total_visit_count,
coalesce(ctedata2.sh_visit_count, 0) as sh_visit_count,
coalesce(ctedata2.ec_visit_count, 0) as ec_visit_count,
coalesce(ctedata2.total_like_count, 0) as total_like_count,
coalesce(ctedata2.sh_like_count, 0) as sh_like_count,
coalesce(ctedata2.ec_like_count, 0) as ec_like_count,
coalesce(ctedata2.total_order_count, 0) as total_order_count,
coalesce(ctedata2.sh_order_count, 0) as sh_order_count,
coalesce(ctedata2.ec_order_count, 0) as ec_order_count,
coalesce(ctedata2.total_sales_amount, 0) as total_sales_amount,
coalesce(ctedata2.sh_sales_amount, 0) as sh_sales_amount,
coalesce(ctedata2.ec_sales_amount, 0) as ec_sales_amount,
coalesce(ctedata2.ec_order_online_count, 0) as ec_order_online_count,
coalesce(ctedata2.ec_sales_online_amount, 0) as ec_sales_online_amount,
coalesce(ctedata2.ec_order_in_store_count, 0) as ec_order_in_store_count,
coalesce(ctedata2.ec_sales_in_store_amount, 0) as ec_sales_in_store_amount,
ctedata2.rank,
ctedata2.total_count,
table1.department,
table1.seasons
from
table1
left join ctedata2 on
table1.im_name = ctedata2.im_name
and table1.brand = ctedata2.kpibrand
where
table1.deleted_at is null
and table1.id_region = any('{1}')
and lower(table1.brand) = any('{"brand1","brand2"}')
and 'season1' = any(lower(seasons::text)::text[])
and table1.department = 'Department1'
order by
total_sales_amount desc offset 0
limit 100

上述查询的解释输出为

QUERY PLAN                                                                                                                                                                                                                                     
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=172326.55..173435.38 rows=1 width=952) (actual time=85664.201..85665.970 rows=100 loops=1)                                                                                                                                        
CTE ctedata1                                                                                                                                                                                                                                  
->  GroupAggregate  (cost=0.42..80478.71 rows=43468 width=530) (actual time=0.063..708.069 rows=73121 loops=1)                                                                                                                             
Group Key: table2.im_name, table2.id_region, table2.brand                                                                                                                                                    
->  Index Scan using udx_table2_im_name_id_region_brand_target_date_key on table2  (cost=0.42..59699.18 rows=391708 width=146) (actual time=0.029..308.582 rows=391779 loops=1)                                      
Filter: ((deleted_at IS NULL) AND (id_region = ANY ('{1}'::integer[])))                                                                                                                                                        
Rows Removed by Filter: 20415                                                                                                                                                                                                  
CTE ctedata2                                                                                                                                                                                                                             
->  WindowAgg  (cost=16104.06..17842.78 rows=43468 width=628) (actual time=1012.994..1082.057 rows=73121 loops=1)                                                                                                                          
->  WindowAgg  (cost=16104.06..17082.09 rows=43468 width=620) (actual time=945.755..1014.656 rows=73121 loops=1)                                                                                                                     
->  Sort  (cost=16104.06..16212.73 rows=43468 width=612) (actual time=945.747..963.254 rows=73121 loops=1)                                                                                                                     
Sort Key: ctedata1.kpiregion, ctedata1.kpibrand, (COALESCE(ctedata1.total_sales_amount, '0'::numeric)) DESC                                                                                                                 
Sort Method: external merge  Disk: 6536kB                                                                                                                                                                                
->  CTE Scan on ctedata1  (cost=0.00..869.36 rows=43468 width=612) (actual time=0.069..824.841 rows=73121 loops=1)                                                                                                        
->  Result  (cost=74005.05..75113.88 rows=1 width=952) (actual time=85664.199..85665.950 rows=100 loops=1)                                                                                                                                   
->  Sort  (cost=74005.05..74005.05 rows=1 width=944) (actual time=85664.072..85664.089 rows=100 loops=1)                                                                                                                               
Sort Key: (COALESCE(ctedata2.total_sales_amount, '0'::numeric)) DESC                                                                                                                                                         
Sort Method: top-N heapsort  Memory: 76kB                                                                                                                                                                                        
->  WindowAgg  (cost=10960.95..74005.04 rows=1 width=944) (actual time=85658.049..85661.393 rows=3151 loops=1)                                                                                                                   
->  Nested Loop Left Join  (cost=10960.95..74005.02 rows=1 width=927) (actual time=1075.219..85643.595 rows=3151 loops=1)                                                                                                  
Join Filter: (((table1.im_name)::text = ctedata2.im_name) AND ((table1.brand)::text = ctedata2.kpibrand))                                                                                                  
Rows Removed by Join Filter: 230402986                                                                                                                                                                               
->  Bitmap Heap Scan on table1  (cost=10960.95..72483.64 rows=1 width=399) (actual time=45.466..278.376 rows=3151 loops=1)                                                                                          
Recheck Cond: (id_region = ANY ('{1}'::integer[]))                                                                                                                                                             
Filter: ((deleted_at IS NULL) AND (department = 'Department1'::text) AND (lower((brand)::text) = ANY ('{brand1, brand2}'::text[])) AND ('season1'::text = ANY ((lower((seasons)::text))::text[])))
Rows Removed by Filter: 106335                                                                                                                                                                                 
Heap Blocks: exact=42899                                                                                                                                                                                       
->  Bitmap Index Scan on table1_im_name_id_region_key  (cost=0.00..10960.94 rows=110619 width=0) (actual time=38.307..38.307 rows=109486 loops=1)                                                             
Index Cond: (id_region = ANY ('{1}'::integer[]))                                                                                                                                                         
->  CTE Scan on ctedata2  (cost=0.00..869.36 rows=43468 width=592) (actual time=0.325..21.721 rows=73121 loops=3151)                                                                                             
SubPlan 3                                                                                                                                                                                                                              
->  Aggregate  (cost=1108.80..1108.81 rows=1 width=8) (actual time=0.018..0.018 rows=1 loops=100)                                                                                                                                    
->  Nested Loop Left Join  (cost=5.57..1108.57 rows=93 width=4) (actual time=0.007..0.016 rows=3 loops=100)                                                                                                                    
->  Bitmap Heap Scan on table3  (cost=5.15..350.95 rows=93 width=4) (actual time=0.005..0.008 rows=3 loops=100)                                                                                                            
Recheck Cond: (id_pf_item = table1.id_pf_item)                                                                                                                                                                    
Filter: (deleted_at IS NULL)                                                                                                                                                                                       
Heap Blocks: exact=107                                                                                                                                                                                             
->  Bitmap Index Scan on idx_id_pf_item  (cost=0.00..5.12 rows=93 width=0) (actual time=0.003..0.003 rows=3 loops=100)                                                                                             
Index Cond: (id_pf_item = table1.id_pf_item)                                                                                                                                                                
->  Index Scan using index_table4_id_item on table4  (cost=0.42..8.14 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=303)                                                                        
Index Cond: (table3.id_item = id_item)                                                                                                                                                                               
Filter: (deleted_at IS NULL)                                                                                                                                                                                       
Rows Removed by Filter: 0                                                                                                                                                                                          
Planning time: 1.023 ms                                                                                                                                                                                                                        
Execution time: 85669.512 ms    

我改变

and lower(table1.brand) = any('{"brand1","brand2"}')
查询

中的

and table1.brand = any('{"Brand1","Brand2"}')

和计划更改为

QUERY PLAN                                                                                                                                                                                                                                    
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=173137.44..188661.06 rows=14 width=952) (actual time=1444.123..1445.653 rows=100 loops=1)                                                                                                                                        
CTE ctedata1                                                                                                                                                                                                                                 
->  GroupAggregate  (cost=0.42..80478.71 rows=43468 width=530) (actual time=0.040..769.982 rows=73121 loops=1)                                                                                                                            
Group Key: table2.im_name, table2.id_region, table2.brand                                                                                                                                                   
->  Index Scan using udx_table2_item_im_name_id_region_brand_target_date_key on table2  (cost=0.42..59699.18 rows=391708 width=146) (actual time=0.021..350.774 rows=391779 loops=1)                                     
Filter: ((deleted_at IS NULL) AND (id_region = ANY ('{1}'::integer[])))                                                                                                                                                       
Rows Removed by Filter: 20415                                                                                                                                                                                                 
CTE ctedata2                                                                                                                                                                                                                            
->  WindowAgg  (cost=16104.06..17842.78 rows=43468 width=628) (actual time=1088.905..1153.749 rows=73121 loops=1)                                                                                                                         
->  WindowAgg  (cost=16104.06..17082.09 rows=43468 width=620) (actual time=1020.017..1089.117 rows=73121 loops=1)                                                                                                                   
->  Sort  (cost=16104.06..16212.73 rows=43468 width=612) (actual time=1020.011..1037.170 rows=73121 loops=1)                                                                                                                  
Sort Key: ctedata1.kpiregion, ctedata1.kpibrand, (COALESCE(ctedata1.total_sales_amount, '0'::numeric)) DESC                                                                                                                
Sort Method: external merge  Disk: 6536kB                                                                                                                                                                               
->  CTE Scan on ctedata1  (cost=0.00..869.36 rows=43468 width=612) (actual time=0.044..891.653 rows=73121 loops=1)                                                                                                       
->  Result  (cost=74815.94..90339.56 rows=14 width=952) (actual time=1444.121..1445.635 rows=100 loops=1)                                                                                                                                   
->  Sort  (cost=74815.94..74815.98 rows=14 width=944) (actual time=1444.053..1444.065 rows=100 loops=1)                                                                                                                               
Sort Key: (COALESCE(ctedata2.total_sales_amount, '0'::numeric)) DESC                                                                                                                                                        
Sort Method: top-N heapsort  Memory: 76kB                                                                                                                                                                                       
->  WindowAgg  (cost=72207.31..74815.68 rows=14 width=944) (actual time=1439.128..1441.885 rows=3151 loops=1)                                                                                                                   
->  Hash Right Join  (cost=72207.31..74815.40 rows=14 width=927) (actual time=1307.531..1437.246 rows=3151 loops=1)                                                                                                       
Hash Cond: ((ctedata2.im_name = (table1.im_name)::text) AND (ctedata2.kpibrand = (table1.brand)::text))                                                                                                   
->  CTE Scan on ctedata2  (cost=0.00..869.36 rows=43468 width=592) (actual time=1088.911..1209.646 rows=73121 loops=1)                                                                                          
->  Hash  (cost=72207.10..72207.10 rows=14 width=399) (actual time=216.850..216.850 rows=3151 loops=1)                                                                                                              
Buckets: 4096 (originally 1024)  Batches: 1 (originally 1)  Memory Usage: 1249kB                                                                                                                              
->  Bitmap Heap Scan on table1  (cost=10960.95..72207.10 rows=14 width=399) (actual time=46.434..214.246 rows=3151 loops=1)                                                                                  
Recheck Cond: (id_region = ANY ('{1}'::integer[]))                                                                                                                                                      
Filter: ((deleted_at IS NULL) AND (department = 'Department1'::text) AND ((brand)::text = ANY ('{Brand1, Brand2}'::text[])) AND ('season1'::text = ANY ((lower((seasons)::text))::text[])))
Rows Removed by Filter: 106335                                                                                                                                                                          
Heap Blocks: exact=42899                                                                                                                                                                                
->  Bitmap Index Scan on table1_im_name_id_region_key  (cost=0.00..10960.94 rows=110619 width=0) (actual time=34.849..34.849 rows=109486 loops=1)                                                      
Index Cond: (id_region = ANY ('{1}'::integer[]))                                                                                                                                                  
SubPlan 3                                                                                                                                                                                                                             
->  Aggregate  (cost=1108.80..1108.81 rows=1 width=8) (actual time=0.015..0.015 rows=1 loops=100)                                                                                                                                   
->  Nested Loop Left Join  (cost=5.57..1108.57 rows=93 width=4) (actual time=0.006..0.014 rows=3 loops=100)                                                                                                                   
->  Bitmap Heap Scan on table3  (cost=5.15..350.95 rows=93 width=4) (actual time=0.004..0.006 rows=3 loops=100)                                                                                                           
Recheck Cond: (id_pf_item = table1.id_pf_item)                                                                                                                                                                   
Filter: (deleted_at IS NULL)                                                                                                                                                                                      
Heap Blocks: exact=107                                                                                                                                                                                            
->  Bitmap Index Scan on idx_id_pf_item  (cost=0.00..5.12 rows=93 width=0) (actual time=0.003..0.003 rows=3 loops=100)                                                                                            
Index Cond: (id_pf_item = table1.id_pf_item)                                                                                                                                                               
->  Index Scan using index_table4_id_item on table4  (cost=0.42..8.14 rows=1 width=8) (actual time=0.002..0.002 rows=0 loops=303)                                                                       
Index Cond: (table3.id_item = id_item)                                                                                                                                                                              
Filter: (deleted_at IS NULL)                                                                                                                                                                                      
Rows Removed by Filter: 0                                                                                                                                                                                         
Planning time: 0.760 ms                                                                                                                                                                                                                       
Execution time: 1448.848 ms                                                                                                                                                                                                                                                                                                                                                                                                                                  

我观察

表1左连接ctedata2的连接策略在避免lower()函数后发生变化。策略从嵌套循环左连接改为哈希右连接。

在性能较好的查询中,ctedata2上的CTE Scan节点只执行一次。

<<p>

Postgres版本/strong>9.6请帮我理解这种行为。如果需要,我会提供更多的信息。

深入研究一个几乎过时的版本的内部工作原理几乎是不值得的。这些时间和精力最好花在升级上。

但是问题很明显。您对表1的扫描估计非常糟糕,尽管在更好的计划中糟糕程度降低了14倍。

->  Bitmap Heap Scan on table1  (cost=10960.95..72483.64 rows=1 width=399) (actual time=45.466..278.376 rows=3151 loops=1)
->  Bitmap Heap Scan on table1  (cost=10960.95..72207.10 rows=14 width=399) (actual time=46.434..214.246 rows=3151 loops=1)

你对lower()的使用,显然没有理由,肯定会导致糟糕的估计。动态地将字符串转换为数组当然也没有帮助。如果它首先被存储为一个真正的数组,那么统计系统就可以得到它并生成更合理的估计。

最新更新