在postgresql中添加了另一个过滤器后,我有一个函数完全失去了它的性能
下面是一个简单的例子,展示了它最初的良好性能。
CREATE OR REPLACE FUNCTION my_function(param_a boolean, param_b boolean )
RETURNS TABLE(blablabla)
LANGUAGE sql
IMMUTABLE
AS $function$
with data as (
select id,amount,account_nr from transfer
)
select * from
data d
where param_a or 0.00 <> (select sum(d2.amount)
from data d2
where d2.id = d.id)
$function$
;
(cost=0.25..10.25 rows=1000 width=560)
(actual time=1162.528..1162.561 rows=306 loops=1)
Buffers: shared hit=1099180
Planning time: 2.928 ms
Execution time: 1162.630 ms
在我添加了另一个带有子选择和计数的过滤器后,我失去了性能。这个计数对性能有影响吗?我能用另一种方法解决它吗?
CREATE OR REPLACE FUNCTION my_function(param_a boolean, param_b boolean )
RETURNS TABLE(blablabla)
LANGUAGE sql
IMMUTABLE
AS $function$
with data as (
select id,amount,account_nr from transfer
)
select * from
data d
where (param_b or 1 < (select count(d2.account_nr)
from data d2
where d2.id = d.id
group by d2.account_nr))
and (param_a or 0.00 <> (select sum(d2.amount)
from data d2
where d2.id = d.id))
$function$
;
(cost=0.25..10.25 rows=1000 width=560)
(actualtime=271191.341..271191.383 rows=306 loops=1)
Buffers: shared hit=1099180
Planning time: 2.955 ms
Execution time: 271191.463 ms
嵌入存储函数的慢速查询是:
with data as ( -- original query from the question.
select id,amount,account_nr from transfer
)
select *
from data d
where (param_b or 1 < (select count(d2.account_nr)
from data d2
where d2.id = d.id
group by d2.account_nr)
)
and (param_a or 0.00 <> (select sum(d2.amount)
from data d2
where d2.id = d.id)
)
这有一个毫无意义的公共表表达式。为了简单起见,我们可以去掉它。如果你出于其他目的需要它,你可以随时把它放回去。
并且有几个相关的子查询。让我们将它们重构为一个独立的子查询。从独立的子查询开始:
select id,
count(account_nr) account_count,
sum(amount) total_amount
from transfer
group by id
此聚合子查询生成transfer
表中每个id
的帐户数和总金额。眼球的结果说服自己它做了你需要它做的事。
然后,我们可以将其连接到主查询中,并应用您的WHERE条件。
select d.id, d.amount, d.account_nr
from transfer d
join (
select id,
count(account_nr) account_count,
sum(amount) total_amount
from transfer
group by id
) d2 ON d.id = d2.id
where (param_b or 1 < d2.account_count)
and (param_a or 0.00 <> d2.total_amount)
使用独立的子查询可以大大加快速度;有时,查询规划器决定需要多次重新评估依赖子查询。
以下索引将帮助子查询更快地运行。
CREATE INDEX id_details ON transfer (id) INCLUDE (account_nr, amount);
说服自己这是有效的,而且足够快。(我没有调试它,因为我没有你的数据。(你需要用true
和false
代替param_a
和param_b
来测试它。
然后,也只有到那时,将它放入存储的函数中。