PostgreSQL fdw表的性能在5次完全相同的查询后呈指数级下降



我有一个postgres 13.3数据库运行在CentOS机器上。在执行这些测试时,此机器上没有运行任何其他内容,并且在执行测试时没有其他内容访问数据库。

jammerdal表包含大约50万行。我对其他包含大约50,000行的表重复了这个实验。结果是一样的,但是速度变慢似乎与所使用的fdw表中的行数以及生成的假id的数量有关。

运行此:

CREATE OR REPLACE FUNCTION jegfatterintet() RETURNS TABLE (c BIGINT)
AS $$
DECLARE
jammerdal_ids VARCHAR[];
area_row RECORD;
start TIMESTAMP;
BEGIN
SELECT INTO jammerdal_ids ARRAY_AGG('id-'||x::VARCHAR) FROM generate_series(0,25) x;
FOR area_row IN SELECT * FROM generate_series(1,10)
LOOP
SELECT INTO start clock_timestamp();
--RAISE NOTICE '%: Start %.', clock_timestamp(), area_row;
RETURN QUERY
SELECT COUNT(*) FROM jammerdal WHERE id = ANY(jammerdal_ids);
--RAISE NOTICE '%: End %.', clock_timestamp(), area_row;
RAISE NOTICE '%: Duration is %.', area_row, clock_timestamp()-start;
END LOOP;
RAISE NOTICE '%: All done.', clock_timestamp();
END
$$ LANGUAGE plpgsql;    
SELECT * FROM jegfatterintet();

产生如下输出:

CREATE FUNCTION
NOTICE:  (1): Duration is 00:00:00.019555.
NOTICE:  (2): Duration is 00:00:00.001271.
NOTICE:  (3): Duration is 00:00:00.001089.
NOTICE:  (4): Duration is 00:00:00.00118.
NOTICE:  (5): Duration is 00:00:00.001035.
NOTICE:  (6): Duration is 00:00:02.954527.
NOTICE:  (7): Duration is 00:00:02.871185.
NOTICE:  (8): Duration is 00:00:02.812426.
NOTICE:  (9): Duration is 00:00:02.777037.
NOTICE:  (10): Duration is 00:00:02.90708.
NOTICE:  2021-09-07 11:21:53.577115+00: All done.
c 
---
0
0
0
0
0
0
0
0
0
0
(10 rows)

请注意,从第6步开始,持续时间突然从0.01秒以下上升到几乎3秒。

只有当jammerdal是外部表(fdw)时才会发生这种情况,而不是本地表。它只在使用id数组时发生。

如果我将函数更改为:

CREATE OR REPLACE FUNCTION jegfatterintet() RETURNS TABLE (c BIGINT)
AS $$
DECLARE
jammerdal_ids VARCHAR[];
area_row RECORD;
start TIMESTAMP;
BEGIN
SELECT INTO jammerdal_ids ARRAY_AGG('id-'||x::VARCHAR) FROM generate_series(0,25) x;
FOR area_row IN SELECT * FROM generate_series(1,10)
LOOP
SELECT INTO start clock_timestamp();
--RAISE NOTICE '%: Start %.', clock_timestamp(), area_row;
RETURN QUERY
SELECT COUNT(*) FROM jammerdal WHERE id IN ('id-0', 'id-1', 'id-2', 'id-3', 'id-4', 'id-5', 'id-6', 'id-7', 'id-8', 'id-9', 'id-10', 'id-11', 'id-12', 'id-13', 'id-14', 'id-15', 'id-16', 'id-17', 'id-18', 'id-19', 'id-20', 'id-21', 'id-22', 'id-23', 'id-24', 'id-25'); --id = ANY(jammerdal_ids);
--RAISE NOTICE '%: End %.', clock_timestamp(), area_row;
RAISE NOTICE '%: Duration is %.', area_row, clock_timestamp()-start;
END LOOP;
RAISE NOTICE '%: All done.', clock_timestamp();
END
$$ LANGUAGE plpgsql;
SELECT * FROM jegfatterintet();

输出成为:

CREATE FUNCTION
NOTICE:  (1): Duration is 00:00:00.028254.
NOTICE:  (2): Duration is 00:00:00.001768.
NOTICE:  (3): Duration is 00:00:00.001512.
NOTICE:  (4): Duration is 00:00:00.001426.
NOTICE:  (5): Duration is 00:00:00.001523.
NOTICE:  (6): Duration is 00:00:00.001389.
NOTICE:  (7): Duration is 00:00:00.001363.
NOTICE:  (8): Duration is 00:00:00.001364.
NOTICE:  (9): Duration is 00:00:00.001466.
NOTICE:  (10): Duration is 00:00:00.001454.
NOTICE:  2021-09-07 11:25:46.635762+00: All done.
c 
---
0
0
0
0
0
0
0
0
0
0
(10 rows)
有谁能给我解释一下吗?

编辑:

explain (ANALYZE, BUFFERS) select id from jammerdal where id = any('{id-0,id-1,id-2,id-3,id-4,id-5,id-6,id-7,id-8,id-9,id-10,id-11,id-12,id-13,id-14,id-15,id-16,id-17,id-18,id-19,id-20,id-21,id-22,id-23,id-24,id-25}');
QUERY PLAN                                                      
---------------------------------------------------------------------------------------------------------------------
Foreign Scan on jammerdal  (cost=100.00..62578.95 rows=26 width=37) (actual time=0.718..0.719 rows=0 loops=1)
Planning Time: 0.153 ms
Execution Time: 1.101 ms
(3 rows)

也是一个EXPLAIN运行在"foreign"数据库:

explain (ANALYZE, BUFFERS) select id from jammerdal where id = any('{id-0,id-1,id-2,id-3,id-4,id-5,id-6,id-7,id-8,id-9,id-10,id-11,id-12,id-13,id-14,id-15,id-16,id-17,id-18,id-19,id-20,id-21,id-22,id-23,id-24,id-25}');
                  QUERY PLAN                                                                                        
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Only Scan using jammerdal_pkey on jammerdal  (cost=0.42..219.41 rows=26 width=37) (actual time=0.286..0.290 rows=0 loops=1)
Index Cond: (id = ANY ('{id-0,id-1,id-2,id-3,id-4,id-5,id-6,id-7,id-8,id-9,id-10,id-11,id-12,id-13,id-14,id-15,id-16,id-17,id-18,id-19,id-20,id-21,id-22,id-23,id-24,id-25}'::text[]))
Heap Fetches: 0
Buffers: shared hit=81
Planning:
Buffers: shared hit=205
Planning Time: 2.111 ms
Execution Time: 0.423 ms
(8 rows)

ANALYZE jammerdal;

无效

编辑2:问题显然是fdw表没有在id…

上使用索引。
ALTER SERVER testdb  OPTIONS (ADD use_remote_estimate 'true');

成功了!

由于查询是函数,PostgreSQL缓存执行计划。这是根据一个特殊的启发式来工作的:

  • 对于前五次执行,PostgreSQL生成一个"自定义计划",使用实际参数值(jammerdal_ids)

  • 在第六次执行时,PostgreSQL检查"通用计划"(忽略参数值)是否估计执行良好

  • 如果是,则从第六次执行开始使用通用计划以节省规划时间

在你的情况下,通用计划显然是坏的。

由于您没有显示EXPLAIN (ANALYZE, BUFFERS)输出,我们只能猜测原因。但一个好的猜测是,你忘记了ANALYZE的外国表,并有不好的统计数据。所以用

ANALYZE jammerdal;

,你应该会注意到改善。

(注意PostgreSQL不会自动收集外表的统计信息)

相关内容

  • 没有找到相关文章

最新更新