我有一个postgres 13.3数据库运行在CentOS机器上。在执行这些测试时,此机器上没有运行任何其他内容,并且在执行测试时没有其他内容访问数据库。
jammerdal表包含大约50万行。我对其他包含大约50,000行的表重复了这个实验。结果是一样的,但是速度变慢似乎与所使用的fdw表中的行数以及生成的假id的数量有关。
运行此:
CREATE OR REPLACE FUNCTION jegfatterintet() RETURNS TABLE (c BIGINT)
AS $$
DECLARE
jammerdal_ids VARCHAR[];
area_row RECORD;
start TIMESTAMP;
BEGIN
SELECT INTO jammerdal_ids ARRAY_AGG('id-'||x::VARCHAR) FROM generate_series(0,25) x;
FOR area_row IN SELECT * FROM generate_series(1,10)
LOOP
SELECT INTO start clock_timestamp();
--RAISE NOTICE '%: Start %.', clock_timestamp(), area_row;
RETURN QUERY
SELECT COUNT(*) FROM jammerdal WHERE id = ANY(jammerdal_ids);
--RAISE NOTICE '%: End %.', clock_timestamp(), area_row;
RAISE NOTICE '%: Duration is %.', area_row, clock_timestamp()-start;
END LOOP;
RAISE NOTICE '%: All done.', clock_timestamp();
END
$$ LANGUAGE plpgsql;
SELECT * FROM jegfatterintet();
产生如下输出:
CREATE FUNCTION
NOTICE: (1): Duration is 00:00:00.019555.
NOTICE: (2): Duration is 00:00:00.001271.
NOTICE: (3): Duration is 00:00:00.001089.
NOTICE: (4): Duration is 00:00:00.00118.
NOTICE: (5): Duration is 00:00:00.001035.
NOTICE: (6): Duration is 00:00:02.954527.
NOTICE: (7): Duration is 00:00:02.871185.
NOTICE: (8): Duration is 00:00:02.812426.
NOTICE: (9): Duration is 00:00:02.777037.
NOTICE: (10): Duration is 00:00:02.90708.
NOTICE: 2021-09-07 11:21:53.577115+00: All done.
c
---
0
0
0
0
0
0
0
0
0
0
(10 rows)
请注意,从第6步开始,持续时间突然从0.01秒以下上升到几乎3秒。
只有当jammerdal是外部表(fdw)时才会发生这种情况,而不是本地表。它只在使用id数组时发生。
如果我将函数更改为:
CREATE OR REPLACE FUNCTION jegfatterintet() RETURNS TABLE (c BIGINT)
AS $$
DECLARE
jammerdal_ids VARCHAR[];
area_row RECORD;
start TIMESTAMP;
BEGIN
SELECT INTO jammerdal_ids ARRAY_AGG('id-'||x::VARCHAR) FROM generate_series(0,25) x;
FOR area_row IN SELECT * FROM generate_series(1,10)
LOOP
SELECT INTO start clock_timestamp();
--RAISE NOTICE '%: Start %.', clock_timestamp(), area_row;
RETURN QUERY
SELECT COUNT(*) FROM jammerdal WHERE id IN ('id-0', 'id-1', 'id-2', 'id-3', 'id-4', 'id-5', 'id-6', 'id-7', 'id-8', 'id-9', 'id-10', 'id-11', 'id-12', 'id-13', 'id-14', 'id-15', 'id-16', 'id-17', 'id-18', 'id-19', 'id-20', 'id-21', 'id-22', 'id-23', 'id-24', 'id-25'); --id = ANY(jammerdal_ids);
--RAISE NOTICE '%: End %.', clock_timestamp(), area_row;
RAISE NOTICE '%: Duration is %.', area_row, clock_timestamp()-start;
END LOOP;
RAISE NOTICE '%: All done.', clock_timestamp();
END
$$ LANGUAGE plpgsql;
SELECT * FROM jegfatterintet();
输出成为:
CREATE FUNCTION
NOTICE: (1): Duration is 00:00:00.028254.
NOTICE: (2): Duration is 00:00:00.001768.
NOTICE: (3): Duration is 00:00:00.001512.
NOTICE: (4): Duration is 00:00:00.001426.
NOTICE: (5): Duration is 00:00:00.001523.
NOTICE: (6): Duration is 00:00:00.001389.
NOTICE: (7): Duration is 00:00:00.001363.
NOTICE: (8): Duration is 00:00:00.001364.
NOTICE: (9): Duration is 00:00:00.001466.
NOTICE: (10): Duration is 00:00:00.001454.
NOTICE: 2021-09-07 11:25:46.635762+00: All done.
c
---
0
0
0
0
0
0
0
0
0
0
(10 rows)
有谁能给我解释一下吗?编辑:
explain (ANALYZE, BUFFERS) select id from jammerdal where id = any('{id-0,id-1,id-2,id-3,id-4,id-5,id-6,id-7,id-8,id-9,id-10,id-11,id-12,id-13,id-14,id-15,id-16,id-17,id-18,id-19,id-20,id-21,id-22,id-23,id-24,id-25}');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Foreign Scan on jammerdal (cost=100.00..62578.95 rows=26 width=37) (actual time=0.718..0.719 rows=0 loops=1)
Planning Time: 0.153 ms
Execution Time: 1.101 ms
(3 rows)
也是一个EXPLAIN运行在"foreign"数据库:
explain (ANALYZE, BUFFERS) select id from jammerdal where id = any('{id-0,id-1,id-2,id-3,id-4,id-5,id-6,id-7,id-8,id-9,id-10,id-11,id-12,id-13,id-14,id-15,id-16,id-17,id-18,id-19,id-20,id-21,id-22,id-23,id-24,id-25}');
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Only Scan using jammerdal_pkey on jammerdal (cost=0.42..219.41 rows=26 width=37) (actual time=0.286..0.290 rows=0 loops=1)
Index Cond: (id = ANY ('{id-0,id-1,id-2,id-3,id-4,id-5,id-6,id-7,id-8,id-9,id-10,id-11,id-12,id-13,id-14,id-15,id-16,id-17,id-18,id-19,id-20,id-21,id-22,id-23,id-24,id-25}'::text[]))
Heap Fetches: 0
Buffers: shared hit=81
Planning:
Buffers: shared hit=205
Planning Time: 2.111 ms
Execution Time: 0.423 ms
(8 rows)
率
ANALYZE jammerdal;
无效
编辑2:问题显然是fdw表没有在id…
上使用索引。ALTER SERVER testdb OPTIONS (ADD use_remote_estimate 'true');
成功了!
由于查询是函数,PostgreSQL缓存执行计划。这是根据一个特殊的启发式来工作的:
-
对于前五次执行,PostgreSQL生成一个"自定义计划",使用实际参数值(
jammerdal_ids
) -
在第六次执行时,PostgreSQL检查"通用计划"(忽略参数值)是否估计执行良好
-
如果是,则从第六次执行开始使用通用计划以节省规划时间
在你的情况下,通用计划显然是坏的。
由于您没有显示EXPLAIN (ANALYZE, BUFFERS)
输出,我们只能猜测原因。但一个好的猜测是,你忘记了ANALYZE
的外国表,并有不好的统计数据。所以用
ANALYZE jammerdal;
,你应该会注意到改善。
(注意PostgreSQL不会自动收集外表的统计信息)