Postgres查询一个指向外部数据工作者的分区表比直接查询fdw慢很多倍



提前感谢任何可能帮助我的人。

我有一个Postgres 15.2数据库,它使用分区表通过Zone id访问外部数据工作者表。每个区域在每个区域数据库中进一步按时间段分解数据。

我正在开发的数据库接近10亿条记录,但会越来越大。如果我直接从lead数据库查询fdw表,则速度非常快,大约为12ms。如果我查询指向相同确切位置的主数据库分区表,则总是需要大约一分钟。如果我在where子句中指定分区,这是分区的检查约束,它仍然很慢。

我花了几天时间在这里和其他地方寻找解决方案,但到目前为止一直没有结果。我还尝试简化表的模式,看看它是否与外键或约束有关,但似乎也不是这样。

在新创建的数据库或我一直在使用的长时间运行的测试系统上似乎也会发生相同的情况。

分析分区表似乎没有任何效果。

下面是我用于最小测试设置的linux bash脚本,它可以在一台机器上以更小的规模重现该行为。它创建一个领导数据库和任意数量的区域数据库,并连接所有的外部数据工作者。先导DB位于127.0.0.100,每个区域DB位于127.0.0.100 +区域号。我必须做到这一点,从根级目录,postgres用户有权访问。注意:它启动所有数据库,我不建议运行它,除非在一个虚拟机与固态驱动器和在一个空文件夹。我在下面展示了结果。

cat BUILD_LEAD_AND_ZONES.sh 
#!/usr/bin/env bash
#ARG 1 DBNAME
#ARG 2 a directory that postgres user has permissions to so DBs can be started there
#ARG 3 NUMBER OF ZONES TO CREATE  ALL DBs will be created in the ARG 2 directory with a 127.0.0.X addres where Leader is at 100 and each zone is at 100 + zone number. Can move them somewhere else afterwards
set -e #exit on any non-zero return-code
set -x #echo commands
set -euo pipefail
IFS=$'nt'

DBNAME=$1
ROOT_TRGT_DIR=$2
NUMBER_OF_ZONES=$3
BASE_IP4="127.0.0." #with a trailing dot x.x.x.
SCHEMA_NAME="wha"
DBFOLDER="DB_SCHEMA"
FDWPASSWORD="Wh00p33"
createDbAndBaseTables () {
echo "Creating Database $1 $2"
psql -h $1 -U postgres -c "CREATE DATABASE "$2";"
psql -h $1 -U postgres -d $2 -c "CREATE SCHEMA $SCHEMA_NAME;"
if [[ $3 = "LEAD" ]]
then
echo "LEAD Testing.sql"
PGOPTIONS="--search_path=$SCHEMA_NAME" psql -h $1 -U postgres -d $2 <<SCRIPT
CREATE TABLE "Testing" (
"TestingId" BIGSERIAL
,"Period" INT
,"Zone" INT
,"UTCStart" timestamp NOT NULL
,"UTCEnd" timestamp NOT NULL
,"Stuff" TEXT
) PARTITION BY LIST ("Zone");
SCRIPT
else
echo "ZONE Testing.sql"
PGOPTIONS="--search_path=$SCHEMA_NAME" psql -h $1 -U postgres -d $2 <<SCRIPT
CREATE TABLE "Testing" (
"TestingId" BIGSERIAL
,"Period" INT
,"Zone" INT
,"UTCStart" timestamp NOT NULL
,"UTCEnd" timestamp NOT NULL
,"Stuff" text
,PRIMARY KEY ("TestingId","Period","Zone")
);
SCRIPT
fi
}

#Create each Lead and Zone and customize their pg_hba.conf and postgresql.conf to run and allow access from the correct addresses
for ((i=0; i <= $NUMBER_OF_ZONES; i++ )) 
do
((ip=$i+100))
if [[ $i = 0 ]]; then
echo "Creating LEADER DB $BASE_IP4$ip"
DIR="$ROOT_TRGT_DIR/PSQL_LEAD"
else
echo "Creating ZONE DB $BASE_IP4$ip"
DIR="$ROOT_TRGT_DIR/PSQL_$i"
fi
mkdir -p "$DIR"
sudo chown postgres:postgres "$DIR"
sudo -u postgres initdb -D "$DIR"
sudo -u postgres mkdir -p "$DIR/sockets"
if [[ $i = 0 ]]; then
#BUILD UP pg_hba.conf access list
ACCESS_LIST="host    all             fdwuser         $BASE_IP4$ip/32        password"
ACCESS_LIST="$ACCESS_LISTnhost    all             postgres        $BASE_IP4$ip/32        trust"
ACCESS_LIST="$ACCESS_LISTnhost    all             postgres        ${BASE_IP4}1/32        trust"
ACCESS_LIST="$ACCESS_LISTnhost    all             postgres        127.0.0.1/32        trust"
ACCESS_LIST="$ACCESS_LISTnhost    all             fdwuser         127.0.0.1/32        password"
ACCESS_LIST="$ACCESS_LISTnhost    all             fdwuser        ${BASE_IP4}1/32      password"
for ((k=1; k<= $NUMBER_OF_ZONES; k++ ))
do
((zip=$k+100))
ACCESS_LIST="$ACCESS_LISTnhost    all             fdwuser         $BASE_IP4$zip/32        password"
done
sed -i "s/host    all             all             127.0.0.1/32            trust/$ACCESS_LISTn/" "$DIR/pg_hba.conf"
echo "sudo -u postgres pg_ctl -D "$DIR" -l "$DIR/psql.log" start" > "start_PSQL_LEAD"
chmod +x "start_PSQL_LEAD"
else
ACCESS_LIST="host    all            fdwuser          $BASE_IP4$ip/32        password"
ACCESS_LIST="$ACCESS_LISTnhost    all            postgres         $BASE_IP4$ip/32        trust"
ACCESS_LIST="$ACCESS_LISTnhost    all            postgres         127.0.0.1/32        trust"
ACCESS_LIST="$ACCESS_LISTnhost    all            fdwuser          ${BASE_IP4}1/32        password"
ACCESS_LIST="$ACCESS_LISTnhost    all            fdwuser          127.0.0.1/32        password"
ACCESS_LIST="$ACCESS_LISTnhost    all            postgres         ${BASE_IP4}1/32        trust"
sed -i "s/host    all             all             127.0.0.1/32            trust/$ACCESS_LISTn/" "$DIR/pg_hba.conf"
echo "sudo -u postgres pg_ctl -D "$DIR" -l "$DIR/psql.log" start" > "start_PSQL_$i"
chmod +x "start_PSQL_$i"
fi
sed -i "s/#listen_addresses = 'localhost'/listen_addresses = '$BASE_IP4$ip'/" "$DIR/postgresql.conf"
sed -i "s/#wal_buffers = -1/wal_buffers = 16MB/" "$DIR/postgresql.conf"
#wacky ${DIR....} is escaping the front slashes so sed doesn't blow up
sed -i "s/#unix_socket_directories = '/run/postgresql'/unix_socket_directories = '${DIR////\/}/sockets'/" "$DIR/postgresql.conf"
done
#start servers and build db instances
for ((i=0; i <= $NUMBER_OF_ZONES; i++))
do
((ip=$i+100))
if [[ $i = 0 ]]
then
./start_PSQL_LEAD
createDbAndBaseTables "$BASE_IP4$ip" "$DBNAME" "LEAD"
else
./start_PSQL_$i
createDbAndBaseTables "$BASE_IP4$ip" "$DBNAME" "ZONE"
fi
done
#set up user for lead to access each zone with
for ((i=0; i <= $NUMBER_OF_ZONES; i++))
do
((ip=$i+100))
psql -h ${BASE_IP4}$ip -U postgres -d $DBNAME <<SCRIPT
CREATE EXTENSION postgres_fdw;
CREATE USER fdwuser WITH ENCRYPTED PASSWORD '$FDWPASSWORD';
GRANT ALL PRIVILEGES ON SCHEMA $SCHEMA_NAME to fdwuser;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA $SCHEMA_NAME to fdwuser;
ALTER TABLE "Testing" ALTER COLUMN "Zone" SET DEFAULT $k;
SCRIPT
done
for ((i=0; i <= $NUMBER_OF_ZONES; i++))
do
((ip=$i+100))
if [[ $i = 0 ]] #LEAD LEAD LEAD LEAD LEAD
then
#       PGOPTIONS="--search_path=$SCHEMA_NAME" psql -U postgres -d $DBNAME -h $BASE_IP4$ip -f "$DBFOLDER/TBL/" ... ...Create lead schema here
#       PGOPTIONS="--search_path=$SCHEMA_NAME" psql -U postgres -d $DBNAME -h $BASE_IP4$ip -f "$DBFOLDER/PROC/" ... ...Create lead schema here
#       PGOPTIONS="--search_path=$SCHEMA_NAME" psql -U postgres -d $DBNAME -h $BASE_IP4$ip -f "$DBFOLDER/VIEW/" ... ...Create lead schema here

for ((k=1; k<= $NUMBER_OF_ZONES; k++ ))
do
((zip=$k+100))
psql -h ${BASE_IP4}100 -U postgres -d $DBNAME <<SCRIPT
CREATE SERVER "Zone$k" FOREIGN DATA WRAPPER postgres_fdw OPTIONS (dbname '$DBNAME', host '$BASE_IP4$zip', port '5432',use_remote_estimate 'true', fetch_size '50000');
GRANT USAGE ON FOREIGN SERVER "Zone$k" TO fdwuser;
CREATE USER MAPPING for fdwuser SERVER "Zone$k" OPTIONS (user 'fdwuser', password '$FDWPASSWORD');
CREATE USER MAPPING for postgres SERVER "Zone$k" OPTIONS (user 'fdwuser', password '$FDWPASSWORD');
CREATE SCHEMA "Zone$k";
GRANT ALL PRIVILEGES ON SCHEMA "Zone$k" to fdwuser;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA "Zone$k" to fdwuser;
GRANT ALL PRIVILEGES ON ALL SEQUENCES IN SCHEMA $SCHEMA_NAME TO fdwuser;
SCRIPT
PGOPTIONS="--search_path=$SCHEMA_NAME" PGPASSWORD=$FDWPASSWORD psql -h "${BASE_IP4}100" -U postgres -d $DBNAME -c "IMPORT FOREIGN SCHEMA $SCHEMA_NAME FROM SERVER "Zone$k" INTO "Zone$k";"
PGOPTIONS="--search_path=$SCHEMA_NAME" PGPASSWORD=$FDWPASSWORD psql -h "${BASE_IP4}100" -U postgres -d $DBNAME -c "ALTER TABLE "Testing" ATTACH PARTITION "Zone$k"."Testing" FOR VALUES IN ($k);"
done
else #ZONE ZONE ZONE ZONE ZONE
#create a mockup of how zones are setup
for ((k=1; k<= 12; k++ ))
do
printf -v MM "%02d" $k
PGOPTIONS="--search_path=$SCHEMA_NAME" psql -h $BASE_IP4$ip -U postgres -d $DBNAME <<SCRIPT
CREATE TABLE "Testing2022$k" (CHECK ("Period"='2022$k')) INHERITS("Testing");
ALTER TABLE "Testing2022$k" ALTER COLUMN "Zone" SET DEFAULT $i;
INSERT INTO "Testing2022$k"("Period","UTCStart","UTCEnd","Stuff")
SELECT '2022$k','2022-${MM}-01 00:00:00'::timestamp + (seq || ' Minute')::INTERVAL, '2022-${MM}-01 00:00:00'::timestamp + (seq || ' Minute')::INTERVAL + ('50 Second')::INTERVAL, seq::text from generate_series(0,100000) seq;
CREATE INDEX idx_t2022${k}_utce_utcs ON "Testing2022${k}"("UTCEnd","UTCStart");
CREATE INDEX idx_t2022${k}_utcs_utce ON "Testing2022${k}"("UTCStart","UTCEnd");
CREATE INDEX idx_t2022${k}_a ON "Testing2022${k}"("Stuff");
SCRIPT

done
#
fi
done

这是铅数据库的结果大小,有8个区域,每个区域有12个周期PGOPTIONS="--search-path=wha" psql -U postgres -d TestDB -h 127.0.0.100

select count(*) from "Testing";
count  
---------
9600096
(1 row)

直接查询非常快,约14ms

select * from "Zone6"."Testing" where "Zone" = 6 AND "UTCStart" > '2022-03-01' ORDER BY "UTCStart" DESC limit 100;
...
1199913 | 202212 |    6 | 2023-02-08 09:01:00 | 2023-02-08 09:01:50 | 99901
(100 rows)
Time: 13.564 ms

查询分区要慢得多,大约87ms(但仍然很快,因为数据集很小)

select * from wha."Testing" where "Zone" = 6 AND "UTCStart" > '2022-03-01' ORDER BY "UTCStart" DESC limit 100;
...
1199913 | 202212 |    6 | 2023-02-08 09:01:00 | 2023-02-08 09:01:50 | 99901
(100 rows)
Time: 87.156 ms

解释如下:

From lead partition table

explain (ANALYZE, TIMING, COSTS, BUFFERS, VERBOSE) select * from wha."Testing" where "Zone" = 6 AND "UTCStart" > '2022-03-01' ORDER BY "UTCStart" DESC limit 100;
                             QUERY PLAN                                                                                                    
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=105.26..115.26 rows=100 width=64) (actual time=75.312..75.321 rows=100 loops=1)
Output: "Testing"."TestingId", "Testing"."Period", "Testing"."Zone", "Testing"."UTCStart", "Testing"."UTCEnd", "Testing"."Stuff"
->  Foreign Scan on "Zone6"."Testing"  (cost=105.26..107582.12 rows=1074326 width=37) (actual time=75.309..75.314 rows=100 loops=1)
Output: "Testing"."TestingId", "Testing"."Period", "Testing"."Zone", "Testing"."UTCStart", "Testing"."UTCEnd", "Testing"."Stuff"
Remote SQL: SELECT "TestingId", "Period", "Zone", "UTCStart", "UTCEnd", "Stuff" FROM wha."Testing" WHERE (("UTCStart" > '2022-03-01 00:00:00')) AND (("Zone" = 6)) ORDER BY "UTCStart" DESC NULLS FIRST
Planning Time: 5.638 ms
Execution Time: 76.320 ms
(7 rows)
Time: 82.503 ms

和直接

explain (ANALYZE, TIMING, COSTS, BUFFERS, VERBOSE) select * from "Zone6"."Testing" where "Zone" = 6 AND "UTCStart" > '2022-03-01' ORDER BY "UTCStart" DESC limit 100;
                                   QUERY PLAN                                                                                                          
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Foreign Scan on "Zone6"."Testing"  (cost=105.26..115.26 rows=100 width=64) (actual time=3.935..3.952 rows=100 loops=1)
Output: "TestingId", "Period", "Zone", "UTCStart", "UTCEnd", "Stuff"
Remote SQL: SELECT "TestingId", "Period", "Zone", "UTCStart", "UTCEnd", "Stuff" FROM wha."Testing" WHERE (("UTCStart" > '2022-03-01 00:00:00')) AND (("Zone" = 6)) ORDER BY "UTCStart" DESC NULLS FIRST LIMIT 100::bigint
Planning Time: 8.193 ms
Execution Time: 4.277 ms
(5 rows)
Time: 13.373 ms

我不是很擅长解析解释输出,但为什么在"Zone6"这两个解释有什么不同吗?

我知道我可以编写一个函数来考虑分区并避免分区表,但逻辑将变得足够复杂,我试图避免它。关于为什么分区选择如此慢,我是否遗漏了什么?还有什么方法可以加快查询分区的速度吗?

谢谢!

这两个查询的区别在于,在第二个查询中,LIMIT 100被下推到远程数据库。这使得远程数据库选择一个能够快速交付前100个结果的执行计划。否则,PostgreSQL会优化最快的计划来返回完整的结果集。执行EXPLAIN命令对远程数据库进行远程查询,看是否能发现差异

如果你查询分区表,PostgreSQL似乎不推下LIMIT。我没有仔细阅读代码来找出原因,但原则上应该是可能的。也许它只是没有实现。

最新更新