Inconsistent cold-start query performance in Postgres



We're having trouble with a query we're building against a Postgres-hosted data mart. The query itself is fairly simple and only touches a modest amount of data, yet its execution time varies wildly between runs (sometimes around 20 seconds, sometimes only 3 seconds), and we can't work out what causes the difference; our goal is consistent performance. Only 2 tables are involved: one holds the order lines (OrderItemTransactionFact, 2,937,264 rows) and the other records the current stock level of each item (stocklevels, 62,353 rows). There are no foreign keys, because this is a dataset we load via ETL processes and it needs to load quickly.
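Roughly, the relevant columns look like this. This is a simplified sketch covering only the columns the query touches, and the exact types are assumptions, apart from "ReceivedDate", which the plans below compare as double precision (Unix epoch seconds):

create table "OrderItemTransactionFact" (
    "SKUId"              integer,           -- item identifier, joined to stocklevels
    "ConvertedLineTotal" numeric,           -- revenue for the order line
    "Quantity"           integer,
    "transactionTypeId"  integer,
    "hasComposite"       boolean,
    "ReceivedDate"       double precision   -- Unix epoch seconds
);

create table stocklevels (
    "SKUId"     integer,
    "Available" integer
);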

The query is:

select 
oitf."SKUId",
sum(oitf."ConvertedLineTotal") as "totalrevenue",
sum(oitf."Quantity") as "quantitysold",
coalesce (sl."Available",0) as "availablestock"
from "OrderItemTransactionFact" oitf
left join stocklevels sl on sl."SKUId" = oitf."SKUId"
where 
oitf."transactionTypeId" = 2 
and oitf."hasComposite" = false
and oitf."ReceivedDate" >= extract(epoch from timestamp '2020-07-01 00:00:00') 
and oitf."ReceivedDate" <= extract(epoch from timestamp '2021-10-01 00:00:00')
group by 
oitf."SKUId", sl."Available"
order by oitf."SKUId";

The OrderItemTransactionFact table has a pair of indexes:

create index IX_OrderItemTransactionFact_ReceivedDate on public."OrderItemTransactionFact" ("ReceivedDate" DESC);
create index IX_OrderItemTransactionFact_ReceivedDate_transactionTypeId on public."OrderItemTransactionFact" ("ReceivedDate" desc, "transactionTypeId");

The execution plan output from the 26-second run is:

GroupAggregate  (cost=175096.24..195424.66 rows=813137 width=52) (actual time=24100.268..24874.065 rows=26591 loops=1)
  Group Key: oitf."SKUId", sl."Available"
  Buffers: shared hit=659 read=43311 written=1042
  ->  Sort  (cost=175096.24..177129.08 rows=813137 width=19) (actual time=24100.249..24275.594 rows=916772 loops=1)
        Sort Key: oitf."SKUId", sl."Available"
        Sort Method: quicksort  Memory: 95471kB
        Buffers: shared hit=659 read=43311 written=1042
        ->  Hash Left Join  (cost=20671.85..95274.08 rows=813137 width=19) (actual time=239.392..23127.993 rows=916772 loops=1)
              Hash Cond: (oitf."SKUId" = sl."SKUId")
              Buffers: shared hit=659 read=43311 written=1042
              ->  Bitmap Heap Scan on "OrderItemTransactionFact" oitf  (cost=18091.90..73485.91 rows=738457 width=15) (actual time=200.178..22413.601 rows=701397 loops=1)
                    Recheck Cond: (("ReceivedDate" >= '1585699200'::double precision) AND ("ReceivedDate" <= '1625097600'::double precision))
                    Filter: ((NOT "hasComposite") AND ("transactionTypeId" = 2))
                    Rows Removed by Filter: 166349
                    Heap Blocks: exact=40419
                    Buffers: shared hit=55 read=42738 written=1023
                    ->  Bitmap Index Scan on ix_orderitemtransactionfact_receiveddate  (cost=0.00..17907.29 rows=853486 width=0) (actual time=191.274..191.274 rows=867746 loops=1)
                          Index Cond: (("ReceivedDate" >= '1585699200'::double precision) AND ("ReceivedDate" <= '1625097600'::double precision))
                          Buffers: shared hit=9 read=2365 written=181
              ->  Hash  (cost=1800.53..1800.53 rows=62353 width=8) (actual time=38.978..38.978 rows=62353 loops=1)
                    Buckets: 65536  Batches: 1  Memory Usage: 2948kB
                    Buffers: shared hit=604 read=573 written=19
                    ->  Seq Scan on stocklevels sl  (cost=0.00..1800.53 rows=62353 width=8) (actual time=0.031..24.301 rows=62353 loops=1)
                          Buffers: shared hit=604 read=573 written=19
Planning Time: 0.543 ms
Execution Time: 24889.522 ms

But when the same query takes only around 3 seconds, the execution plan is:

GroupAggregate  (cost=173586.52..193692.59 rows=804243 width=52) (actual time=2616.588..3220.394 rows=26848 loops=1)
  Group Key: oitf."SKUId", sl."Available"
  Buffers: shared hit=2 read=43929
  ->  Sort  (cost=173586.52..175597.13 rows=804243 width=19) (actual time=2616.570..2813.571 rows=889937 loops=1)
        Sort Key: oitf."SKUId", sl."Available"
        Sort Method: quicksort  Memory: 93001kB
        Buffers: shared hit=2 read=43929
        ->  Hash Left Join  (cost=20472.48..94701.25 rows=804243 width=19) (actual time=185.018..1512.626 rows=889937 loops=1)
              Hash Cond: (oitf."SKUId" = sl."SKUId")
              Buffers: shared hit=2 read=43929
              ->  Bitmap Heap Scan on "OrderItemTransactionFact" oitf  (cost=17892.54..73123.18 rows=730380 width=15) (actual time=144.000..960.232 rows=689090 loops=1)
                    Recheck Cond: (("ReceivedDate" >= '1593561600'::double precision) AND ("ReceivedDate" <= '1633046400'::double precision))
                    Filter: ((NOT "hasComposite") AND ("transactionTypeId" = 2))
                    Rows Removed by Filter: 159949
                    Heap Blocks: exact=40431
                    Buffers: shared read=42754
                    ->  Bitmap Index Scan on ix_orderitemtransactionfact_receiveddate  (cost=0.00..17709.94 rows=844151 width=0) (actual time=134.806..134.806 rows=849039 loops=1)
                          Index Cond: (("ReceivedDate" >= '1593561600'::double precision) AND ("ReceivedDate" <= '1633046400'::double precision))
                          Buffers: shared read=2323
              ->  Hash  (cost=1800.53..1800.53 rows=62353 width=8) (actual time=40.500..40.500 rows=62353 loops=1)
                    Buckets: 65536  Batches: 1  Memory Usage: 2948kB
                    Buffers: shared hit=2 read=1175
                    ->  Seq Scan on stocklevels sl  (cost=0.00..1800.53 rows=62353 width=8) (actual time=0.025..24.620 rows=62353 loops=1)
                          Buffers: shared hit=2 read=1175
Planning Time: 0.565 ms
Execution Time: 3235.300 ms

Server configuration:
Version: PostgreSQL 12.1, compiled by Visual C++ build 1914, 64-bit
work_mem: 1048576 kB
shared_buffers: 16384 (x 8 kB)
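These values can be read back from any session with plain SQL, for example:

-- Check the settings quoted above; pg_settings also reports the unit each value is expressed in.
SHOW work_mem;
SHOW shared_buffers;
select name, setting, unit from pg_settings where name in ('work_mem', 'shared_buffers');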

Thanks in advance!

It's the filesystem cache. The slow run has to actually read the data from disk. The fast run just pulls the data from memory, probably because the slow run already read it and left it there. You can make this show up explicitly in the plans by turning on track_io_timing.
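For example, a minimal sketch (note that changing track_io_timing may require superuser privileges):

-- Make per-node read/write times show up as "I/O Timings" lines in the plan.
SET track_io_timing = on;

EXPLAIN (ANALYZE, BUFFERS)
select
    oitf."SKUId",
    sum(oitf."ConvertedLineTotal") as "totalrevenue",
    sum(oitf."Quantity") as "quantitysold",
    coalesce(sl."Available", 0) as "availablestock"
from "OrderItemTransactionFact" oitf
left join stocklevels sl on sl."SKUId" = oitf."SKUId"
where oitf."transactionTypeId" = 2
  and oitf."hasComposite" = false
  and oitf."ReceivedDate" >= extract(epoch from timestamp '2020-07-01 00:00:00')
  and oitf."ReceivedDate" <= extract(epoch from timestamp '2021-10-01 00:00:00')
group by oitf."SKUId", sl."Available"
order by oitf."SKUId";

A cold run will then show substantial read time on the Bitmap Heap Scan node, while a warm run served from the cache will show little or none.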

An index on ("transactionTypeId", "hasComposite", "ReceivedDate") should help a little here, and might help a lot with efficiency (depending on your storage system).
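Something along these lines, mirroring your existing index definitions (the index name here is just a placeholder):

create index IX_OrderItemTransactionFact_type_composite_ReceivedDate on public."OrderItemTransactionFact" ("transactionTypeId", "hasComposite", "ReceivedDate" desc);

With the two equality columns leading and the range column last, the scan no longer has to discard the roughly 160,000 rows that your current plans remove with a filter.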

But mostly: get faster disks.
