我可以查询 PostgreSQL 在pg_dump
和pg_restore
期间的缓冲区缓存行为吗?
众所周知,PostgreSQL有一个缓冲区缓存来缓存最近的工作集,Linux也有它的文件系统级缓存。
当我们使用pg_dump
备份数据库时,备份操作会影响 PostgreSQL 缓冲区缓存和系统文件缓存吗?
那么pg_restore
操作呢?
由于这些操作在机器上读取或写入文件,它们肯定会影响内核的文件系统缓存,可能会吹走一些以前缓存的数据。
PostgreSQL 共享缓冲区也是如此,尽管有一个优化可以避免在大型顺序扫描期间覆盖所有共享缓冲区:如果共享缓冲区上的表大于四分之一,则将使用 256 kB 的环形缓冲区,而不是消除缓存的主要部分。
请参阅以下来自src/backend/access/heap/heapam.c
和src/backend/storage/buffer/README
的引文:
/*
* If the table is large relative to NBuffers, use a bulk-read access
* strategy and enable synchronized scanning (see syncscan.c). Although
* the thresholds for these features could be different, we make them the
* same so that there are only two behaviors to tune rather than four.
* (However, some callers need to be able to disable one or both of these
* behaviors, independently of the size of the table; also there is a GUC
* variable that can disable synchronized scanning.)
*
* Note that table_block_parallelscan_initialize has a very similar test;
* if you change this, consider changing that one, too.
*/
if (!RelationUsesLocalBuffers(scan->rs_base.rs_rd) &&
scan->rs_nblocks > NBuffers / 4)
{
allow_strat = (scan->rs_base.rs_flags & SO_ALLOW_STRAT) != 0;
allow_sync = (scan->rs_base.rs_flags & SO_ALLOW_SYNC) != 0;
}
else
allow_strat = allow_sync = false;
For sequential scans, a 256KB ring is used. That's small enough to fit in L2
cache, which makes transferring pages from OS cache to shared buffer cache
efficient. Even less would often be enough, but the ring must be big enough
to accommodate all pages in the scan that are pinned concurrently. 256KB
should also be enough to leave a small cache trail for other backends to
join in a synchronized seq scan. If a ring buffer is dirtied and its LSN
updated, we would normally have to write and flush WAL before we could
re-use the buffer; in this case we instead discard the buffer from the ring
and (later) choose a replacement using the normal clock-sweep algorithm.
Hence this strategy works best for scans that are read-only (or at worst
update hint bits). In a scan that modifies every page in the scan, like a
bulk UPDATE or DELETE, the buffers in the ring will always be dirtied and
the ring strategy effectively degrades to the normal strategy.
如自述文件所示,该策略对于批量写入可能不是很有效。
尽管如此,pg_dump
或pg_restore
仍会影响许多表,因此您可以预期它会破坏很大一部分共享缓冲区。