同时刷新实例化视图会导致表膨胀

在PostgreSQL 9.5中，我决定创建一个物化视图"效果"并计划每小时并发刷新，因为我希望它始终可用：

REFRESH MATERIALIZED VIEW CONCURRENTLY effects;

一开始一切正常，我的物化视图令人耳目一新，磁盘空间使用量或多或少保持不变。

问题

但一段时间后，磁盘使用量开始线性增长。

我得出的结论是，这种增长的原因是具体化视图，并从此答案运行查询以获得以下结果：

what                |  bytes/ct   | bytes_pretty | bytes_per_row
-----------------------------------+-------------+--------------+---------------
core_relation_size                | 32224567296 | 30 GB        |         21140
visibility_map                    |      991232 | 968 kB       |             0
free_space_map                    |     7938048 | 7752 kB      |             5
table_size_incl_toast             | 32233504768 | 30 GB        |         21146
indexes_size                      | 22975922176 | 21 GB        |         15073
total_size_incl_toast_and_indexes | 55209426944 | 51 GB        |         36220
live_rows_in_text_representation  |   316152215 | 302 MB       |           207
------------------------------    |             |              |
row_count                         |     1524278 |              |
live_tuples                       |      676439 |              |
dead_tuples                       |     1524208 |              |
(11 rows)

然后，我发现上一次自动吸尘此表是在两天前，通过运行：

SELECT relname, n_dead_tup, last_vacuum, last_autovacuum FROM pg_stat_user_tables ORDER BY n_dead_tup desc;

我决定手动调用vacuum (VERBOSE) effects.它运行了大约半个小时，并产生了以下输出：

vacuum (VERBOSE) effects;
INFO:  vacuuming "public.effects"
INFO:  scanned index "effects_idx" to remove 129523454 row versions
DETAIL:  CPU 12.16s/55.76u sec elapsed 119.87 sec
INFO:  scanned index "effects_campaign_created_idx" to remove 129523454 row versions
DETAIL:  CPU 19.11s/154.59u sec elapsed 337.91 sec
INFO:  scanned index "effects_campaign_name_idx" to remove 129523454 row versions
DETAIL:  CPU 28.51s/151.16u sec elapsed 315.51 sec
INFO:  scanned index "effects_campaign_event_type_idx" to remove 129523454 row versions
DETAIL:  CPU 38.60s/373.59u sec elapsed 601.73 sec
INFO:  "effects": removed 129523454 row versions in 3865537 pages
DETAIL:  CPU 59.02s/36.48u sec elapsed 326.43 sec
INFO:  index "effects_idx" now contains 1524208 row versions in 472258 pages
DETAIL:  113679000 index row versions were removed.
463896 index pages have been deleted, 60386 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.01 sec.
INFO:  index "effects_campaign_created_idx" now contains 1524208 row versions in 664910 pages
DETAIL:  121637488 index row versions were removed.
41014 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "effects_campaign_name_idx" now contains 1524208 row versions in 711391 pages
DETAIL:  125650677 index row versions were removed.
696221 index pages have been deleted, 28150 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  index "effects_campaign_event_type_idx" now contains 1524208 row versions in 956018 pages
DETAIL:  127659042 index row versions were removed.
934288 index pages have been deleted, 32105 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "effects": found 0 removable, 493 nonremovable row versions in 3880239 out of 3933663 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 666922 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 180.49s/788.60u sec elapsed 1799.42 sec.
INFO:  vacuuming "pg_toast.pg_toast_1371723"
INFO:  index "pg_toast_1371723_index" now contains 0 row versions in 1 pages
DETAIL:  0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO:  "pg_toast_1371723": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
Skipped 0 pages due to buffer pins.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
VACUUM

在这一点上，我认为问题已经解决了，并开始思考什么会干扰自动真空。可以肯定的是，我再次运行查询以查找该表的空间使用情况，令我惊讶的是它没有改变。

只有在我打电话REFRESH MATERIALIZED VIEW effects;没有同时打电话之后。直到现在，用于检查表大小的查询输出是：

what                | bytes/ct  | bytes_pretty | bytes_per_row
-----------------------------------+-----------+--------------+---------------
core_relation_size                | 374005760 | 357 MB       |           245
visibility_map                    |         0 | 0 bytes      |             0
free_space_map                    |         0 | 0 bytes      |             0
table_size_incl_toast             | 374013952 | 357 MB       |           245
indexes_size                      | 213843968 | 204 MB       |           140
total_size_incl_toast_and_indexes | 587857920 | 561 MB       |           385
live_rows_in_text_representation  | 316175512 | 302 MB       |           207
------------------------------    |           |              |
row_count                         |   1524385 |              |
live_tuples                       |    676439 |              |
dead_tuples                       |   1524208 |              |
(11 rows)

一切都恢复了正常...

问题

问题已经解决，但仍然存在相当多的混乱

谁能解释一下我遇到的问题是什么？
将来如何避免这种情况？

首先，让我们解释一下膨胀

REFRESH MATERIALIZED CONCURRENTLY在src/backend/commands/matview.c中实现，评论很有启发性：

/*
* refresh_by_match_merge
*
* Refresh a materialized view with transactional semantics, while allowing
* concurrent reads.
*
* This is called after a new version of the data has been created in a
* temporary table.  It performs a full outer join against the old version of
* the data, producing "diff" results.  This join cannot work if there are any
* duplicated rows in either the old or new versions, in the sense that every
* column would compare as equal between the two rows.  It does work correctly
* in the face of rows which have at least one NULL value, with all non-NULL
* columns equal.  The behavior of NULLs on equality tests and on UNIQUE
* indexes turns out to be quite convenient here; the tests we need to make
* are consistent with default behavior.  If there is at least one UNIQUE
* index on the materialized view, we have exactly the guarantee we need.
*
* The temporary table used to hold the diff results contains just the TID of
* the old record (if matched) and the ROW from the new table as a single
* column of complex record type (if matched).
*
* Once we have the diff table, we perform set-based DELETE and INSERT
* operations against the materialized view, and discard both temporary
* tables.
*
* Everything from the generation of the new data to applying the differences
* takes place under cover of an ExclusiveLock, since it seems as though we
* would want to prohibit not only concurrent REFRESH operations, but also
* incremental maintenance.  It also doesn't seem reasonable or safe to allow
* SELECT FOR UPDATE or SELECT FOR SHARE on rows being updated or deleted by
* this command.
*/

因此，通过删除行并从临时表中插入新行来刷新实例化视图。这当然会导致死元组和表膨胀，这在您的VACUUM (VERBOSE)输出中得到了证实。

在某种程度上，这就是你为CONCURRENTLY付出的代价。

其次，让我们揭穿`VACUUM`无法删除死元组的神话

VACUUM将删除死行，但它不能减少膨胀(这可以用VACUUM (FULL)来完成，但这会像没有CONCURRENTLYREFRESH MATERIALIZED VIEW一样锁定视图(。

我怀疑您用于确定死元组数的查询只是一个估计值，它弄错了死元组的数量。

一个演示所有这些的示例

CREATE TABLE tab AS SELECT id, 'row ' || id AS val FROM generate_series(1, 100000) AS id;
-- make sure autovacuum doesn't spoil our demonstration
CREATE MATERIALIZED VIEW tab_v WITH (autovacuum_enabled = off)
AS SELECT * FROM tab;
-- required for CONCURRENTLY
CREATE UNIQUE INDEX ON tab_v (id);

使用pgstattuple扩展准确测量表膨胀：

CREATE EXTENSION pgstattuple;
SELECT * FROM pgstattuple('tab_v');
-[ RECORD 1 ]------+--------
table_len          | 4431872
tuple_count        | 100000
tuple_len          | 3788895
tuple_percent      | 85.49
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 16724
free_percent       | 0.38

现在让我们删除表中的一些行，刷新并再次测量：

DELETE FROM tab WHERE id BETWEEN 40001 AND 80000;
REFRESH MATERIALIZED VIEW CONCURRENTLY tab_v;
SELECT * FROM pgstattuple('tab_v');
-[ RECORD 1 ]------+--------
table_len          | 4431872
tuple_count        | 60000
tuple_len          | 2268895
tuple_percent      | 51.19
dead_tuple_count   | 40000
dead_tuple_len     | 1520000
dead_tuple_percent | 34.3
free_space         | 16724
free_percent       | 0.38

很多死元组。VACUUM摆脱了这些：

VACUUM tab_v;
SELECT * FROM pgstattuple('tab_v');
-[ RECORD 1 ]------+--------
table_len          | 4431872
tuple_count        | 60000
tuple_len          | 2268895
tuple_percent      | 51.19
dead_tuple_count   | 0
dead_tuple_len     | 0
dead_tuple_percent | 0
free_space         | 1616724
free_percent       | 36.48

死元组不见了，但现在有很多空白空间。

我正在添加到上面提供的@Laurenz Albe 完整答案中。腹胀还有另一种可能性。请考虑以下方案：

您的视图在大多数情况下很少更改(1000000 条记录，每个请求更改 100 条记录(，但您仍然得到 500000 个死元组。其原因在索引列中可能为 null。

如上述答案中所述，当视图同时实现时，将重新创建并比较副本。比较使用强制的唯一索引，但是，空值呢？在 SQL 中，空值永远不会彼此相等。因此，如果主键允许空值，则包含空值的记录即使未更改也将始终重新创建并添加到表中

为了解决这个问题，您可以做的是消除膨胀的是添加额外的列，将 null 列合并为某个从未使用过的值(-1， to_timestamp(0(， ...( 并仅将此列用于主索引

问题

问题

首先，让我们解释一下膨胀

其次，让我们揭穿`VACUUM`无法删除死元组的神话

一个演示所有这些的示例

相关内容

最新更新

热门标签：

同时刷新实例化视图会导致表膨胀

问题

问题

首先，让我们解释一下膨胀

其次，让我们揭穿VACUUM无法删除死元组的神话

一个演示所有这些的示例

相关内容

最新更新

热门标签：

其次，让我们揭穿`VACUUM`无法删除死元组的神话