Fast tuple lookup in a large indexed table (Postgres)



Table structure

I have a Postgres table T with three columns (w1, w2, occurrences), all integers; w1 and w2 reference another table. So basically, two foreign keys and a value.
w1 and w2 can each take roughly 15 million distinct values (the indices). T currently holds about 1.8 billion rows, which are more or less permutations of those indices, with the restriction that, thinking of it as a symmetric matrix, only one of each pair of values may exist. For example, there may be a row (252, 13, x), but then no row (13, 252, x). There is no fixed ordering, though, so (5, 900, x) may also be in T (the ordering is decided at insert time, based on values in the referenced table). The tuples (w1, w2) are unique. Currently there are three indexes on the table: a UNIQUE INDEX tdc_idx_1 on T (w1, w2), an INDEX tdc_idx_w1 on T (w1), and an INDEX tdc_idx_w2 on T (w2).
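For reference, a minimal sketch of the schema as described. The column types, NOT NULL constraints, and the name of the referenced table (`words` here) are my assumptions; note also that the EXPLAIN output further down suggests the column is actually spelled `occurences` in the live table.

```sql
-- Hypothetical DDL reconstructing the described schema.
CREATE TABLE T (
    w1          integer NOT NULL REFERENCES words (id),  -- foreign key (assumed target)
    w2          integer NOT NULL REFERENCES words (id),  -- foreign key (assumed target)
    occurrences integer NOT NULL                         -- the value
);

CREATE UNIQUE INDEX tdc_idx_1  ON T (w1, w2);  -- enforces uniqueness of the (w1, w2) tuples
CREATE INDEX        tdc_idx_w1 ON T (w1);
CREATE INDEX        tdc_idx_w2 ON T (w2);
```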

Problem statement

Basically, I want to run two different queries. The hard part for me is figuring out where the poor performance comes from, or whether I should simply accept the runtime given the "complexity" of the queries and the size of the table (which I suspect is not the case...). In short, I probably need help with the query structure, and possibly with the table's indexes (or perhaps the table design in general).
The most straightforward queries for the results I want are

-- A (the OR-query)
SELECT w1, w2, occurrences FROM public.T
WHERE (w1 in (123, 555, 999) OR w2 in (123, 555, 999)) AND occurrences > 1;

-- B (the AND-query)
SELECT w1, w2, occurrences FROM public.T
WHERE (w1 in (123, 555, 999) AND w2 in (123, 555, 999)) AND occurrences > 1;

respectively (note the OR vs. AND).

Below are some performance measurements and some alternative queries I came up with using EXPLAIN ANALYZE.

Performance

Query A

-- A (OR)
EXPLAIN ANALYSE 
SELECT w1, w2, occurrences
FROM public.T
WHERE (w1 in (123, 555, 999) OR w2 in (123, 555, 999)) AND occurrences > 1

Successfully run. Total query runtime: 29 min 19 secs.
173071 rows affected. (Actual result, without EXPLAIN ANALYZE.)

Gather  (cost=3378.13..13984524.32 rows=50262 width=12) (actual time=144.749..1751127.665 rows=173071 loops=1)
Workers Planned: 2
Workers Launched: 2
->  Parallel Bitmap Heap Scan on T  (cost=2378.13..13978498.12 rows=20942 width=12) (actual time=154.140..1750873.360 rows=57690 loops=3)
Recheck Cond: ((w1 = ANY ('{123,555,999}'::integer[])) OR (w2 = ANY ('{123,555,999}'::integer[])))
Rows Removed by Index Recheck: 20069348
Filter: (occurences > 1)
Rows Removed by Filter: 110878
Heap Blocks: exact=902 lossy=107950
->  BitmapOr  (cost=2378.13..2378.13 rows=214069 width=0) (actual time=114.320..114.335 rows=0 loops=1)
->  Bitmap Index Scan on tdc_idx_w1  (cost=0.00..1631.03 rows=148439 width=0) (actual time=17.100..17.103 rows=166301 loops=1)
Index Cond: (w1 = ANY ('{123,555,999}'::integer[]))
->  Bitmap Index Scan on tdc_idx_w2  (cost=0.00..721.96 rows=65630 width=0) (actual time=97.208..97.211 rows=339406 loops=1)
Index Cond: (w2 = ANY ('{123,555,999}'::integer[]))
Planning Time: 0.280 ms
JIT:
Functions: 12
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 2.689 ms, Inlining 117.210 ms, Optimization 36.646 ms, Emission 28.227 ms, Total 184.772 ms
Execution Time: 1751571.992 ms

Query B

-- B (AND)
EXPLAIN ANALYSE 
SELECT w1, w2, occurrences
FROM public.T
WHERE (w1 in (123, 555, 999) AND w2 in (123, 555, 999)) AND occurrences > 1

Successfully run. Total query runtime: 1 secs 716 msec.
3 rows affected. (Actual result, without EXPLAIN ANALYZE.)

Index Scan using tdc_idx_1 on T  (cost=0.58..61.13 rows=1 width=12) (actual time=88.895..239.394 rows=3 loops=1)
Index Cond: ((w1 = ANY ('{123,555,999}'::integer[])) AND (w2 = ANY ('{123,555,999}'::integer[])))
Filter: (occurences > 1)
Planning Time: 51.256 ms
Execution Time: 239.518 ms

Query A, alternative 1

-- A1 (OR)
EXPLAIN ANALYSE 
SELECT w1, w2, occurences
FROM public.token_doc_cooccurrences
INNER JOIN (
VALUES (123), (555), (999)
) vals(v)
ON (w1 = v)
WHERE occurences > 1
UNION
SELECT w1, w2, occurences
FROM public.token_doc_cooccurrences
INNER JOIN (
VALUES (123), (555), (999)
) vals2(w)
ON (w2 = w)
WHERE occurences > 1;

Successfully run. Total query runtime: 29 min 28 secs.
173071 rows affected. (Actual result, without EXPLAIN ANALYZE.)

HashAggregate  (cost=841679.67..842197.01 rows=51734 width=12) (actual time=1755066.279..1755315.668 rows=173071 loops=1)
Group Key: T.w1, T.w2, T.occurences
Batches: 5  Memory Usage: 4145kB  Disk Usage: 6920kB
->  Append  (cost=561.79..841291.66 rows=51734 width=12) (actual time=1088.188..1753999.477 rows=173074 loops=1)
->  Nested Loop  (cost=561.79..579176.31 rows=35898 width=12) (actual time=1088.181..75981.142 rows=55822 loops=1)
->  Values Scan on *VALUES*  (cost=0.00..0.04 rows=3 width=4) (actual time=0.006..0.032 rows=3 loops=1)
->  Bitmap Heap Scan on T  (cost=561.79..192939.10 rows=11966 width=12) (actual time=407.806..25231.892 rows=18607 loops=3)
Recheck Cond: (w1 = *VALUES*.column1)
Filter: (occurences > 1)
Rows Removed by Filter: 36826
Heap Blocks: exact=10630
->  Bitmap Index Scan on tdc_idx_w1  (cost=0.00..558.80 rows=50963 width=0) (actual time=74.175..74.175 rows=55434 loops=3)
Index Cond: (w1 = *VALUES*.column1)
->  Nested Loop  (cost=250.51..261339.35 rows=15836 width=12) (actual time=217.234..1677133.291 rows=117252 loops=1)
->  Values Scan on *VALUES*_1  (cost=0.00..0.04 rows=3 width=4) (actual time=0.020..0.042 rows=3 loops=1)
->  Bitmap Heap Scan on T T_1  (cost=250.51..87060.31 rows=5279 width=12) (actual time=143.454..558815.229 rows=39084 loops=3)
Recheck Cond: (w2 = *VALUES*_1.column1)
Rows Removed by Index Recheck: 19099252
Filter: (occurences > 1)
Rows Removed by Filter: 74051
Heap Blocks: exact=18304 lossy=311451
->  Bitmap Index Scan on tdc_idx_w2  (cost=0.00..249.19 rows=22482 width=0) (actual time=122.725..122.725 rows=113135 loops=3)
Index Cond: (w2 = *VALUES*_1.column1)
Planning Time: 68.676 ms
JIT:
Functions: 21
Options: Inlining true, Optimization true, Expressions true, Deforming true
Timing: Generation 5.749 ms, Inlining 219.374 ms, Optimization 370.011 ms, Emission 360.728 ms, Total 955.862 ms
Execution Time: 1756917.366 ms

Query B, alternative 1

-- B1 (AND)
EXPLAIN ANALYSE 
SELECT w1, w2, occurences
FROM public.token_doc_cooccurrences
INNER JOIN (
VALUES (123), (555), (999)
) vals(v)
ON (w1 = v)
INNER JOIN (
VALUES (123), (555), (999)
) vals2(w)
ON (w2 = w)
WHERE occurences > 1;

Successfully run. Total query runtime: 1 secs 5 msec.
3 rows affected. (Actual result, without EXPLAIN ANALYZE.)

Nested Loop  (cost=0.58..77.73 rows=1 width=12) (actual time=130.157..295.939 rows=3 loops=1)
->  Values Scan on "*VALUES*_1"  (cost=0.00..0.04 rows=3 width=4) (actual time=0.005..0.019 rows=3 loops=1)
->  Nested Loop  (cost=0.58..25.87 rows=3 width=12) (actual time=54.355..98.622 rows=1 loops=3)
->  Values Scan on "*VALUES*"  (cost=0.00..0.04 rows=3 width=4) (actual time=0.003..0.017 rows=3 loops=3)
->  Index Scan using tdc_idx_1 on T  (cost=0.58..8.60 rows=1 width=12) (actual time=32.853..32.855 rows=0 loops=9)
Index Cond: ((w1 = "*VALUES*".column1) AND (w2 = "*VALUES*_1".column1))
Filter: (occurences > 1)
Planning Time: 59.447 ms
Execution Time: 296.042 ms

Info
Between all measurements, I did the following to clear cached data:

systemctl stop postgresql
sync
echo 3 > /proc/sys/vm/drop_caches
systemctl start postgresql

== EDIT ==

Following the suggestion by @jjanes, I added the additional indexes (w1, occurrences, w2) and (w2, occurrences, w1) and ran VACUUM.
I also set work_mem = 16MB. After that, I cleared the caches again and re-ran query alternative 1. The results are below:
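The statements for this change look roughly like the following sketch. The index names are taken from the plan that follows; the exact syntax I used is reconstructed, not copied:

```sql
CREATE INDEX w1_occ_w2 ON public.token_doc_cooccurrences (w1, occurences, w2);
CREATE INDEX w2_occ_w1 ON public.token_doc_cooccurrences (w2, occurences, w1);

-- keep the visibility map current so index-only scans can skip heap fetches
VACUUM public.token_doc_cooccurrences;

-- per session; can also go into postgresql.conf
SET work_mem = '16MB';
```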

HashAggregate  (cost=3535.61..4052.95 rows=51734 width=12) (actual time=3240.753..3471.160 rows=173071 loops=1)
Group Key: token_doc_cooccurrences.w1, token_doc_cooccurrences.w2, token_doc_cooccurrences.occurences
Batches: 1  Memory Usage: 14353kB
Buffers: shared hit=152549 read=993
->  Append  (cost=0.58..3147.60 rows=51734 width=12) (actual time=196.966..2927.069 rows=173074 loops=1)
Buffers: shared hit=152549 read=993
->  Nested Loop  (cost=0.58..1642.71 rows=35898 width=12) (actual time=196.960..1948.256 rows=55822 loops=1)
Buffers: shared hit=46960 read=533
->  Values Scan on "*VALUES*"  (cost=0.00..0.04 rows=3 width=4) (actual time=0.007..0.024 rows=3 loops=1)
->  Index Only Scan using w1_occ_w2 on token_doc_cooccurrences  (cost=0.58..427.90 rows=11966 width=12) (actual time=85.026..603.453 rows=18607 loops=3)
Index Cond: ((w1 = "*VALUES*".column1) AND (occurences > 1))
Heap Fetches: 0
Buffers: shared hit=46960 read=533
->  Nested Loop  (cost=0.58..728.88 rows=15836 width=12) (actual time=40.764..573.493 rows=117252 loops=1)
Buffers: shared hit=105589 read=460
->  Values Scan on "*VALUES*_1"  (cost=0.00..0.04 rows=3 width=4) (actual time=0.008..0.027 rows=3 loops=1)
->  Index Only Scan using w2_occ_w1 on token_doc_cooccurrences token_doc_cooccurrences_1  (cost=0.58..190.16 rows=5279 width=12) (actual time=30.809..99.922 rows=39084 loops=3)
Index Cond: ((w2 = "*VALUES*_1".column1) AND (occurences > 1))
Heap Fetches: 0
Buffers: shared hit=105589 read=460
Planning:
Buffers: shared hit=121 read=13
Planning Time: 165.009 ms
Execution Time: 3672.053 ms

Personally, I can't tell how much of the effect comes from the setting and how much from the index-only scans, but the result is great!

Successfully run. Total query runtime: 6 secs 122 msec.
173071 rows affected.

~6 seconds instead of nearly 30 minutes.

Rows Removed by Index Recheck: 20069348
Heap Blocks: exact=902 lossy=107950

Your work_mem is too low for queries of this nature. That causes the bitmaps to go "lossy", meaning they only remember which blocks hold rows of interest, not which rows within each block. That is why it has to recheck all rows in each of those 107950 blocks, even though it once "knew" that most of them don't qualify. So increase work_mem. But I would bet that most of the time is spent doing IO to read those blocks, not CPU time doing the recheck. So while increasing work_mem is a good idea, it probably won't be magic. You could try to improve the IO by increasing effective_io_concurrency, so that more than one IO request can be outstanding at a time. How effective that is depends on your storage system and drivers.
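Both settings can be tried per session before re-running the query; the values below are illustrative guesses, not tuned recommendations:

```sql
SET work_mem = '64MB';              -- large enough that the bitmap stays exact (no lossy blocks)
SET effective_io_concurrency = 32;  -- allow several outstanding IO requests during bitmap heap scans
```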

You should SET track_io_timing = on, and produce your plans with EXPLAIN (ANALYZE, BUFFERS). That way we don't have to guess that the time is spent on IO; you actually get measurements.
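A sketch of what that looks like for query A (track_io_timing can also be enabled globally in postgresql.conf):

```sql
SET track_io_timing = on;

EXPLAIN (ANALYZE, BUFFERS)
SELECT w1, w2, occurrences
FROM public.T
WHERE (w1 in (123, 555, 999) OR w2 in (123, 555, 999)) AND occurrences > 1;
```

With track_io_timing enabled, the plan nodes report I/O timings alongside the buffer counts, so the share of runtime spent reading blocks is measured rather than inferred.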

For the alternative formulation of query A, you should be able to improve things considerably by building indexes on (w1, occurrences, w2) and (w2, occurrences, w1). That way you get index-only scans, which generate far less random IO, and you also benefit from using the index to satisfy the occurrences > 1 restriction. If you put "occurrences" last rather than as the middle column, the resulting indexes might be more generally useful for other queries while still qualifying for index-only scans, but they could no longer filter efficiently on occurrences > 1. You do need the table to be well vacuumed for index-only scans to be effective, though.

A completely different approach would be to buy better storage. You currently seem to have spinning drives, and rather slow ones at that. Switching to solid-state drives should give you a dramatic improvement.
