Given the query below; the tables, columns and keys should be fairly self-explanatory (otherwise please ask).
SELECT DISTINCT p.IDProduct
FROM Catalog.Catalog c
INNER JOIN Catalog.Product p ON (
p.FKIDCatalog=c.IDCatalog
)
INNER JOIN Catalog.ProductLanguage pl ON (
pl.FKIDProduct=p.IDProduct
AND (
pl.FKIDLanguage='de_DE'
OR pl.FKIDLanguage=c.FKIDLanguage
)
)
WHERE to_tsvector(SearchConfig, COALESCE(pl.DescriptionShort, '') || ' ' || COALESCE(pl.DescriptionLong, '') || ' ' || COALESCE(pl.KeywordList, '')) @@ to_tsquery('''vorschlaghammer'':*')
AND c.IDCatalog IN (5, 24, 6, 7, 11, 12, 8, 1, 23)
The IN clause is determined by the user's permissions and creates a search space of about 1.3 million products (out of 2M) containing 181 hits, which is a very typical use case. Unfortunately, the query takes 49 seconds to return its result. EXPLAIN (analyze, buffers, format text)
shows the following query plan:
Unique (cost=59887.83..59887.89 rows=13 width=4) (actual time=48934.329..48972.548 rows=181 loops=1)
Buffers: shared hit=5386635
-> Sort (cost=59887.83..59887.86 rows=13 width=4) (actual time=48934.328..48972.520 rows=181 loops=1)
Sort Key: p.idproduct
Sort Method: quicksort Memory: 33kB
Buffers: shared hit=5386635
-> Gather (cost=1045.52..59887.59 rows=13 width=4) (actual time=908.689..48972.460 rows=181 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=5386635
-> Nested Loop (cost=45.52..58886.29 rows=5 width=4) (actual time=3215.182..48926.270 rows=60 loops=3)
Join Filter: (((pl.fkidlanguage)::text = 'de_DE'::text) OR ((pl.fkidlanguage)::text = (c.fkidlanguage)::text))
Buffers: shared hit=5386635
-> Hash Join (cost=45.09..57038.74 rows=1319 width=10) (actual time=0.167..249.085 rows=438115 loops=3)
Hash Cond: (p.fkidcatalog = c.idcatalog)
Buffers: shared hit=44799
-> Parallel Seq Scan on product p (cost=0.00..54420.03 rows=979803 width=8) (actual time=0.015..66.259 rows=783365 loops=3)
Buffers: shared hit=44622
-> Hash (cost=44.98..44.98 rows=9 width=10) (actual time=0.075..0.076 rows=9 loops=3)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=77
-> Index Scan using catalog_pkey on catalog c (cost=0.28..44.98 rows=9 width=10) (actual time=0.033..0.068 rows=9 loops=3)
Index Cond: (idcatalog = ANY ('{5,24,6,7,11,12,8,1,23}'::integer[]))
Buffers: shared hit=77
-> Index Scan using productlanguage_pkey on productlanguage pl (cost=0.43..1.39 rows=1 width=10) (actual time=0.111..0.111 rows=0 loops=1314345)
Index Cond: (fkidproduct = p.idproduct)
Filter: (to_tsvector(searchconfig, (((((COALESCE(descriptionshort, ''::character varying))::text || ' '::text) || COALESCE(descriptionlong, ''::text)) || ' '::text) || COALESCE(keywordlist, ''::text))) @@ to_tsquery('''vorschlaghammer'':*'::text))
Rows Removed by Filter: 1
Buffers: shared hit=5341836
Planning:
Buffers: shared hit=65
Planning Time: 1.905 ms
Execution Time: 48972.635 ms
(33 rows)
I am not very familiar with execution plans, but I would say that first fetching 1.3 million products and then walking through all of them to check the full-text condition is unwise; and indeed, if I narrow the catalog set, the query time drops, and vice versa. If I replace the IN clause with, for example, AND c.IDCatalog<29
(which selects all major catalogs), the query optimizer does what I originally expected it to do (presumably because it now has to consider "almost all" products):
Unique (cost=63069.02..63073.42 rows=37 width=4) (actual time=36.778..39.404 rows=265 loops=1)
Buffers: shared hit=1395
-> Gather Merge (cost=63069.02..63073.33 rows=37 width=4) (actual time=36.777..39.360 rows=265 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared hit=1395
-> Sort (cost=62068.99..62069.03 rows=15 width=4) (actual time=1.269..1.277 rows=88 loops=3)
Sort Key: p.idproduct
Sort Method: quicksort Memory: 37kB
Buffers: shared hit=1395
Worker 0: Sort Method: quicksort Memory: 25kB
Worker 1: Sort Method: quicksort Memory: 25kB
-> Hash Join (cost=320.56..62068.70 rows=15 width=4) (actual time=0.926..1.229 rows=88 loops=3)
Hash Cond: (p.fkidcatalog = c.idcatalog)
Join Filter: (((pl.fkidlanguage)::text = 'de_DE'::text) OR ((pl.fkidlanguage)::text = (c.fkidlanguage)::text))
Buffers: shared hit=1381
-> Nested Loop (cost=294.26..62031.43 rows=4171 width=14) (actual time=0.761..1.039 rows=88 loops=3)
Buffers: shared hit=1240
-> Parallel Bitmap Heap Scan on productlanguage pl (cost=293.83..35768.94 rows=4171 width=10) (actual time=0.756..0.819 rows=88 loops=3)
Recheck Cond: (to_tsvector(searchconfig, (((((COALESCE(descriptionshort, ''::character varying))::text || ' '::text) || COALESCE(descriptionlong, ''::text)) || ' '::text) || COALESCE(keywordlist, ''::text))) @@ to_tsquery('''vorschlaghammer'':*'::text))
Heap Blocks: exact=133
Buffers: shared hit=180
-> Bitmap Index Scan on productlanguage_descriptionshort_descriptionlong_keywordlist (cost=0.00..291.33 rows=10010 width=0) (actual time=2.208..2.209 rows=265 loops=1)
Index Cond: (to_tsvector(searchconfig, (((((COALESCE(descriptionshort, ''::character varying))::text || ' '::text) || COALESCE(descriptionlong, ''::text)) || ' '::text) || COALESCE(keywordlist, ''::text))) @@ to_tsquery('''vorschlaghammer'':*'::text))
Buffers: shared hit=47
-> Index Scan using product_pkey on product p (cost=0.43..6.30 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=265)
Index Cond: (idproduct = pl.fkidproduct)
Buffers: shared hit=1060
-> Hash (cost=25.99..25.99 rows=25 width=10) (actual time=0.097..0.098 rows=21 loops=3)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
Buffers: shared hit=41
-> Index Scan using catalog_pkey on catalog c (cost=0.28..25.99 rows=25 width=10) (actual time=0.036..0.085 rows=21 loops=3)
Index Cond: (idcatalog < 29)
Buffers: shared hit=41
Planning:
Buffers: shared hit=68
Planning Time: 1.903 ms
Execution Time: 39.517 ms
(38 rows)
That is three orders of magnitude faster, and I would expect PostgreSQL to need only a few more milliseconds to filter the 265 result rows once the original IN clause is added back.
Of course PostgreSQL can only guess which path to take, but it is very unsatisfying when it makes such a wrong decision. In practice, a 49-second response time is completely unacceptable for my users, while 40 milliseconds is barely noticeable. I have never encountered anything like this with non-full-text queries.
So there are probably two questions: A) How can I fix or work around this specific use case? B) How do full-text queries generally behave with respect to performance?
The main source of the problem seems to be that the hash join of "product" to "catalog" is misestimated by a factor of more than 300. This has nothing to do with FTS. So I would say it may just be bad luck that you ran into this with an FTS query rather than some other query.
PostgreSQL would agree that fetching 1.3 million products first is not a good idea, but it thinks it only needs to fetch about 4000 (1319 * 3) of them.
Why? In the end you have p.FKIDCatalog=c.IDCatalog and c.IDCatalog IN (5, 24, 6, 7, 11, 12, 8, 1, 23)
. It estimates this by computing the average number of rows of p that match each value of FKIDCatalog, multiplied by 9. But the 9 specific values you list are not average values; they are very common ones. If you instead write it as p.FKIDCatalog=c.IDCatalog and p.FKIDCatalog IN (5, 24, 6, 7, 11, 12, 8, 1, 23)
, then it will estimate the number of rows it expects to find for each of those 9 specific values and sum them.
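Applied to the original query, that rewrite is a one-line change (a sketch, using the same tables and values as above; whether it actually shifts the plan depends on your statistics):

```sql
SELECT DISTINCT p.IDProduct
FROM Catalog.Catalog c
INNER JOIN Catalog.Product p ON (p.FKIDCatalog = c.IDCatalog)
INNER JOIN Catalog.ProductLanguage pl ON (
    pl.FKIDProduct = p.IDProduct
    AND (pl.FKIDLanguage = 'de_DE' OR pl.FKIDLanguage = c.FKIDLanguage)
)
WHERE to_tsvector(SearchConfig, COALESCE(pl.DescriptionShort, '') || ' '
        || COALESCE(pl.DescriptionLong, '') || ' '
        || COALESCE(pl.KeywordList, '')) @@ to_tsquery('''vorschlaghammer'':*')
  -- Restrict on the p side instead of the c side, so the planner can
  -- use the per-value frequency statistics of product.FKIDCatalog
  -- (its most-common-values list) for these 9 specific catalogs.
  AND p.FKIDCatalog IN (5, 24, 6, 7, 11, 12, 8, 1, 23)
```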
Normally PostgreSQL correctly applies the transitive property of equality in its estimates: if you write it as p.FKIDCatalog=c.IDCatalog and c.IDCatalog=5
, it knows it can obtain the specific estimate for p.FKIDCatalog=5
and use it. But it cannot do the same with the transitivity of an IN-list (unless the IN-list is only one item long, in which case it is rewritten into a simple equality and the transitive law applies), even though conceptually it could.
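A minimal illustration of the difference, using the column names from the query above:

```sql
-- Estimated well: the equality is propagated across the join,
-- so the planner can use the statistics for p.FKIDCatalog = 5.
WHERE p.FKIDCatalog = c.IDCatalog AND c.IDCatalog = 5

-- Also estimated well: a one-element IN-list is rewritten into the
-- simple equality above before estimation.
WHERE p.FKIDCatalog = c.IDCatalog AND c.IDCatalog IN (5)

-- Not propagated: with two or more elements the IN-list stays on c,
-- and the p side of the join falls back to an average-frequency guess.
WHERE p.FKIDCatalog = c.IDCatalog AND c.IDCatalog IN (5, 24)
```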
I also notice that the estimate for the full-text index, visible in your other plan, is rather bad as well: 4171 rows expected, but only 88 found. I don't know why it is that bad; in my experience, tv @@ tq usually estimates better than that (at least when tq contains a single term). Has this table been analyzed recently? At least since the expression index was added?
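If the statistics might be stale, a manual ANALYZE is worth a try. PostgreSQL also gathers statistics on the indexed expression of a functional index during ANALYZE, and those are what the @@ selectivity estimate draws on (a sketch, assuming the table name from the plans above):

```sql
-- Recollect statistics for the table, including statistics on the
-- to_tsvector(...) expression of the functional index.
ANALYZE Catalog.ProductLanguage;
```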
Fixing either one of these alone might be enough to shift the plan over to the faster one.