为什么Postgres虚拟柱破坏了拉的状态并强制子查询扫描



我尝试在博客中所建议的那样,在虚拟列上实现表达式索引。

它可以按预期工作(但需要表格中的其他数据才能成为有用的索引(。然后,我尝试在视图中使用它并将其打破!

逐步:

CREATE TABLE customer (id SERIAL, firstname TEXT, lastname TEXT);
-- create virtual column
CREATE FUNCTION fullname(customer) RETURNS text AS $$
    SELECT $1.firstname || ' ' || $1.lastname
$$ LANGUAGE SQL;
INSERT INTO customer VALUES (DEFAULT, 'Mark', 'Pennypincher');
-- Insert some data for index became useful
INSERT INTO customer(firstname, lastname)
SELECT 'first_' || n, 'last_' || n FROM generate_series(1, 1000) as n;
CREATE INDEX i_customer_firstname ON customer (firstname);
CREATE INDEX i_customer_bothnames ON customer ((firstname || ' ' || lastname));
CREATE INDEX i_customer_fullname ON customer (fullname(customer));
ANALYZE customer;
EXPLAIN SELECT * FROM customer WHERE customer.fullname = 'Mark Pennypincher';

目前它可以按预期工作,查询使用索引:

 Index Scan using i_customer_bothnames on customer  (cost=0.28..8.29 rows=1 width=21) 
   Index Cond: (((firstname || ' '::text) || lastname) = 'Mark Pennypincher'::text)

现在创建视图:

CREATE OR REPLACE VIEW v_customer AS
    SELECT
        c.*
        ,c.fullname
    FROM customer c;

我们没有汇总和限制,因此,根据文档的重写机制和优化器,应在查询中将其提取。但不是!

看:

EXPLAIN SELECT * FROM v_customer WHERE v_customer.fullname = 'Mark Pennypincher';

立即导致下一个查询计划:

 Subquery Scan on v_customer  (cost=0.00..34.53 rows=5 width=53)     
   Filter: (v_customer.fullname = 'Mark Pennypincher'::text)         
   ->  Seq Scan on customer c  (cost=0.00..22.01 rows=1001 width=53)

为什么Subquery Scan ??

接下来的两个语句也很有趣:

EXPLAIN SELECT * FROM v_customer WHERE v_customer.firstname = 'Mark';
EXPLAIN SELECT * FROM v_customer WHERE v_customer.firstname || ' ' || v_customer.lastname = 'Mark Pennypincher';

都使用他们的索引。

当然,我在几个连接方面的现实情况更为复杂。有问题的例子。这就是为什么我要为简单

替换它的原因

您需要使功能不可变,否则计划者将不会使用它:

CREATE FUNCTION fullname(customer) RETURNS text 
AS 
$$
    SELECT $1.firstname || ' ' || $1.lastname
$$ 
LANGUAGE SQL 
immutable; --<< here

这将使用您的示例数据生成以下计划:

Index Scan using i_customer_fullname on stuff.customer  (cost=0.28..8.29 rows=1 width=21) (actual time=0.054..0.054 rows=1 loops=1)
  Output: id, firstname, lastname                                                                                                  
  Index Cond: (((customer.firstname || ' '::text) || customer.lastname) = 'Mark Pennypincher'::text)                               
  Buffers: shared hit=1 read=2                                                                                                     
  I/O Timings: read=0.035                                                                                                          
Planning time: 0.571 ms                                                                                                            
Execution time: 0.079 ms                                                                                                           

如果该函数不是不可变的,那么Postgres不能依靠存储在索引中的值总是相同的。stable本来可以实现相同的目标。但是,由于功能依赖于传递给它的值,因此immutable是正确的选择。

最新更新