为什么Postgres虚拟柱破坏了拉的状态并强制子查询扫描

我尝试在博客中所建议的那样，在虚拟列上实现表达式索引。

它可以按预期工作(但需要表格中的其他数据才能成为有用的索引(。然后，我尝试在视图中使用它并将其打破！

逐步：

CREATE TABLE customer (id SERIAL, firstname TEXT, lastname TEXT);
-- create virtual column
CREATE FUNCTION fullname(customer) RETURNS text AS $$
    SELECT $1.firstname || ' ' || $1.lastname
$$ LANGUAGE SQL;
INSERT INTO customer VALUES (DEFAULT, 'Mark', 'Pennypincher');
-- Insert some data for index became useful
INSERT INTO customer(firstname, lastname)
SELECT 'first_' || n, 'last_' || n FROM generate_series(1, 1000) as n;
CREATE INDEX i_customer_firstname ON customer (firstname);
CREATE INDEX i_customer_bothnames ON customer ((firstname || ' ' || lastname));
CREATE INDEX i_customer_fullname ON customer (fullname(customer));
ANALYZE customer;
EXPLAIN SELECT * FROM customer WHERE customer.fullname = 'Mark Pennypincher';

目前它可以按预期工作，查询使用索引：

 Index Scan using i_customer_bothnames on customer  (cost=0.28..8.29 rows=1 width=21) 
   Index Cond: (((firstname || ' '::text) || lastname) = 'Mark Pennypincher'::text)

现在创建视图：

CREATE OR REPLACE VIEW v_customer AS
    SELECT
        c.*
        ,c.fullname
    FROM customer c;

我们没有汇总和限制，因此，根据文档的重写机制和优化器，应在查询中将其提取。但不是！

看：

EXPLAIN SELECT * FROM v_customer WHERE v_customer.fullname = 'Mark Pennypincher';

立即导致下一个查询计划：

 Subquery Scan on v_customer  (cost=0.00..34.53 rows=5 width=53)     
   Filter: (v_customer.fullname = 'Mark Pennypincher'::text)         
   ->  Seq Scan on customer c  (cost=0.00..22.01 rows=1001 width=53)

为什么Subquery Scan ??

接下来的两个语句也很有趣：

EXPLAIN SELECT * FROM v_customer WHERE v_customer.firstname = 'Mark';
EXPLAIN SELECT * FROM v_customer WHERE v_customer.firstname || ' ' || v_customer.lastname = 'Mark Pennypincher';

都使用他们的索引。

当然，我在几个连接方面的现实情况更为复杂。有问题的例子。这就是为什么我要为简单。

替换它的原因

您需要使功能不可变，否则计划者将不会使用它：

CREATE FUNCTION fullname(customer) RETURNS text 
AS 
$$
    SELECT $1.firstname || ' ' || $1.lastname
$$ 
LANGUAGE SQL 
immutable; --<< here

这将使用您的示例数据生成以下计划：

Index Scan using i_customer_fullname on stuff.customer  (cost=0.28..8.29 rows=1 width=21) (actual time=0.054..0.054 rows=1 loops=1)
  Output: id, firstname, lastname                                                                                                  
  Index Cond: (((customer.firstname || ' '::text) || customer.lastname) = 'Mark Pennypincher'::text)                               
  Buffers: shared hit=1 read=2                                                                                                     
  I/O Timings: read=0.035                                                                                                          
Planning time: 0.571 ms                                                                                                            
Execution time: 0.079 ms

如果该函数不是不可变的，那么Postgres不能依靠存储在索引中的值总是相同的。stable本来可以实现相同的目标。但是，由于功能仅依赖于传递给它的值，因此immutable是正确的选择。

相关内容

最新更新

热门标签：