我尝试在博客中所建议的那样,在虚拟列上实现表达式索引。
它可以按预期工作(但需要表格中的其他数据才能成为有用的索引(。然后,我尝试在视图中使用它并将其打破!
逐步:
CREATE TABLE customer (id SERIAL, firstname TEXT, lastname TEXT);
-- create virtual column
CREATE FUNCTION fullname(customer) RETURNS text AS $$
SELECT $1.firstname || ' ' || $1.lastname
$$ LANGUAGE SQL;
INSERT INTO customer VALUES (DEFAULT, 'Mark', 'Pennypincher');
-- Insert some data for index became useful
INSERT INTO customer(firstname, lastname)
SELECT 'first_' || n, 'last_' || n FROM generate_series(1, 1000) as n;
CREATE INDEX i_customer_firstname ON customer (firstname);
CREATE INDEX i_customer_bothnames ON customer ((firstname || ' ' || lastname));
CREATE INDEX i_customer_fullname ON customer (fullname(customer));
ANALYZE customer;
EXPLAIN SELECT * FROM customer WHERE customer.fullname = 'Mark Pennypincher';
目前它可以按预期工作,查询使用索引:
Index Scan using i_customer_bothnames on customer (cost=0.28..8.29 rows=1 width=21)
Index Cond: (((firstname || ' '::text) || lastname) = 'Mark Pennypincher'::text)
现在创建视图:
CREATE OR REPLACE VIEW v_customer AS
SELECT
c.*
,c.fullname
FROM customer c;
我们没有汇总和限制,因此,根据文档的重写机制和优化器,应在查询中将其提取。但不是!
看:
EXPLAIN SELECT * FROM v_customer WHERE v_customer.fullname = 'Mark Pennypincher';
立即导致下一个查询计划:
Subquery Scan on v_customer (cost=0.00..34.53 rows=5 width=53)
Filter: (v_customer.fullname = 'Mark Pennypincher'::text)
-> Seq Scan on customer c (cost=0.00..22.01 rows=1001 width=53)
为什么Subquery Scan
??
接下来的两个语句也很有趣:
EXPLAIN SELECT * FROM v_customer WHERE v_customer.firstname = 'Mark';
EXPLAIN SELECT * FROM v_customer WHERE v_customer.firstname || ' ' || v_customer.lastname = 'Mark Pennypincher';
都使用他们的索引。
当然,我在几个连接方面的现实情况更为复杂。有问题的例子。这就是为什么我要为简单。
您需要使功能不可变,否则计划者将不会使用它:
CREATE FUNCTION fullname(customer) RETURNS text
AS
$$
SELECT $1.firstname || ' ' || $1.lastname
$$
LANGUAGE SQL
immutable; --<< here
这将使用您的示例数据生成以下计划:
Index Scan using i_customer_fullname on stuff.customer (cost=0.28..8.29 rows=1 width=21) (actual time=0.054..0.054 rows=1 loops=1)
Output: id, firstname, lastname
Index Cond: (((customer.firstname || ' '::text) || customer.lastname) = 'Mark Pennypincher'::text)
Buffers: shared hit=1 read=2
I/O Timings: read=0.035
Planning time: 0.571 ms
Execution time: 0.079 ms
如果该函数不是不可变的,那么Postgres不能依靠存储在索引中的值总是相同的。stable
本来可以实现相同的目标。但是,由于功能仅依赖于传递给它的值,因此immutable
是正确的选择。