localtimestamp vs current_timestamp:为什么这些相似的查询执行时间如此不同?<



经过各种测试,我发现这些相似的查询在执行时间上存在巨大差异:

select id from my_table where creation_date > (current_timestamp - interval '2 days');  
-- 9s Plan: partition pruning parallel scan of 5 partitions
select id from my_table where creation_date > (select current_timestamp - interval '2 days');
-- 4.5s Plan: seq scan of all partitions
select id from my_table where creation_date > (localtimestamp - interval '2 days');     
-- 7s Plan: partition pruning parallel scan of the 5 last partitions
select id from my_table where creation_date > (select localtimestamp - interval '2 days'); 
-- 2.5s Plan: seq scan of all partitions
select id from my_table where creation_date > '2022-11-23 11:20'; 
-- 2.5s Plan: partition pruning parallel scan of the 5 last partitions

create_date是时间戳类型的分区键(不带时区)。我猜是类型转换的问题。当where子句中有select时,计划将扫描所有分区。它表现得好像无法猜测选择的结果。但这比没有select要快得多。当没有select时,计划会删除不需要的分区,只对剩余的分区执行seqscan,但这要慢得多。

然后我在同一个表上启动相同的查询,但是通过来自另一个数据库的外表(在相同的测试实例上)

select id from my_foreign_table where creation_date > (current_timestamp - interval '2 days'); 
-- too long (more than 10min)
select id from my_foreign_table where creation_date > (select current_timestamp - interval '2 days');
-- too long (more than 10min)
select id from my_foreign_table where creation_date > (localtimestamp - interval '2 days');
-- too long (more than 10min)
select id from my_foreign_table where creation_date > (select localtimestamp - interval '2 days');
-- 2.5s 
select id from my_foreign_table where creation_date > '2022-11-23 11:20'; 
-- 2.6s 

有人能解释一下为什么这些查询之间有这么大的差异吗?下面发生了什么?

where子句中没有select的查询执行计划(分区修剪,并行扫描):

QUERY PLAN                                                                                                     |
---------------------------------------------------------------------------------------------------------------+
Gather  (cost=1000.00..62310996.53 rows=1244782 width=8)                                                       |
Workers Planned: 8                                                                                           |
->  Parallel Append  (cost=0.00..62185518.33 rows=155598 width=8)                                            |
Subplans Removed: 175                                                                                  |
->  Parallel Seq Scan on data_container_archive_2022_11 dca  (cost=0.00..820761.36 rows=169177 width=8)|
Filter: (partition_key > (CURRENT_TIMESTAMP - '2 days'::interval))                               |
->  Parallel Seq Scan on data_container_archive_2022_12 dca_1  (cost=0.00..12.88 rows=55 width=8)      |
Filter: (partition_key > (CURRENT_TIMESTAMP - '2 days'::interval))                               |
->  Parallel Seq Scan on data_container_archive_2023_01 dca_2  (cost=0.00..12.88 rows=55 width=8)      |
Filter: (partition_key > (CURRENT_TIMESTAMP - '2 days'::interval))                               |
->  Parallel Seq Scan on data_container_archive_2023_02 dca_3  (cost=0.00..12.88 rows=55 width=8)      |
Filter: (partition_key > (CURRENT_TIMESTAMP - '2 days'::interval))                               |
->  Parallel Seq Scan on data_container_archive_2023_03 dca_4  (cost=0.00..12.88 rows=55 width=8)      |
Filter: (partition_key > (CURRENT_TIMESTAMP - '2 days'::interval))                               |

使用select in where子句(所有分区,seq scan)查询的执行计划:

QUERY PLAN                                                                                             |
-------------------------------------------------------------------------------------------------------+
Append  (cost=0.01..88407378.48 rows=775958175 width=8)                                                |
InitPlan 1 (returns $0)                                                                              |
->  Result  (cost=0.00..0.01 rows=1 width=8)                                                       |
->  Seq Scan on data_container_archive_2008_04 dca  (cost=0.00..1427.44 rows=13078 width=8)          |
Filter: (partition_key > $0)                                                                   |
->  Seq Scan on data_container_archive_2008_05 dca_1  (cost=0.00..1392.58 rows=12895 width=8)        |
Filter: (partition_key > $0)                                                                   |
->  Seq Scan on data_container_archive_2008_06 dca_2  (cost=0.00..376.65 rows=3457 width=8)          |
Filter: (partition_key > $0)                                                                   |
->  Seq Scan on data_container_archive_2008_07 dca_3  (cost=0.00..1638.68 rows=14685 width=8)        |
Filter: (partition_key > $0)                                                                   |
->  Seq Scan on data_container_archive_2008_08 dca_4  (cost=0.00..2127.53 rows=19001 width=8)        |
Filter: (partition_key > $0)                                                                   |
->  Seq Scan on data_container_archive_2008_09 dca_5  (cost=0.00..3121.90 rows=27971 width=8)        |
Filter: (partition_key > $0)                                                                   |
->  Seq Scan on data_container_archive_2008_10 dca_6  (cost=0.00..3147.60 rows=28576 width=8)        |
Filter: (partition_key > $0)                                                                   |
->  Seq Scan on data_container_archive_2008_11 dca_7  (cost=0.00..4151.62 rows=37297 width=8)        |
Filter: (partition_key > $0)                                                                   |
->  Seq Scan on data_container_archive_2008_12 dca_8  (cost=0.00..5287.16 rows=47551 width=8)        |
Filter: (partition_key > $0)                                                                   |
->  Seq Scan on data_container_archive_2009_01 dca_9  (cost=0.00..7169.21 rows=64752 width=8)        |
Filter: (partition_key > $0)                                                                   |
... 

注意:这里的表名和列名都是真实的。

在与子查询结果比较的查询中,PostgreSQL不会" pull up ";子查询,因此在执行开始时它不知道子查询的结果。因此,它不会修剪分区。

是的,PostgreSQL优化器可能会变得更聪明,并认识到它可以拉出子查询,但这将花费额外的处理时间,这将损害其他查询。

总之:

localtimestamp是系统时钟值(因此没有计算)和

current_timestamp是时区感知对象,这需要查找操作系统的时区设置(在罚款)。

这是一篇很好的文章:https://www.commandprompt.com/education/postgresql-current_timestamp-vs-localtimestamp/

相关内容

  • 没有找到相关文章

最新更新