Postgresql 索引不用于慢速查询>30 秒



POSTGRESQL版本:10硬件:4名工人/16GBRAM/50%使用

我不是Postgresql专家。我刚刚读了很多文档,做了很多测试。我有一些postgresql查询,这些查询花费了很多时间>30年代,因为一张表上有1000万行。

Column            |           Type           | Collation | Nullable |                         Default                          
------------------------------+--------------------------+-----------+----------+----------------------------------------------------------
id                           | integer                  |           | not null | 
cveid                        | character varying(50)    |           |          | 
summary                      | text                     |           | not null | 
published                    | timestamp with time zone |           |          | 
modified                     | timestamp with time zone |           |          | 
assigner                     | character varying(128)   |           |          | 
vulnerable_products          | character varying(250)[] |           |          | 
cvss                         | double precision         |           |          | 
cvss_time                    | timestamp with time zone |           |          | 
cvss_vector                  | character varying(250)   |           |          | 
access                       | jsonb                    |           | not null | 
impact                       | jsonb                    |           | not null | 
score                        | integer                  |           | not null | 
is_exploitable               | boolean                  |           | not null | 
is_confirmed                 | boolean                  |           | not null | 
is_in_the_news               | boolean                  |           | not null | 
is_in_the_wild               | boolean                  |           | not null | 
reflinks                     | jsonb                    |           | not null | 
reflinkids                   | jsonb                    |           | not null | 
created_at                   | timestamp with time zone |           |          | 
history_id                   | integer                  |           | not null | nextval('vulns_historicalvuln_history_id_seq'::regclass)
history_date                 | timestamp with time zone |           | not null | 
history_change_reason        | character varying(100)   |           |          | 
history_type                 | character varying(1)     |           | not null |
Indexes:
"vulns_historicalvuln_pkey" PRIMARY KEY, btree (history_id)
"btree_varchar" btree (history_type varchar_pattern_ops)
"vulns_historicalvuln_cve_id_850876bb" btree (cve_id)
"vulns_historicalvuln_cwe_id_2013d697" btree (cwe_id)
"vulns_historicalvuln_history_user_id_9e25ebf5" btree (history_user_id)
"vulns_historicalvuln_id_773f2af7" btree (id)
--- TRUNCATE
Foreign-key constraints:
"vulns_historicalvuln_history_user_id_9e25ebf5_fk_custusers" FOREIGN KEY (history_user_id) REFERENCES custusers_user(id) DEFERRABLE INITIALLY DEFERRED

查询示例:选择*FROM vulns_historicalvuln WHERE history_type<gt;'+'order by id desc仅获取前10000行->30s无缓存

查询计划

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=0.43..31878.33 rows=10000 width=1736) (actual time=0.173..32839.474 rows=10000 loops=1)
->  Index Scan Backward using vulns_historicalvuln_id_773f2af7 on vulns_historicalvuln  (cost=0.43..26346955.92 rows=8264960 width=1736) (actual time=0.172..32830.958 rows=10000 loops=1)
Filter: ((history_type)::text <> '+'::text)
Rows Removed by Filter: 296
Planning time: 19.514 ms
Execution time: 32845.015 ms

SELECT DISTINCT";vulns"id"vulns"uuid"vulns"feedid"vulns"cve_ id"vulns"cveid"vulns""概述"vulns"公开"vulns""修改"vulns""分配器"vulns"cwe_ id"vulns"vulnerable_packages_versions"vulns"脆弱_产品"vulns"vulnerable_product_versions"vulns"cvss"vulns""cvss_time"vulns"cvss_ version"vulns""cvss_vector"vulns""cvss_ metrics"vulns"访问"vulns"冲击"vulns"cvss3"vulns"cvss3_ vector"vulns"cvss3_版本"vulns"cvss3_ metrics"vulns"得分"vulns"is_exploitable"vulns"is_confirmed"vulns"is_in_the_news"vulns"is_in_the_wild"vulns"reflinks"vulns"reflinkids"vulns"created_ at"vulns"updated_ at"vulns"id";AS";剥削计数";,假AS";被监视"42’AS";org";FROM";vulns";其中("vulns"."score">=0 AND"vulns"."score"<=100(ORDER BY";vulns"updated_ at";DESC极限10


查询计划

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit  (cost=315191.32..315192.17 rows=10 width=1691) (actual time=3013.964..3013.990 rows=10 loops=1)
->  Unique  (cost=315191.32..329642.42 rows=170013 width=1691) (actual time=3013.962..3013.986 rows=10 loops=1)
->  Sort  (cost=315191.32..315616.35 rows=170013 width=1691) (actual time=3013.961..3013.970 rows=10 loops=1)
Sort Key: updated_at DESC, id, uuid, feedid, cve_id, cveid, summary, published, modified, assigner, cwe_id, vulnerable_packages_versions, vulnerable_products, vulnerable_product_versions, cvss, cvss_time, cvss_version, cvss_vector, cvss_metrics, access, impact, cvss3, cvss3_vector, cvss3_version, cvss3_metrics, score, is_exploitable, is_confirmed, is_in_the_news, is_in_the_wild, reflinks, reflinkids, created_at
Sort Method: external merge  Disk: 277648kB
->  Seq Scan on vulns  (cost=0.00..50542.19 rows=170013 width=1691) (actual time=0.044..836.597 rows=169846 loops=1)
Filter: ((score >= 0) AND (score <= 100))
Planning time: 3.183 ms
Execution time: 3070.346 ms

我创建了一个btree varchar索引btree_varchar" btree (history_type varchar_pattern_ops),如下所示:在漏洞上同时创建索引btree_varchar(history_type varchar_pattern_ops(;

我还为我的第二个查询创建了一个vulns分数索引:在vulns上同时创建索引(score(;

我读了很多关于慢速查询和索引的文章和文档。我确信这是关于慢速查询的解决方案,但Postgresql的查询计划没有使用我创建的索引。它估计使用seq扫描比使用索引处理得更快。。。

SELECT relname, indexrelname, idx_scan FROM   pg_catalog.pg_stat_user_indexes;
relname               |                          indexrelname                           |  idx_scan  
-------------------------------------+-----------------------------------------------------------------+------------
vulns_historicalvuln                | btree_varchar                                                   |          0

你能告诉我我的索引设计得好吗?我如何调试这个,如果需要,可以随时询问更多信息。

感谢

经过一些研究,我明白索引并不能解决我的问题。此字段的低有序性(重复值(使索引变得无用。

这里的查询postgresql的时间是正常的,因为匹配了30M行。我结束这个问题是因为这里的索引没有问题。

最新更新