我有一个较慢的Postgres查询,可能需要更好的索引。但我缺乏索引的经验,所以我只是索引外键。我使用Postgres14.3和Django3.0。
这是一个缓慢的查询。只有当聊天室里有很多信息(比如10k以上(时,它才会很慢。
SELECT
chat_message.info_date_str,
auth_user.username,
chat_message.id,
chat_message.text_msg,
chat_message.type_message,
chat_message.url_file,
chat_message.url_file,
chat_message.was_removed,
chat_message.was_removed
FROM chat_message
INNER JOIN user_profile_userprofile
ON chat_message.owner_id = user_profile_userprofile.id
INNER JOIN auth_user
ON user_profile_userprofile.user_id = auth_user.id
WHERE chat_message.chat_room_id = 7204712
AND NOT chat_message.delete_by::text LIKE '%fattimita08%'
ORDER BY chat_message.id DESC
LIMIT 20
持续时间:11956.422
相关表格的定义:
CREATE TABLE public.chat_message (
id integer NOT NULL DEFAULT nextval('chat_message_id_seq'::regclass)
, chat_room_id integer NOT NULL
, owner_id integer NOT NULL
, info_date timestamp with time zone NOT NULL
, aud_msg character varying(500) NOT NULL
, vid_msg character varying(500) NOT NULL
, img_msg character varying(500) NOT NULL
, text_msg text NOT NULL
, type_message smallint NOT NULL
, delete_by character varying(50) NOT NULL
, url_file character varying(1000) NOT NULL
, receiver_id integer
, info_date_str character varying(200) NOT NULL DEFAULT '2019-11-28 23:43:44.578419'
, was_removed boolean NOT NULL DEFAULT false
, CONSTRAINT chat_message_pkey PRIMARY KEY (id)
, CONSTRAINT chat_messag_receiver_id_0eceddde_fk_user_profile_userprofile_id FOREIGN KEY (receiver_id) REFERENCES public.user_profile_userprofile (id) MATCH SIMPLE ON UPDATE NO ACTION ON DELETE NO ACTION DEFERRABLE INITIALLY DEFERRED
, CONSTRAINT chat_message_chat_id_fkey FOREIGN KEY (chat_room_id) REFERENCES public.chat_chat (id) MATCH SIMPLE ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED
, CONSTRAINT chat_message_owner_id_fkey FOREIGN KEY (owner_id) REFERENCES public.user_profile_userprofile (id) MATCH SIMPLE ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED
);
CREATE INDEX chat_room_id_idx ON public.chat_message (chat_room_id);
CREATE INDEX owner_id_idx ON public.chat_message (owner_id);
CREATE INDEX receiver_id_idx ON public.chat_message (receiver_id);
CREATE TABLE public.user_profile_userprofile (
id integer NOT NULL DEFAULT nextval('user_profile_userprofile_id_seq'::regclass)
, user_id integer NOT NULL
, birthdate date
, profile_picture character varying(500) NOT NULL
, country character varying(500) NOT NULL
, phrase character varying(300) NOT NULL
, is_mod boolean NOT NULL
, ip inet, id_cel character varying(500) NOT NULL DEFAULT ''
, twitter_key character varying(500) DEFAULT ''
, twitter_secret character varying(500) DEFAULT ''
, gender character varying(500) NOT NULL DEFAULT ''
, lastseen character varying(500) NOT NULL DEFAULT ''
, profile_video character varying(500) NOT NULL DEFAULT ''
, status character varying(500) NOT NULL DEFAULT ''
, name character varying(500) NOT NULL
, "id_timeZone" character varying(500) NOT NULL
, profile_picture_thumb character varying(500) NOT NULL
, city character varying(500) NOT NULL
, CONSTRAINT user_profile_userprofile_pkey PRIMARY KEY (id)
, CONSTRAINT user_profile_userprofile_user_id_key UNIQUE (user_id)
, CONSTRAINT cascade_user FOREIGN KEY (user_id) REFERENCES public.auth_user (id) MATCH SIMPLE ON UPDATE CASCADE ON DELETE CASCADE
);
CREATE UNIQUE INDEX userprofile_user_idx ON public.user_profile_userprofile (user_id);
CREATE TABLE public.auth_user (
id integer NOT NULL DEFAULT nextval('auth_user_id_seq'::regclass)
, password character varying(128) NOT NULL
, last_login timestamp with time zone
, is_superuser boolean NOT NULL
, username character varying(150) NOT NULL
, first_name character varying(30) NOT NULL
, last_name character varying(30) NOT NULL
, email character varying(254) NOT NULL
, is_staff boolean NOT NULL
, is_active boolean NOT NULL
, date_joined timestamp with time zone NOT NULL
, CONSTRAINT auth_user_pkey PRIMARY KEY (id)
, CONSTRAINT auth_user_username_key UNIQUE (username)
);
CREATE INDEX auth_user_username_like ON public.auth_user (username varchar_pattern_ops);
CREATE INDEX username_idx ON public.auth_user (username);
我用这个查询来分析:
EXPLAIN ANALYZE SELECT "chat_message"."info_date_str", "auth_user"."username", "chat_message"."id", "chat_message"."text_msg", "chat_message"."type_message", "chat_message"."url_file", "chat_message"."url_file", "chat_message"."was_removed", "chat_message"."was_removed" FROM "chat_message" INNER JOIN "user_profile_userprofile" ON ("chat_message"."owner_id" = "user_profile_userprofile"."id") INNER JOIN "auth_user" ON ("user_profile_userprofile"."user_id" = "auth_user"."id") WHERE ("chat_message"."chat_room_id" = 7204712 AND NOT ("chat_message"."delete_by"::text LIKE '%fattimita08%')) ORDER BY "chat_message"."id" DESC LIMIT 20
QUERY PLAN
Limit (cost=33503.90..33614.03 rows=20 width=87) (actual time=5753.937..5754.447 rows=20 loops=1)
-> Nested Loop (cost=33503.90..85612.86 rows=9463 width=87) (actual time=5753.935..5754.443 rows=20 loops=1)
-> Nested Loop (cost=33503.48..63940.73 rows=9463 width=80) (actual time=5753.923..5754.399 rows=20 loops=1)
-> Gather Merge (cost=33503.06..34605.18 rows=9463 width=80) (actual time=5753.893..5754.349 rows=20 loops=1)
Workers Planned: 2
Workers Launched: 0
-> Sort (cost=32503.03..32512.89 rows=3943 width=80) (actual time=5753.535..5753.540 rows=20 loops=1)
Sort Key: chat_message.id DESC
Sort Method: quicksort Memory: 2477kB
-> Parallel Bitmap Heap Scan on chat_message (cost=129.77..32267.53 rows=3943 width=80) (actual time=81.934..5741.308 rows=10447 loops=1)
Recheck Cond: (chat_room_id = 7204712)
Filter: ((delete_by)::text !~~ '%fattimita08%'::text)
Rows Removed by Filter: 1
Heap Blocks: exact=8278
-> Bitmap Index Scan on chat_room_id_idx (cost=0.00..127.41 rows=9463 width=0) (actual time=63.921..63.922 rows=10448 loops=1)
Index Cond: (chat_room_id = 7204712)
-> Memoize (cost=0.43..4.95 rows=1 width=8) (actual time=0.002..0.002 rows=1 loops=20)
Cache Key: chat_message.owner_id
Cache Mode: logical
Hits: 18 Misses: 2 Evictions: 0 Overflows: 0 Memory Usage: 1kB
-> Index Scan using user_profile_userprofile_pkey on user_profile_userprofile (cost=0.42..4.94 rows=1 width=8) (actual time=0.012..0.012 rows=1 loops=2)
Index Cond: (id = chat_message.owner_id)
-> Index Scan using auth_user_pkey on auth_user (cost=0.42..2.29 rows=1 width=13) (actual time=0.002..0.002 rows=1 loops=20)
Index Cond: (id = user_profile_userprofile.user_id)
Planning Time: 43.104 ms
Execution Time: 5754.630 ms
此查询的主要困难在于它按chat_room_id
进行筛选,但按id
(降序(进行排序-在非常小的LIMIT
之前。
Postgres希望过滤器chat_room_id = 7204712
是选择性的,并从chat_room_id_idx
上的位图堆扫描开始,找到大约10k行。
附加滤波器CCD_ 6似乎几乎没有选择性。(顺便说一句,该表达式中的强制转换(::text
(没有任何用处,请删除它。(
这个多列索引应该会给你更快的结果,按数量级排列:
CREATE INDEX chat_message_your_name_here_idx ON public.chat_message (chat_room_id, id DESC);
现在,Postgres可以通过索引扫描遍历索引,并过滤少数不符合条件的行。
相关:
- 使用小LIMIT优化查询,一列上有谓词,另一列按顺序排列
也就是说,Parallel Bitmap Heap Scan on chat_message
承担了大部分成本——仅rows=10447
的成本似乎过高。那张桌子特别大吗?你有像样的硬件吗?也许桌子太胀了?匹配的行是否实际分布在大桌子上?根据实际诊断,新索引上的VACUUM
,甚至VACUUM FULL
,或者更好的CLUSTER
可能会有所帮助(很多(。类似:
CLUSTER public.chat_message USING chat_message_your_name_here_idx;
请注意,对于一张大桌子来说,这既昂贵又阻塞。有非阻塞性的替代方案。参见:
优化Postgres对时间戳范围的查询
PostgreSQL索引未用于查询IP范围
在一系列时间戳(两列(上优化查询