PostgreSQL deadlock when updating only a single row



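Today our production database (Aurora PostgreSQL 9.6.3) hit a deadlock in which multiple processes were trying to run the same UPDATE query against a single row. We thought deadlocks could only happen when multiple rows are updated in an inconsistent order, so this came as a surprise; yet it did happen, during the busiest part of our day.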

Here is the transaction in our Python code that contains the UPDATE statement (it is a poor man's UPSERT):

with self.connection.cursor() as cursor:
    cursor.execute("""UPDATE students SET name = %s WHERE uuid = %s AND activity_id = %s""", (name, uuid, activityId))
    if cursor.rowcount <= 0:
        cursor.execute("""INSERT INTO students (name, uuid, activity_id) VALUES (%s, %s, %s)""", (name, uuid, activityId))
    if cursor.rowcount <= 0:
        self.connection.rollback()
        raise BaseDao.NotUpserted("No student name was updated or inserted for activity_id %d and uuid %s" % (activityId, uuid))
    else:
        self.connection.commit()

Here are some relevant lines from the log, including the simple query that updates only one row:

...
2018-01-19 16:21:27 UTC:[38161]:ERROR:  deadlock detected
2018-01-19 16:21:27 UTC:[38161]:DETAIL:  Process 38161 waits for ShareLock on transaction 90490253; blocked by process 25147.
    Process 25147 waits for ShareLock on transaction 90490267; blocked by process 38161.
    Process 38161: UPDATE students SET name = 'foobar' WHERE uuid = 'ca1b2d153cbdc9574cce' AND activity_id = 35473237
    Process 25147: UPDATE students SET name = 'foobar' WHERE uuid = 'ca1b2d153cbdc9574cce' AND activity_id = 35473237
...

Here are the two relevant tables:

db=> \d students
                                        Table "public.students"
    Column   |          Type          |                             Modifiers                             
-------------+------------------------+-------------------------------------------------------------------
 id          | integer                | not null default nextval('students_id_seq'::regclass)
 name        | character varying(128) | not null
 uuid        | character varying(40)  | not null
 activity_id | integer                | not null
Indexes:
    "students_pkey" PRIMARY KEY, btree (id)
    "students_activity_id" btree (activity_id)
Foreign-key constraints:
    "activity_id_refs_id_76c08098" FOREIGN KEY (activity_id) REFERENCES activities(id) DEFERRABLE INITIALLY DEFERRED
db=> \d activities
                                       Table "public.activities"
      Column       |           Type           |                              Modifiers                               
-------------------+--------------------------+----------------------------------------------------------------------
 id                | integer                  | not null default nextval('activities_id_seq'::regclass)
 start_time        | timestamp with time zone | not null
 end_time          | timestamp with time zone | 
 activity_type     | character varying(2)     | not null
 activity_id       | integer                  | not null
 started_by_id     | integer                  | not null
 activity_state    | integer                  | not null
 legacy_id         | integer                  | 
 hide_report       | boolean                  | not null
 report_status     | integer                  | 
 students_finished | text                     | not null
 room_name         | text                     | 
 last_updated      | timestamp with time zone | 
 state             | integer                  | 
Indexes:
    "activities_pkey" PRIMARY KEY, btree (id)
    "activities_end_time" btree (end_time)
    "activities_room_name_c1f9997a_like" btree (room_name text_pattern_ops)
    "activities_room_name_c1f9997a_uniq" btree (room_name)
    "activities_started_by_id" btree (started_by_id)
Foreign-key constraints:
    "started_by_id_refs_id_5ea35c7a" FOREIGN KEY (started_by_id) REFERENCES users(id) DEFERRABLE INITIALLY DEFERRED
Referenced by:
    TABLE "students" CONSTRAINT "activity_id_refs_id_76c08098" FOREIGN KEY (activity_id) REFERENCES activities(id) DEFERRABLE INITIALLY DEFERRED

How could we end up in a deadlock like this when only one row is being updated?

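I can think of two causes for this deadlock:

  1. The transaction performing the update contains multiple statements, and the other statements also take locks.

  2. Triggers are involved that take additional locks.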
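To make the first cause concrete, here is a toy model of two transactions that each hold one row lock and then request the lock the other holds. It is only an illustration under simplifying assumptions: `find_deadlock`, the lock names, and the transaction names are hypothetical, and this is not how PostgreSQL's lock manager is implemented.

```python
# Toy model of row locks: an illustration only, not PostgreSQL's lock manager.

def find_deadlock(holds, waits_for):
    """Return the transactions that form a wait cycle, or None.

    holds:     maps a lock name to the transaction currently holding it
    waits_for: maps a transaction to the lock it is blocked on
    """
    for start in waits_for:
        path = [start]
        current = start
        while current in waits_for:
            # Follow the edge: current is blocked on a lock someone holds.
            current = holds.get(waits_for[current])
            if current is None:
                break  # the requested lock is free, so nobody is blocking
            if current in path:
                return path[path.index(current):]
            path.append(current)
    return None


# Transaction A updated row 1 first and now wants row 2;
# transaction B updated row 2 first and now wants row 1.
holds = {"row 1": "A", "row 2": "B"}
waits_for = {"A": "row 2", "B": "row 1"}
cycle = find_deadlock(holds, waits_for)
print(cycle)
```

With a single statement touching a single row, one transaction simply waits for the other and no cycle forms; the cycle only appears once each side already holds something the other needs.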

Remember that deadlocks are not a bug unless they happen too often; failing to handle a deadlock is a bug. Simply retry the failed transaction.
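A retry wrapper could look roughly like the sketch below. The helper name `run_with_retry`, its parameters, and the `FakeDeadlock` stand-in are all hypothetical. If the driver is psycopg2 (an assumption, the question does not say), version 2.8 or later exposes `psycopg2.errors.DeadlockDetected`, which could be passed as `retry_on`.

```python
import time


def run_with_retry(transaction, retry_on, attempts=3, delay=0.05):
    """Call transaction() and retry it when it raises retry_on.

    transaction: hypothetical callable that runs one complete transaction,
                 commits on success and rolls back on failure
    retry_on:    exception class (or tuple of classes) that signals a deadlock
    """
    for attempt in range(1, attempts + 1):
        try:
            return transaction()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            time.sleep(delay * attempt)  # short, growing pause before retrying


class FakeDeadlock(Exception):
    """Stand-in for the driver's deadlock exception (assumption)."""


calls = []


def flaky_upsert():
    # Pretend the first two attempts are chosen as deadlock victims.
    calls.append(1)
    if len(calls) < 3:
        raise FakeDeadlock("deadlock detected")
    return "committed"


result = run_with_retry(flaky_upsert, retry_on=FakeDeadlock, delay=0)
print(result, len(calls))
```

The whole transaction has to be re-run, not just the failed statement, because the server aborts the transaction it picks as the deadlock victim.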
