PostgreSQL update query takes a very long time to complete every time

I have a PostgreSQL table, user_book_details, with 451007 records. About 1K new records are added to it every day.

I have the following query, which takes a very long time (13 hours) to complete every time it runs.

update user_book_details as A1 set min_date=
(select min(A2.acc_date) as min_date from user_book_details A2 where A2.user_id=A1.user_id 
and A2.book_id=A1.book_id) where A1.min_date is null;

How can I rewrite the query to improve performance? FYI, there are no indexes on the user_id and book_id columns.

Your query is fine:

update user_book_details ubd
set min_date = (select min(ubd2.acc_date) 
from user_book_details ubd2
where ubd2.user_id = ubd.user_id and
ubd2.book_id = ubd.book_id
)
where ubd.min_date is null;

For performance, you want an index on user_book_details(user_id, book_id). I also think it would be faster written like this:

update user_book_details ubd
set min_date = min_acc_date
from (select ubd2.user_id, ubd2.book_id, min(ubd2.acc_date) as min_acc_date
from user_book_details ubd2
group by ubd2.user_id, ubd2.book_id
) ubd2
where ubd2.user_id = ubd.user_id and
ubd2.book_id = ubd.book_id and
ubd.min_date is null;

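The first method uses the index to look up the value for each row (something that might be a little complicated when updating the same table the subquery reads from). The second method aggregates the data once and then joins the aggregated values back in.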
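As a minimal sketch, the index recommended above could be created as follows. The index name idx_ubd_user_book is hypothetical. The commented-out variant with acc_date as a trailing column is an assumption of this sketch, not part of the original suggestion; it may let PostgreSQL resolve each min(acc_date) lookup from the index itself. Plain explain only shows the plan and does not run the update.

```sql
-- Composite index on the two columns used to correlate the subquery.
-- The index name is hypothetical; pick one that fits your naming scheme.
create index idx_ubd_user_book
    on user_book_details (user_id, book_id);

-- Optional variant (an assumption, not from the answer above):
-- a trailing acc_date column may allow the min() to be read from the index.
-- create index idx_ubd_user_book_acc
--     on user_book_details (user_id, book_id, acc_date);

-- Refresh planner statistics after building the index.
analyze user_book_details;

-- Show the plan for the correlated-subquery version without executing it.
explain
update user_book_details ubd
set min_date = (select min(ubd2.acc_date)
                from user_book_details ubd2
                where ubd2.user_id = ubd.user_id and
                      ubd2.book_id = ubd.book_id
               )
where ubd.min_date is null;
```

If the plan still shows a sequential scan on ubd2 inside the subquery, the index is not being used for the lookup, and the aggregate-and-join form is probably the better choice.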

I should note that this value is easy to calculate on the fly:

select ubd.*,
min(acc_date) over (partition by user_id, book_id) as min_acc_date
from user_book_details ubd;

This may be preferable to trying to keep the value up to date in the table.
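One way to make the on-the-fly calculation convenient is to wrap it in a view, so callers never depend on the stored min_date column. This is only a sketch; the view name user_book_details_with_min is hypothetical and does not appear in the original question.

```sql
-- Hypothetical view that exposes the per-(user_id, book_id) minimum
-- acc_date next to every row, computed at query time.
create view user_book_details_with_min as
select ubd.*,
       min(ubd.acc_date) over (partition by ubd.user_id, ubd.book_id) as min_acc_date
from user_book_details ubd;

-- Example usage: the literal 42 is a placeholder user id.
select *
from user_book_details_with_min
where user_id = 42;
```

With this approach the nightly update is no longer needed, at the cost of computing the window function whenever the view is queried.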
