Postgres: upsert a row and update a primary key column



Suppose I have two tables in my Postgres database:

create table transactions
(
id bigint primary key,
doc_id bigint not null,
-- lots of other columns...
amount numeric not null
);
-- same columns
create temporary table updated_transactions
(
id bigint primary key,
doc_id bigint not null,
-- lots of other columns...
amount numeric not null
);

Both tables have only a primary key and no unique indexes.

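I need to upsert rows from updated_transactions into transactions using the following rules:

  • the id column values in transactions and updated_transactions don't match
  • the other columns, such as doc_id (everything except amount), should match
  • when a matching row is found, update both the amount and id columns
  • when no matching row is found, insert it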

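The id values in updated_transactions are taken from a sequence. A business object just populates updated_transactions and then merges the new or updated rows from it into transactions using an upsert query. So my old, unchanged transactions keep their ids intact, while the updated ones are assigned new ids.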

In MSSQL and Oracle, this would be a merge statement similar to the following:

merge into transactions t
using updated_transactions ut on t.doc_id = ut.doc_id, ...
when matched then
update set t.id = ut.id, t.amount = ut.amount
when not matched then
insert (t.id, t.doc_id, ..., t.amount)
values (ut.id, ut.doc_id, ..., ut.amount);

In PostgreSQL, I suppose it should be something like this:

insert into transactions(id, doc_id, ..., amount)
select coalesce(t.id, ut.id), ut.doc_id, ... ut.amount
from updated_transactions ut
left join transactions t on t.doc_id = ut.doc_id, ....
on conflict
on constraint transactions_pkey
do update
set amount = excluded.amount, id = excluded.id

The problem is with the do update clause: excluded.id is the old value from the transactions table, whereas I need the new value from updated_transactions.

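The do update clause has no access to the ut.id value; the only thing I can use is the excluded row. But the excluded row only holds the result of the coalesce(t.id, ut.id) expression, which returns the old id value for existing rows.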
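To make the problem concrete, the behaviour can be reproduced with a minimal sketch. It assumes, purely for illustration, that doc_id is the only matching column and uses made-up sample values; the real tables have more columns.

```sql
-- Hypothetical cut-down tables: doc_id is assumed to be the only matching column.
create temp table transactions
(
    id     bigint primary key,
    doc_id bigint not null,
    amount numeric not null
);
create temp table updated_transactions
(
    id     bigint primary key,
    doc_id bigint not null,
    amount numeric not null
);

insert into transactions values (1, 100, 10);
insert into updated_transactions values (6, 100, 11);

-- The row proposed for insertion is (1, 100, 11), because coalesce picks t.id.
-- It conflicts on the primary key, so excluded.id is 1, not 6.
insert into transactions (id, doc_id, amount)
select coalesce(t.id, ut.id), ut.doc_id, ut.amount
from updated_transactions ut
left join transactions t on t.doc_id = ut.doc_id
on conflict on constraint transactions_pkey
do update
set amount = excluded.amount, id = excluded.id;

-- The amount becomes 11, but the row keeps id = 1; the new id 6 is lost.
select * from transactions;
```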

Is it possible to update both the id and amount columns using an upsert query?

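Create a unique index on the columns you use as a key, and pass its name in the upsert expression so that it is used instead of the pkey. Then, if no match is found, the row is inserted with the id from updated_transactions. If a match is found, you can use excluded.id to get the id from updated_transactions.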

I think the left join transactions is redundant.

So it would look something like this:

insert into transactions(id, doc_id, ..., amount)
select ut.id, ut.doc_id, ... ut.amount
from updated_transactions ut
on conflict
on constraint transactions_multi_column_unique_index
do update
set amount = excluded.amount, id = excluded.id
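A more concrete version of this idea might look like the sketch below. It assumes, for illustration only, that doc_id alone is the matching key; the constraint name transactions_doc_id_key is hypothetical. One detail worth knowing: on conflict on constraint accepts only a constraint name, so an index made with a plain create unique index has to be targeted by listing its columns instead, as in on conflict (doc_id).

```sql
-- Assumption: doc_id alone identifies a transaction (the real key has more columns).
-- transactions_doc_id_key is a hypothetical constraint name.
alter table transactions
    add constraint transactions_doc_id_key unique (doc_id);

insert into transactions (id, doc_id, amount)
select ut.id, ut.doc_id, ut.amount
from updated_transactions ut
on conflict on constraint transactions_doc_id_key
-- equivalent conflict target using column inference: on conflict (doc_id)
do update
set id = excluded.id,          -- excluded.id is now the new id from updated_transactions
    amount = excluded.amount;
```

Only the named constraint acts as the arbiter here. If a new id happened to collide with the id of some other existing row, the statement would still fail with a primary key violation; with ids drawn from a sequence that should not normally happen.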

It looks like the task can be accomplished using writable CTEs instead of a plain upsert.

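First I'll post a simpler version of the query that answers the original question. This solution assumes that the doc_id, unit_id columns form a candidate key, but it doesn't require a unique index on these columns.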

Test data:

create temp table transactions
(
id bigint primary key,
doc_id bigint,
unit_id bigint,
amount numeric
);
create temp table updated_transactions
(
id bigint primary key,
doc_id bigint,
unit_id bigint,
amount numeric
); 
insert into transactions(id, doc_id, unit_id, amount)
values (1, 1, 1, 10), (2, 1, 2, 15), (3, 1, 3, 10);
insert into updated_transactions(id, doc_id, unit_id, amount)
values (6, 1, 1, 11), (7, 1, 2, 15), (8, 1, 4, 20); 

The query to merge updated_transactions into transactions:

with new_values as
(
select ut.id new_id, t.id old_id, ut.doc_id, ut.unit_id, ut.amount 
from updated_transactions ut
left join transactions t 
on t.doc_id = ut.doc_id and t.unit_id = ut.unit_id
),
updated as
(
update transactions tr
set id = nv.new_id, amount = nv.amount
from new_values nv
where id = nv.old_id
returning tr.*
)
insert into transactions(id, doc_id, unit_id, amount)
select ut.new_id, ut.doc_id, ut.unit_id, ut.amount
from new_values ut
where ut.new_id not in (select id from updated);

Result:

select * from transactions
-- id | doc_id | unit_id | amount
------+--------+---------+-------
--  3 |   1    |    3    |  10    -- not changed
--  6 |   1    |    1    |  11    -- updated
--  7 |   1    |    2    |  15    -- updated 
--  8 |   1    |    4    |  20    -- inserted

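In my real application, doc_id, unit_id are not always unique, so they don't represent a candidate key. To match rows, I also take into account a row number, computed over rows ordered by their ids. This is my second solution.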

Test data:

-- the tables are the same as above
insert into transactions(id, doc_id, unit_id, amount)
values (1, 1, 1, 10), (2, 1, 1, 15), (3, 1, 3, 10);
insert into updated_transactions(id, doc_id, unit_id, amount)
values (6, 1, 1, 11), (7, 1, 1, 15), (8, 1, 4, 20); 
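The row-number matching described above can be previewed on this test data before running the merge. The query below is only an inspection sketch (the src label column is added for readability); it numbers the rows of both tables within each (doc_id, unit_id) group, ordered by id, which is the same numbering the merge relies on.

```sql
-- Rows with equal (doc_id, unit_id, row_num) in the two tables are treated as a match.
select 'transactions' as src, id, doc_id, unit_id,
       row_number() over (partition by doc_id, unit_id order by id) as row_num
from transactions
union all
select 'updated_transactions', id, doc_id, unit_id,
       row_number() over (partition by doc_id, unit_id order by id)
from updated_transactions
order by src, id;
-- src                  | id | doc_id | unit_id | row_num
-- transactions         |  1 |   1    |    1    |    1
-- transactions         |  2 |   1    |    1    |    2
-- transactions         |  3 |   1    |    3    |    1
-- updated_transactions |  6 |   1    |    1    |    1
-- updated_transactions |  7 |   1    |    1    |    2
-- updated_transactions |  8 |   1    |    4    |    1
```

So id 1 pairs with 6 and id 2 pairs with 7, while 8 has no counterpart and is inserted, and 3 is left alone.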

Merge query:

with trans as
(
select id, doc_id, unit_id, amount,
row_number() over(partition by doc_id, unit_id order by id) row_num
from transactions
),
updated_trans as
(
select id, doc_id, unit_id, amount,
row_number() over(partition by doc_id, unit_id order by id) row_num
from updated_transactions
),
new_values as
(
select ut.id new_id, t.id old_id, ut.doc_id, ut.unit_id, ut.amount 
from updated_trans ut
left join trans t 
on t.doc_id = ut.doc_id and t.unit_id = ut.unit_id and t.row_num = ut.row_num
),
updated as
(
update transactions tr
set id = nv.new_id, amount = nv.amount
from new_values nv
where id = nv.old_id
returning tr.*
)
insert into transactions(id, doc_id, unit_id, amount)
select ut.new_id, ut.doc_id, ut.unit_id, ut.amount
from new_values ut
where ut.new_id not in (select id from updated);

Result:

select * from transactions;
-- id | doc_id | unit_id | amount
------+--------+---------+-------
--  3 |   1    |    3    | 10     -- not changed
--  6 |   1    |    1    | 11     -- updated
--  7 |   1    |    1    | 15     -- updated
--  8 |   1    |    4    | 20     -- inserted

