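A table in my PostgreSQL database contains duplicate values. I want to add a constraint to prevent this from happening, but I cannot remove the duplicates because the delete query takes too long to run. This is the query I am trying to run: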
delete from table
where id in
(
    SELECT id
    FROM
    (
        SELECT id,
               ROW_NUMBER() OVER( PARTITION BY a.user_id, a.timestamp_utc ORDER BY id ) AS row_num
        FROM table a
    ) t
    WHERE t.row_num > 1
)
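The timestamp_utc field is indexed, but the query never finishes running.

First, I suggest creating a composite index on the two fields together (user_id, timestamp_utc):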
CREATE INDEX table_idx ON table USING btree (user_id, timestamp_utc);
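Before deleting anything, it may help to see how many duplicates there actually are, since that gives a rough idea of how much work the delete has to do. The following is only a sketch: my_table is a hypothetical stand-in for the real table name (the word table is reserved in PostgreSQL and would need quoting), and the column names are the ones from the question.

```sql
-- my_table is a hypothetical placeholder for the real table name.

-- Show the 20 most heavily duplicated (user_id, timestamp_utc) pairs.
SELECT user_id, timestamp_utc, count(*) AS copies
FROM my_table
GROUP BY user_id, timestamp_utc
HAVING count(*) > 1
ORDER BY copies DESC
LIMIT 20;

-- Total number of rows a de-duplicating delete would remove:
-- every group keeps one row, so each group contributes (copies - 1).
SELECT coalesce(sum(copies - 1), 0) AS rows_to_delete
FROM (
    SELECT count(*) AS copies
    FROM my_table
    GROUP BY user_id, timestamp_utc
    HAVING count(*) > 1
) d;
```

If rows_to_delete turns out to be a large share of the table, the delete will be slow regardless of how the query is written.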
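I wrote several different queries for you; one of them may suit your needs. Once all the duplicates have been removed, you can add a unique index on these fields so that duplicate rows can no longer be inserted. Having to delete duplicates every time is not the right approach.
Example queries (table stands for your actual table name):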
-- Variant 1: ROW_NUMBER() with WHERE id IN (...)
delete from table
where id in
(
    SELECT id
    FROM
    (
        SELECT id,
               ROW_NUMBER() OVER( PARTITION BY a.user_id, a.timestamp_utc ORDER BY id ) AS row_num
        FROM table a
    ) t
    WHERE t.row_num > 1
);

-- Variant 2: ROW_NUMBER() with DELETE ... USING (join instead of IN)
delete from table a1
using
(
    SELECT id
    FROM
    (
        SELECT id,
               ROW_NUMBER() OVER( PARTITION BY a.user_id, a.timestamp_utc ORDER BY id ) AS row_num
        FROM table a
    ) t
    WHERE t.row_num > 1
) a2
where a1.id = a2.id;
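
-- Variant 3: keep min(id) per group, delete the rest, with WHERE id IN (...)
-- b holds the smallest id of every (user_id, timestamp_utc) group, i.e. the row to keep.
-- It must cover all groups, including those without duplicates; otherwise
-- rows that were never duplicated would not match b and would be deleted too.
delete from table where id in (
    select a.id from table a
    left join
    (
        select min(id) as id from table
        group by user_id, timestamp_utc
    ) b on a.id = b.id
    where b.id is null
);

-- Variant 4: same idea as variant 3, with DELETE ... USING
delete from table a1
using
(
    select a.id from table a
    left join
    (
        select min(id) as id from table
        group by user_id, timestamp_utc
    ) b on a.id = b.id
    where b.id is null
) b1
where a1.id = b1.id;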
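
Once one of the deletes has finished, the uniqueness rule mentioned above could be added roughly as follows. This is a sketch under assumptions: my_table is a hypothetical placeholder for the real table name, and the index and constraint names are made up for illustration. Use one of the two options, not both.

```sql
-- my_table and the object names below are hypothetical placeholders.

-- Option A: a unique index on the pair of columns.
CREATE UNIQUE INDEX my_table_user_ts_uniq
    ON my_table (user_id, timestamp_utc);

-- Option B: a named unique constraint
-- (PostgreSQL creates a unique index behind it automatically).
ALTER TABLE my_table
    ADD CONSTRAINT my_table_user_ts_key UNIQUE (user_id, timestamp_utc);
```

Either statement fails with an error if any duplicate (user_id, timestamp_utc) pair is still present, so it also serves as a check that the cleanup is complete. Keep in mind that, by default, rows with NULL in either column are not treated as duplicates of each other. After the unique index exists, the plain index created earlier on the same two columns is redundant and can be dropped.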