How to delete duplicates when the query times out



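A table in my PostgreSQL database contains duplicate values. I want to add a constraint to prevent this from happening, but I can't remove the duplicates because the delete query takes too long to run. The query I am trying to run: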

delete from table
where id in
(
    SELECT id
    FROM
    (
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY a.user_id, a.timestamp_utc ORDER BY id) AS row_num
        FROM table a
    ) t
    WHERE t.row_num > 1
);

The timestamp_utc field is already indexed, but the query never finishes running.

First, I recommend creating an index on the two fields together (user_id, timestamp_utc):

CREATE INDEX table_idx ON table USING btree (user_id, timestamp_utc);

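I wrote a few different queries for you; you may find them useful. Once all the duplicates are deleted, you can add a unique index on these fields so that duplicate data can no longer be inserted. Having to delete duplicates every time is not the right approach.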
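As a rough sketch of that final step, once no duplicates remain the unique index could be created as follows. The index and constraint names are hypothetical, and `my_table` stands in for the `table` placeholder used in this post (`table` is a reserved word in PostgreSQL and would need quoting).

```sql
-- Optional check: this should return no rows once the cleanup is complete.
SELECT user_id, timestamp_utc, count(*)
FROM my_table
GROUP BY user_id, timestamp_utc
HAVING count(*) > 1;

-- Build the unique index without blocking writes.
-- CONCURRENTLY cannot be used inside a transaction block.
CREATE UNIQUE INDEX CONCURRENTLY my_table_user_ts_uidx
    ON my_table (user_id, timestamp_utc);

-- Optionally attach the index to the table as a named constraint.
ALTER TABLE my_table
    ADD CONSTRAINT my_table_user_ts_key
    UNIQUE USING INDEX my_table_user_ts_uidx;
```

If duplicates are still present, the `CREATE UNIQUE INDEX` statement fails, so it also serves as a final confirmation that the cleanup worked. After it succeeds, the plain index on the same columns is redundant and can be dropped.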

Example queries:

delete from table
where id in
(
    SELECT id
    FROM
    (
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY a.user_id, a.timestamp_utc ORDER BY id) AS row_num
        FROM table a
    ) t
    WHERE t.row_num > 1
);

delete from table a1
using
(
    SELECT id
    FROM
    (
        SELECT id,
               ROW_NUMBER() OVER (PARTITION BY a.user_id, a.timestamp_utc ORDER BY id) AS row_num
        FROM table a
    ) t
    WHERE t.row_num > 1
) a2
where a1.id = a2.id;

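-- Keep the row with the smallest id in every (user_id, timestamp_utc) group.
-- No "having count(*) > 1" here: with it, rows that have no duplicates
-- would be missing from b and would be deleted as well.
delete from table where id in (
    select a.id from table a
    left join
    (
        select min(id) as id from table
        group by user_id, timestamp_utc
    ) b on a.id = b.id
    where b.id is null
);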

-- Same logic written with USING. Again there is no "having count(*) > 1",
-- so rows without duplicates are kept.
delete from table a1
using
(
    select a.id from table a
    left join
    (
        select min(id) as id from table
        group by user_id, timestamp_utc
    ) b on a.id = b.id
    where b.id is null
) b1
where a1.id = b1.id;
