Deleting invalid duplicate rows in SQL



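I have a table that stores employee check-in times, recorded by a time clock machine and keyed by username. If an employee punches in several times, there are multiple check-in records only a few seconds apart. Obviously, only the first record is valid. All the other entries are invalid and must be deleted from the table. How can I do this, given that I can select all of an employee's check-in records for the current date?

The data in the database looks like this.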

Username               Checktime                       CheckType
 HRA001            7/29/2012 8:16:44 AM                Check-In
 HRA001            7/29/2012 8:16:46 AM                Check-In
 HRA001            7/29/2012 8:16:50 AM                Check-In 
 HRA001            7/29/2012 8:16:53 AM                Check-In 

Try this:

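 ;WITH users_CTE as (
 -- rank each user's rows by time; the earliest row gets rnk = 1
 select rank() over (partition by Username order by Checktime) as rnk from users
 )
 -- deleting through the CTE removes the underlying rows from users
 DELETE FROM users_CTE where rnk <> 1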

For your second requirement, try this query:

 ;WITH users_CTE as (
 select *,rank() over (partition by Username order by Checktime) as rnk from users
 )
,CTE2 as (select Username,MIN(CheckTime) as minTime,DATEADD(mi,1,MIN(CheckTime)) as maxTime from users_CTE 
 group by Username)

-- match each row only against its own user's one-minute window (minTime, maxTime]
delete from users where Checktime in(
select c1.Checktime from users_CTE c1 left join CTE2 c2
on c1.Username = c2.Username and c1.Checktime > c2.minTime and c1.Checktime <= c2.maxTime
where c2.Username is not null and c1.Username in(
select c1.Username from users_CTE c1 left join CTE2 c2
on c1.Username = c2.Username and c1.Checktime > c2.minTime and c1.Checktime <= c2.maxTime
group by c1.Username,c2.Username 
having COUNT(*) > 1))

For your changed requirement, check the query below:

alter table users add flag varchar(2)
-- GO ends the batch (SSMS/sqlcmd) so the next statement can see the new column
GO
;WITH users_CTE as (
 select *,rank() over (partition by Username order by Checktime) as rnk from users
 )
,CTE2 as (select Username,MIN(CheckTime) as minTime,DATEADD(mi,1,MIN(CheckTime)) as maxTime from users_CTE 
 group by Username)

-- mark duplicates with flag = 'd' instead of deleting them
update u SET u.flag = 'd' from users_CTE u inner join (
select c1.Checktime from users_CTE c1 left join CTE2 c2
on c1.Username = c2.Username and c1.Checktime > c2.minTime and c1.Checktime <= c2.maxTime
where c2.Username is not null and c1.Username in(
select c1.Username from users_CTE c1 left join CTE2 c2
on c1.Username = c2.Username and c1.Checktime > c2.minTime and c1.Checktime <= c2.maxTime
group by c1.Username,c2.Username 
having COUNT(*) > 1)) a
on u.Checktime=a.Checktime

Check the latest query, which uses a delete flag:

;WITH users_CTE as 
(
 select *,row_number() over (partition by Username order by Checktime) as row from users
)
,CTE as(
select row,Username,Checktime,CheckType,0 as totalSeconds,'N' as Delflag from users_CTE where row=1 
union all
select t.row,t.Username,t.Checktime,t.CheckType,CASE WHEN (c.totalSeconds + DATEDIFF(SECOND,c.Checktime,t.Checktime))  >= 60 then 0 else (c.totalSeconds + DATEDIFF(SECOND,c.Checktime,t.Checktime)) end as totalSeconds,
CASE WHEN (c.totalSeconds + DATEDIFF(SECOND,c.Checktime,t.Checktime))  >= 60 then 'N' else 'Y' end as Delflag
--CASE WHEN c.totalSeconds <= 60  then 'Y' else 'N' end as Delflag
from users_CTE t inner join CTE c
-- step to the same user's next row only
on t.row=c.row+1 and t.Username=c.Username
)
select Username,Checktime,CheckType,Delflag from CTE
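
The flagging rule in the recursive query above can be easier to follow outside SQL. The following is only a sketch of the same running-total idea for a single user's check-ins: the function name `flag_rows` and the 60-second default are illustrative assumptions, and the timestamps are written in ISO format rather than the format shown in the question.

```python
from datetime import datetime


def flag_rows(checktimes, window_seconds=60):
    """Return (checktime, delflag) pairs for one user's check-ins.

    Mirrors the recursive CTE: seconds accumulate from row to row, and a
    row is kept ('N') only when the running total reaches the window, at
    which point the total resets to 0.
    """
    flagged = []
    total = 0
    previous = None
    for text in sorted(checktimes):  # ISO strings sort chronologically
        current = datetime.fromisoformat(text)
        if previous is None:
            flag = "N"  # the first row is always kept
        else:
            total += int((current - previous).total_seconds())
            if total >= window_seconds:
                total, flag = 0, "N"
            else:
                flag = "Y"
        flagged.append((text, flag))
        previous = current
    return flagged


flags = flag_rows([
    "2012-07-29 08:16:44",
    "2012-07-29 08:16:46",
    "2012-07-29 08:16:50",
    "2012-07-29 08:16:53",
    "2012-07-29 08:17:50",
])
for checktime, delflag in flags:
    print(checktime, delflag)
```

Rows flagged `Y` are the ones a later DELETE would remove; the last row in the sample is kept because 66 seconds have accumulated since the first one.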

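Why not validate the check-in before inserting it into the database? If a check-in already exists for this user within a given time range, do nothing; otherwise insert it.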
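
As a rough illustration of validating before the insert, here is a minimal sketch using Python's built-in `sqlite3`. The table name `checkins`, the helper `record_check_in`, and the 60-second window are assumptions made for the example, not part of the original schema.

```python
import sqlite3
from datetime import datetime, timedelta


def record_check_in(conn, username, checktime, window_seconds=60):
    """Insert a check-in unless the user already has one inside the window.

    Returns True if a row was inserted, False if it was treated as a duplicate.
    """
    lower = (checktime - timedelta(seconds=window_seconds)).isoformat(sep=" ")
    upper = checktime.isoformat(sep=" ")
    exists = conn.execute(
        "SELECT 1 FROM checkins "
        "WHERE username = ? AND checktype = 'Check-In' "
        "AND checktime BETWEEN ? AND ? LIMIT 1",
        (username, lower, upper),
    ).fetchone()
    if exists:
        return False
    conn.execute(
        "INSERT INTO checkins (username, checktime, checktype) "
        "VALUES (?, ?, 'Check-In')",
        (username, upper),
    )
    return True


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE checkins (username TEXT, checktime TEXT, checktype TEXT)"
)
stamps = [
    "2012-07-29 08:16:44",
    "2012-07-29 08:16:46",
    "2012-07-29 08:16:50",
    "2012-07-29 08:16:53",
]
results = [
    record_check_in(conn, "HRA001", datetime.fromisoformat(s)) for s in stamps
]
stored = conn.execute("SELECT COUNT(*) FROM checkins").fetchone()[0]
print(results, stored)
```

With this approach the table never accumulates duplicates, so no cleanup query is needed afterwards. In a multi-writer setup the check and the insert would need to run in one transaction.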

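You should be able to sort all the records by time, subtract each employee's previous time from the latest one, and delete the row with the more recent time if the result is below a certain threshold.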
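
One possible reading of that idea, sketched with Python's `sqlite3`: sort by user and time, compare each row with the same user's previous row, and delete it when the gap is under a threshold. The `checkins` table, its `id` key, and the 60-second threshold are assumptions for the example.

```python
import sqlite3
from datetime import datetime

THRESHOLD_SECONDS = 60  # assumed threshold; tune to your time clock

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE checkins (id INTEGER PRIMARY KEY, username TEXT, checktime TEXT)"
)
conn.executemany(
    "INSERT INTO checkins (username, checktime) VALUES (?, ?)",
    [
        ("HRA001", "2012-07-29 08:16:44"),
        ("HRA001", "2012-07-29 08:16:46"),
        ("HRA001", "2012-07-29 08:16:50"),
        ("HRA001", "2012-07-29 17:02:10"),
        ("HRA002", "2012-07-29 08:16:45"),
    ],
)

rows = conn.execute(
    "SELECT id, username, checktime FROM checkins ORDER BY username, checktime"
).fetchall()

to_delete = []
previous = {}  # username -> time of that user's previous row
for row_id, username, checktime in rows:
    current = datetime.fromisoformat(checktime)
    last = previous.get(username)
    if last is not None and (current - last).total_seconds() < THRESHOLD_SECONDS:
        to_delete.append((row_id,))
    previous[username] = current

conn.executemany("DELETE FROM checkins WHERE id = ?", to_delete)
remaining = conn.execute(
    "SELECT username, checktime FROM checkins ORDER BY username, checktime"
).fetchall()
print(remaining)
```

Because each row is compared with the row immediately before it, a long chain of punches that are each a few seconds apart is removed entirely except for its first entry.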

You could try ranking the records by check-in time and then deleting, for each employee and each day, every record whose RANK is greater than 1.
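
A small sketch of the per-employee, per-day ranking, run against SQLite from Python. It assumes a hypothetical `checkins` table with an `id` key and ISO-formatted timestamps, and an SQLite build with window-function support (3.25 or later). It uses ROW_NUMBER rather than RANK so that two rows with identical timestamps do not both rank first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE checkins (id INTEGER PRIMARY KEY, username TEXT, checktime TEXT)"
)
conn.executemany(
    "INSERT INTO checkins (username, checktime) VALUES (?, ?)",
    [
        ("HRA001", "2012-07-29 08:16:44"),
        ("HRA001", "2012-07-29 08:16:46"),
        ("HRA001", "2012-07-29 08:16:50"),
        ("HRA001", "2012-07-30 08:02:11"),
        ("HRA002", "2012-07-29 08:20:00"),
    ],
)

# Number each user's rows within a calendar day, earliest first,
# then delete everything that is not the first row of its group.
conn.execute(
    """
    DELETE FROM checkins
    WHERE id IN (
        SELECT id FROM (
            SELECT id,
                   ROW_NUMBER() OVER (
                       PARTITION BY username, date(checktime)
                       ORDER BY checktime, id
                   ) AS rn
            FROM checkins
        )
        WHERE rn > 1
    )
    """
)
kept = conn.execute(
    "SELECT username, checktime FROM checkins ORDER BY username, checktime"
).fetchall()
print(kept)
```

Note that this keeps exactly one check-in per employee per day, so it would also remove a legitimate second check-in later the same day.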

Try the following query: delete from employee where checkin in (select checkin from employee having count(checkin) > 1);

http://codesimplified.com/2010/10/18/remove-duplicate-records-from-the-database-table/

Hope this helps.

-- @todaysDate and @empId are placeholders to be supplied by the caller
DELETE FROM timesheet 
WHERE timesheetRecordId <>(
                SELECT TOP 1 timesheetRecordId from timesheet  
                WHERE checkInDate=@todaysDate AND employeeId=@empId ORDER BY checkInTime ASC
               ) 
AND checkInDate=@todaysDate AND employeeId=@empId;

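I don't think you can reference the target table of a DELETE statement in a subquery of that same statement (in MySQL, at least). So you cannot do this with a single DELETE statement.

What you can do is write a stored procedure. In the stored procedure, create a temporary table containing the PKs returned by this query: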

select cht.pkey 
  from CheckTimeTable as cht
  where exists ( select pkey
                   from CheckTimeTable 
                   where username = cht.userName
                     and checkType = 'check-IN'
                     and Checktime >= subtime(cht.Checktime, '0 0:0:15.000000') 
                     and Checktime < cht.Checktime);

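Then write another statement to delete those PKs from the original table, CheckTimeTable.

Note that the query above is written for MySQL, so you will need to find out how to subtract 15 seconds from a timestamp in your DBMS. In MySQL, it is done like this: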

subtime(cht.Checktime, '0 0:0:15.000000')

This query returns every CheckTime record for which another record exists from the same user, of type Check-In, that is up to 15 seconds earlier than its own Checktime.
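
The two-step approach described above (collect the keys first, then delete them) might look roughly like this when tried against SQLite from Python. This is a sketch, not the MySQL stored procedure itself: the temporary table name `duplicate_keys` is made up for the example, and SQLite's `datetime(..., '-15 seconds')` stands in for MySQL's `subtime`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CheckTimeTable ("
    "pkey INTEGER PRIMARY KEY, userName TEXT, Checktime TEXT, checkType TEXT)"
)
conn.executemany(
    "INSERT INTO CheckTimeTable (userName, Checktime, checkType) "
    "VALUES (?, ?, 'Check-In')",
    [
        ("HRA001", "2012-07-29 08:16:44"),
        ("HRA001", "2012-07-29 08:16:46"),
        ("HRA001", "2012-07-29 08:16:50"),
        ("HRA001", "2012-07-29 08:16:53"),
        ("HRA001", "2012-07-29 08:17:30"),
    ],
)

# Step 1: collect the keys of rows that have an earlier check-in by the
# same user within the preceding 15 seconds.
conn.execute(
    """
    CREATE TEMP TABLE duplicate_keys AS
    SELECT cht.pkey
    FROM CheckTimeTable AS cht
    WHERE EXISTS (
        SELECT 1
        FROM CheckTimeTable AS earlier
        WHERE earlier.userName = cht.userName
          AND earlier.checkType = 'Check-In'
          AND earlier.Checktime >= datetime(cht.Checktime, '-15 seconds')
          AND earlier.Checktime < cht.Checktime
    )
    """
)

# Step 2: delete those keys from the original table.
conn.execute(
    "DELETE FROM CheckTimeTable WHERE pkey IN (SELECT pkey FROM duplicate_keys)"
)
remaining = [
    row[0]
    for row in conn.execute("SELECT Checktime FROM CheckTimeTable ORDER BY Checktime")
]
print(remaining)
```

Because the keys are collected before anything is deleted, a chain of punches a few seconds apart is removed in one pass, leaving only the first entry of the chain.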
