所以我似乎有一个时刻,不能弄清楚为什么某些dup表没有被删除。我有一个名为QUERY的测试表,其中包含姓名、地址、出生日期、电话等。我希望删除的dup,但要保持最新的记录(优选,但下面的代码不代表),其中电话不是空的。我的代码下面只是不工作总是给出0的结果。行示例如下:
+----------------------------------------------------------------+
| first,last,DOB,address,city,state,phonenumber,validitydate |
+----------------------------------------------------------------+
| steve,smith,19710922, 123 Here St, Miami, FL,9545551212,201902 |
| steve,smith,19710922, 123 Here St, Miami, FL,,202009 |
| steve,smith,19710922, 123 Here St, Miami, FL,9545551212,201802 |
+----------------------------------------------------------------+
WITH Records AS
(
SELECT lastname, firstname, address, state, dateofbirth, phonenumber
,ROW_NUMBER() OVER (PARTITION BY lastname, firstname, address, state, dateofbirth, phonenumber order by validitydate) AS RecordInstance
,ROW_NUMBER() OVER (PARTITION BY lastname, firstname, address, state, dateofbirth, phonenumber order by
CASE
WHEN phonenumber ='' THEN 0
WHEN phonenumber IS NOT NULL THEN 1
Else 0
END) as [ToInclude]
FROM query
)
delete
FROM records
WHERE
RecordInstance > 1
and ToInclude = 0
有人看到我做错了什么吗?提前感谢
我想你应该这样做:
SELECT
RANK() OVER (PARTITION BY lastname, firstname, address, state, dateofbirth, phonenumber order by validitydate) AS RecordInstance
,COUNT(*) OVER (PARTITION BY lastname, firstname, address, state, dateofbirth, phonenumber) DUPS
FROM QUERY
WHERE DUPS > 2
AND RecordInstance = 1
AND phonenumber <> '' -- maybe? if you really don't want to delete duplicates with no phone number
如果您需要的结果是删除所有重复的记录,其中电话号码不是null或空,那么您可以尝试下面的查询:
WITH Records AS
(
SELECT lastname, firstname, address, state, dateofbirth, phonenumber
,ROW_NUMBER() OVER (PARTITION BY lastname, firstname, address, state, dateofbirth, phonenumber order by validitydate) AS RecordInstance
FROM query where phonenumber<>'' and phonenumber is not null
)
delete
FROM records
WHERE
RecordInstance > 1
Try
RecordInstance = ROW_NUMBER() OVER(PARTITION BY LastName, FirstName, Address, City, DateOfBirth ORDER BY PhoneNumber DESC, ValidityDate)