在 sql 服务器中使用分区和 rank() 删除重复记录时遇到问题



我正在尝试识别重复的记录,然后使用PARTITION和RANK()n SQL Server 2008删除其中一个重复的记录。删除重复记录的条件是它不应该在另一个表中引用。

我有一个语言表,其中包含一些重复的语言。员工表包含员工和映射到语言。如果该语言 ID 未在员工表中映射,我必须删除该重复记录之一。

CREATE TABLE MY_LANGUAGE (LANGUAGEID INT, LANGUAGENAME VARCHAR(20))
CREATE TABLE MY_EMPLOYEE (EMPID INT, NAME VARCHAR(20), LANGUAGEID INT)
INSERT INTO MY_LANGUAGE VALUES(1, 'ENGLISH')
INSERT INTO MY_LANGUAGE VALUES(2, 'FRENCH')
INSERT INTO MY_LANGUAGE VALUES(3, 'ITALIAN')
INSERT INTO MY_LANGUAGE VALUES(4, 'GERMAN')
INSERT INTO MY_LANGUAGE VALUES(5, 'ITALIAN')
INSERT INTO MY_LANGUAGE VALUES(6, 'GERMAN')
INSERT INTO MY_LANGUAGE VALUES(7, 'SPANISH')
INSERT INTO MY_EMPLOYEE VALUES (10, 'GLEN', 1)
INSERT INTO MY_EMPLOYEE VALUES (20, 'PETER', 2)
INSERT INTO MY_EMPLOYEE VALUES (30, 'MARIA', 3)

如果你看到,我有两种重复的语言,其中一种被员工使用。我想删除语言 ID 4 和 5。

LANGUAGENAME    LANGUAGEID  EMPNAME
GERMAN          4   
GERMAN          6   
ITALIAN             3               MARIA   
ITALIAN             5   

我尝试创建一个选择语句来返回我要删除的内容:

WITH CTE AS (
SELECT  L.LANGUAGENAME,  L.LANGUAGEID, RANK() OVER(PARTITION BY L.LANGUAGENAME ORDER BY L.LANGUAGEID) AS RANKING
FROM MY_LANGUAGE L
INNER JOIN (
    SELECT LANGUAGENAME, COUNT(*) AS DUPECOUNT
    FROM MY_LANGUAGE
    GROUP BY LANGUAGENAME
    HAVING COUNT(*) > 1
) LC ON L.LANGUAGENAME = LC.LANGUAGENAME
WHERE NOT EXISTS (SELECT 1 FROM MY_EMPLOYEE WHERE MY_EMPLOYEE.LANGUAGEID = L.LANGUAGEID))  
SELECT * FROM CTE WHERE RANKING = 1

这将返回以下内容

LANGUAGENAME    LANGUAGEID  RANKING
GERMAN              4           1
ITALIAN         5           1

当我尝试删除时,出现错误:

WITH CTE AS (
SELECT  L.LANGUAGENAME,  L.LANGUAGEID, RANK() OVER(PARTITION BY L.LANGUAGENAME ORDER BY L.LANGUAGEID) AS RANKING
FROM MY_LANGUAGE L
INNER JOIN (
    SELECT LANGUAGENAME, COUNT(*) AS DUPECOUNT
    FROM MY_LANGUAGE
    GROUP BY LANGUAGENAME
    HAVING COUNT(*) > 1
) LC ON L.LANGUAGENAME = LC.LANGUAGENAME
WHERE NOT EXISTS (SELECT 1 FROM MY_EMPLOYEE WHERE MY_EMPLOYEE.LANGUAGEID = L.LANGUAGEID))  
DELETE FROM CTE WHERE RANKING = 1

我得到的错误是:

Msg 4405,级别 16,状态 1,第 1
行 视图或函数"CTE"不可更新,因为修改会影响多个基表。

任何如何解决此问题的想法都可以简化。感谢@Szymon展示临时表解决方案,但我希望得到一个没有临时表的解决方案(如果可能的话)。

查询:

DELETE ll
FROM MY_LANGUAGE ll
JOIN (SELECT L.LANGUAGENAME,
       L.LANGUAGEID,
       ROW_NUMBER()OVER(PARTITION BY L.LANGUAGENAME, e.EMPID
                        ORDER BY L.LANGUAGEID ASC) rnk,
       COUNT(*)OVER(PARTITION BY L.LANGUAGENAME) cnt,
       e.EMPID
      FROM MY_LANGUAGE l
      LEFT JOIN MY_EMPLOYEE e ON e.LANGUAGEID = l.LANGUAGEID)  a 
    ON ll.LANGUAGEID = a.LANGUAGEID
AND a.rnk = 1
AND a.cnt > 1
and a.EMPID IS NULL

结果:

| LANGUAGEID | LANGUAGENAME |
|------------|--------------|
|          1 |      ENGLISH |
|          2 |       FRENCH |
|          3 |      ITALIAN |
|          6 |       GERMAN |
|          7 |      SPANISH |

您可以将 CTE 查询中的记录获取到临时表中,然后基于该表进行删除。

WITH CTE AS (
SELECT  L.LANGUAGENAME,  L.LANGUAGEID, RANK() OVER(PARTITION BY L.LANGUAGENAME ORDER BY L.LANGUAGEID) AS RANKING
FROM MY_LANGUAGE L
INNER JOIN (
    SELECT LANGUAGENAME, COUNT(*) AS DUPECOUNT
    FROM MY_LANGUAGE
    GROUP BY LANGUAGENAME
    HAVING COUNT(*) > 1
) LC ON L.LANGUAGENAME = LC.LANGUAGENAME
WHERE NOT EXISTS (SELECT 1 FROM MY_EMPLOYEE WHERE MY_EMPLOYEE.LANGUAGEID = L.LANGUAGEID))  
SELECT * INTO #temp FROM CTE WHERE RANKING = 1

然后使用#temp中的记录从原始表中删除。

试试这个。

;WITH CTE AS (
SELECT t.LANGUAGEID, LANGUAGENAME, RANK() OVER(PARTITION BY T.LANGUAGENAME ORDER BY T.LANGUAGEID) AS RANKING
FROM MY_LANGUAGE T 
WHERE T.LANGUAGEID  IN (
                        SELECT  L.LANGUAGEID 
                        FROM MY_LANGUAGE L LEFT JOIN MY_EMPLOYEE E
                        ON L.LANGUAGEID = E.LANGUAGEID
                        WHERE E.LANGUAGEID IS NULL)
) 
DELETE FROM CTE
WHERE RANKING = 1 AND (CTE.LANGUAGENAME IN (SELECT LANGUAGENAME 
                                        FROM MY_LANGUAGE 
                                        GROUP BY LANGUAGENAME
                                        HAVING COUNT(LANGUAGENAME) > 1))

最新更新