我有一个似乎已损坏的索引?
以下是正在发生的事情。我有两个表函数,第一个是一组案例,第二个是一套已知日期。这两个集合具有1(事例)到0或1(已知日期)的关系。通常我会这样询问他们;
SELECT c.CaseID, a.AwareDate
FROM Cases(@date) AS c
LEFT JOIN AwareDates(@date) AS a ON c.CaseID = a.CaseID;
问题是,并非AwareDates中所有匹配的行都是JOIN'd。如果我添加了一个联接提示,那么它们就会执行。say;
SELECT c.CaseID, a.AwareDate
FROM Cases(@date) AS c
LEFT MERGE JOIN AwareDates(@date) AS a ON c.CaseID = a.CaseID;
我从查询计划中注意到,添加联接提示会在联接之前添加一种AwareDate数据,否则就不存在这种数据。此外,当没有提示时,查询规划器将联接翻转为RIGHT OUTER join,当然,在有提示的地方保留LEFT join。
我已经完成了以下操作,没有检测到错误;
DBCC UPDATEUSAGE (0) WITH INFO_MESSAGES, COUNT_ROWS;
EXECUTE sp_updatestats 'resample';
DBCC CHECKDB (0) WITH ALL_ERRORMSGS, EXTENDED_LOGICAL_CHECKS;
我被难住了。。。有什么想法吗?
以下是UDF定义
ALTER FUNCTION dbo.Cases( @day date ) RETURNS TABLE
WITH SCHEMABINDING
AS RETURN (
SELECT
CaseID -- other 42 columns ommitted
FROM (
SELECT
ROW_NUMBER() OVER (PARTITION BY CaseID ORDER BY UpdateDate DESC, UpdateNumber DESC) AS RecordAge,
CaseID,
Action
FROM
dbo.CaseAudit
WHERE
convert(date,UpdateDate) <= @day
) AS History
WHERE
RecordAge = 1 -- only the most current record version
AND isnull(Action,'') != N'DEL' -- only include cases that have not been deleted
)
ALTER FUNCTION dbo.AwareDates( @day date ) RETURNS TABLE
WITH SCHEMABINDING
AS RETURN (
WITH
History AS (
SELECT row_number() OVER (PARTITION BY CaseID, ContactID ORDER BY UpdateDate DESC, UpdateNumber DESC) AS RecordAge,
CaseID, InfoReceived, ReceiveDate, ResetClock, Action
FROM dbo.ContactLogAudit WITH (NOLOCK)
WHERE convert(date,UpdateDate) <= @day
),
Notes AS (
SELECT
CaseID,
convert(date,ReceiveDate,112) AS ReceiveDate,
ResetClock
FROM History
WHERE RecordAge = 1 -- only the most current record version
AND isnull(Action,'') != N'DEL' -- only include notes that have not been deleted
AND InfoReceived = N'Y' -- only include notes that have Info Rec'd checked
AND len(ReceiveDate) = 8 AND isnumeric(ReceiveDate) = 1 AND isdate(ReceiveDate) = 1 -- only include those with a valid aware date
),
Initials AS (
SELECT CaseID, min(ReceiveDate) AS ReceiveDate
FROM Notes
GROUP BY CaseID
),
Resets AS (
SELECT CaseID, max(ReceiveDate) AS ReceiveDate
FROM Notes
WHERE ResetClock = N'Y'
GROUP BY CaseID
)
SELECT
i.CaseID AS CaseID,
i.ReceiveDate AS InitialAwareDate, -- the oldest valid aware date value (must have AE Info Reveived checked and a received date)
coalesce(r.ReceiveDate,i.ReceiveDate) AS AwareDate -- either the newest valid aware date value with the Reset Clock checked, otherwise the initial aware date value
FROM Initials AS i
LEFT JOIN Resets AS r
ON i.CaseID = r.CaseID
);
我进一步发现,如果我去掉"WITH(NOLOCK)"表提示,我会得到正确的结果。此外,如果向AwareDates UTF添加联接提示,甚至在Initials和Resets之间的LEFT join关系上添加COLLATE Latin1_General_BIN。
查询计划行计数--没有联接提示(已损坏)
- 案例{实际:25891,估计:19071.9}
- 等待日期{实际:24693,估计:1463.09}
- 首字母缩写{实际:24693,估计:1463.09}
- Rests{实际:985,估计:33.2671}
- AwareDates匹配联接结果集中的8108个案例行
查询计划行计数——带联接提示(工作)
- 案例{实际:25891,估计:19071.9}
- 等待日期{实际:24673,估计:1837.67}
- 首字母缩写{实际:24673,估计:1837.67}
- 休息{实际:982,估计:42.6238}
- AwareDates匹配联接结果集中的24673个Cases行
我进一步缩小了问题的范围。我可以;
SELECT * FROM AwareDate(@date);
和
SELECT * FROM AwareDate(@date) ORDER BY CaseID;
具有不同的行数。
您没有指定SQL的特定版本(@@version),但这看起来像是SQL 2008 R2的累积更新6中修复的一个错误(显然它也适用于SQL 2008)。
KB 2433265
修复:当您运行使用ROW_NUMBER函数以及SQL Server 2008 中的左外部联接
文章中的示例指定了DISTINCT。然而,这篇文章的措辞模棱两可——不清楚你是否需要一个distinct,或者distinct是否是触发器之一。
你的例子没有像文章中那样的独特之处,但它似乎是为了提问而修改的(即缺少42列)。有明显的区别吗?同样,在AwareDates
udf中,当我进入Initials
CTE时,你会执行GROUP by,它可能具有与DISTINCT相同的效果。
更新
@Dennis从您的评论中我仍然无法判断您使用的是SQL 20080还是2008 R2。
如果您运行的是2008,KB的文章说:"这个问题的修复程序是在SQLServer2008ServicePack1的累积更新11中首次发布的。"。
另一方面,如果您使用的是SQL 2008 R2,那么这是在CU 6中修复的,它是SP1的一部分。但这个bug似乎又出现了。查看SQL Server 2008 R2 Service Pack 1的累积更新包4(SP1之后发布)。
970198 FIX: You receive an incorrect result when you run a
query that uses the row_number function in SQL Server 2008
or in SQL Server 2008 R2
在相关的KB文章中,MS删除了对distinct:的引用
Consider the following scenario. You run a query against a table that has a
clustered index in Microsoft SQL Server 2008 or in Microsoft SQL Server 2008
R2. In the query, you use the row_number function. In this scenario, you
receive an incorrect result when a parallel execution plan is used for the
query. If you run the query many times, you may receive different results.
这似乎证实了我之前对KB 2433265的阅读——该短语表明,distinct只是导致这种行为的众多条件之一。这次的罪魁祸首似乎是一个平行的执行计划。