查询需要永远才能输出信息 - 优化提示



我有一个查询,效果很好 - 但是 - 加载大约需要 10 分钟。 这太疯狂了。我希望它比现在运行得更快。

想知道我是否可以采取任何技巧来优化我的查询以使其运行得更快?

select DISTINCT
    c.PaperID, 
    cdd.CodesF, 
    c.PageCount,  
    prr.projectname,  
    u.firstname + ' ' + u.lastname as Name,
    ett.EventName, 
    cast(c.AssignedDate as DATE) [AssignedDate],
    cast(ev.EventCompletionDate as DATE) [CompletionDate],
    ar.ResultDescription,
    a.Editor
from tbl_Papers c
    left outer join (select cd.PaperId, count(*) as CodesF 
                     from tbl_PaperCodes cd group by cd.PaperId) cdd 
        on cdd.PaperId = c.PaperId
    left outer join 
        (SELECT 
            wfce.PaperEventActionNum,
            c.PaperId,
            CONVERT(varchar,wfce.ActionDate,101) CompletionDate,
            pr.ProjectName,
            wfce.ActionUserId,
            u.firstname+' '+u.lastname [Editor]
        FROM 
            dbo.tbl_WFPaperEventActions wfce
            INNER JOIN dbo.tbl_Papers c ON wfce.PaperId = c.PaperId
            INNER JOIN tbl_Providers p ON p.ProviderID = c.ProviderID
            INNER JOIN tbl_Sites s ON s.SiteID = p.SiteID
            INNER JOIN tbl_Projects pr ON s.ProjectId=pr.ProjectId
            INNER JOIN tbl_Users u ON wfce.ActionUserId=u.UserId
        WHERE
            wfce.EventId = 204 
            AND c.Papersource =0
        GROUP BY
            wfce.PaperEventActionNum,
            c.PaperId,
            CONVERT(varchar,wfce.ActionDate,101),
            pr.ProjectName,
            wfce.ActionUserId,
            u.firstname+' '+u.lastname
        )a ON a.PaperId=c.PaperId,
    tbl_Providers p, tbl_Sites s, 
    tbl_Projects prr, tbl_WFPaperEvents ev, 
    tbl_Users u, tbl_WFPaperEventTypes ett, 
    tbl_WFPaperEventActions arr, tbl_WFPaperEventActionResults ar
where s.SiteId = p.SiteId
    and p.ProviderId = c.ProviderId
    and s.ProjectId = prr.ProjectId
    and ev.PaperId = c.PaperId
    and ev.EventCreateUserId = u.UserId
    and ev.EventCompletionDate >= dateadd(day,datediff(day,1,GETDATE()),0) 
    and ev.EventCompletionDate < dateadd(day,datediff(day,0,GETDATE()),0)
    and ev.EventStatusId = 3
    and ev.EventId in (201, 203)
    and c.Papersource =0--Offshore
    and ev.EventId=ett.EventID
    and arr.PaperId=c.PaperId
    and arr.EventId=ev.EventId
    and arr.EventId=ar.EventID
    and arr.ActionResultId=ar.ResultID
    and arr.ActionResultId in (1,2,3,4)
order by paperid, u.FirstName + ' ' + u.LastName

您需要仔细查看此查询的每一部分,并问自己,是否需要?

获取别名为 a 的子查询。

它连接了 6 个表,但如果您追踪到最终的选择子句,则仅从该别名提供 [Editor]。那么你需要6张桌子才能到达编辑器吗?不,你实际上不需要 2 tbl_WFPaperEventActionstbl_Users.此外,此子查询按 6 个项目(包括日期(分组,但其中 3 个项目在整个查询中的其他任何地方都没有使用 - 那么为什么要将这些项目包含在分组中呢?这允许我们删除 3 个连接的表。

在其余 3 个分组项目中,可以再替换 1 个,以避免tbl_WFPaperEventActionstbl_Papers之间的连接,因为连接条件为"wfce。PaperId = c.PaperId",我们只需要按wfce.PaperId分组而不是c.PaperId

最后,我们对该字段

感兴趣wfce.PaperEventActionNum该字段由子查询提供但不用于较大的查询?为什么提供该字段未使用?好吧,事实证明,它应该用于完成连接。别名为 需要的子查询 PaperEventActionNumPaperId 上都联接到外部查询中。顺便说一下,这也要求整个子查询需要向下推送连接结构以符合 ANSI 连接语法规则。

永远不要将 ANSI 联接语法与"老式方式"联接"混合">

这真的是一场灾难的秘诀。

下面我已经"开始"了对您的查询进行一些修改,但我无法真正完成它,因为我无法测试它的任何部分;而且我根本不知道您的数据模型。

就个人而言,我会从头开始此查询,从精益开始并逐项添加以确保它保持精益。

SELECT DISTINCT /* distinct isn't a good solution here */
      c.PaperID
    , cdd.CodesF
    , c.PageCount
    , prr.projectname
    , u.firstname + ' ' + u.lastname       AS Name
    , ett.EventName
    , CAST(c.AssignedDate AS date)         [AssignedDate]
    , CAST(ev.EventCompletionDate AS date) [CompletionDate]
    , ar.ResultDescription
    , a.Editor
FROM tbl_Papers c
      LEFT OUTER JOIN ( -- can this be an inner join instead?
                  SELECT
                        cd.PaperId
                      , COUNT(*) AS CodesF
                  FROM tbl_PaperCodes cd
                  GROUP BY
                        cd.PaperId
            ) cdd
                  ON cdd.PaperId = c.PaperId
      INNER JOIN tbl_Providers p                  ON c.ProviderId = p.ProviderId
      INNER JOIN tbl_Sites s                      ON p.SiteId = s.SiteId
      INNER JOIN tbl_Projects prr                 ON s.ProjectId = prr.ProjectId
      INNER JOIN tbl_WFPaperEvents ev             ON c.PaperId = ev.PaperId
      INNER JOIN tbl_Users u                      ON ev.EventCreateUserId = u.UserId
      INNER JOIN tbl_WFPaperEventTypes ett        ON ev.EventId = ett.EventID
      INNER JOIN tbl_WFPaperEventActions arr      ON c.PaperId = arr.PaperId
                                                            AND ev.EventId = arr.EventId
      INNER JOIN tbl_WFPaperEventActionResults ar ON arr.EventId = ar.EventID
                                                            AND arr.ActionResultId = ar.ResultID
                                                            AND arr.ActionResultId IN (1, 2, 3, 4)
      LEFT OUTER JOIN (
                  SELECT
                        wfce.PaperEventActionNum
                      , wfce.PaperId
                      --, c.PaperId
                      --, CONVERT(varchar, wfce.ActionDate, 101) CompletionDate -- cast to date here
                      --, pr.ProjectName
                      --, wfce.ActionUserId
                      , u.firstname + ' ' + u.lastname         [Editor]
                  FROM dbo.tbl_WFPaperEventActions wfce
                        --INNER JOIN dbo.tbl_Papers c  ON wfce.PaperId = c.PaperId
                        --INNER JOIN tbl_Providers p   ON p.ProviderID = c.ProviderID
                        --INNER JOIN tbl_Sites s       ON s.SiteID = p.SiteID
                        --INNER JOIN tbl_Projects pr   ON s.ProjectId = pr.ProjectId
tbl_Users                        INNER JOIN  u       ON wfce.ActionUserId = u.UserId
                  WHERE wfce.EventId = 204
                        AND c.Papersource = 0
                  GROUP BY
                        wfce.PaperEventActionNum
                      , wfce.PaperId
                      --, c.PaperId
                      --, CONVERT(varchar, wfce.ActionDate, 101)
                      --, pr.ProjectName
                      --, wfce.ActionUserId
                      , u.firstname + ' ' + u.lastname
            ) a
                  ON c.PaperId = a.PaperId AND arr.PaperEventActionNum = a.PaperEventActionNum

WHERE ev.EventCompletionDate >= DATEADD(DAY, DATEDIFF(DAY, 1, GETDATE()), 0)
      AND ev.EventCompletionDate < DATEADD(DAY, DATEDIFF(DAY, 0, GETDATE()), 0)
      AND ev.EventStatusId = 3
      AND ev.EventId IN (201, 203)
      AND c.Papersource = 0--Offshore

ORDER BY
      paperid, u.FirstName + ' ' + u.LastName

我真的很讨厌DISTINCT。这很讨厌。它不解决问题,它只是隐藏它们;并减慢一切以进行隐藏。

使用非重复与查询复杂度成反比:

  • 如果查询非常简单,则可以使用不同的查询
  • 如果查询很复杂,请不要使用非重复查询
  1. 检查有多少字段具有索引joinwheregroup by 子句。每个非索引字段都会对性能产生负面影响。

  2. GROUP BY 中的计算字段可能很痛苦,而且很独特(尤其是在它们没有编制索引的情况下(。 例如,对类似 u.ID 而不是 u.firstname+' '+u.lastnamepr.ProjectId I/O pr.ProjectName进行分组应该会让事情变得更快(如果需要,您可以根据其他条件对输出进行排序(。

  3. 你真的需要left join使用它吗? 即,即使另一端没有匹配项,您是否要保留连接另一侧的表?如果没有,请将其替换为 inner join

  4. 这里有各种小的改进,例如:

(假设 Papersource 和 EventId 是索引(:

    FROM 
        (SELECT * FROM dbo.tbl_WFPaperEventActions WHERE EventId = 204) wfce
        INNER JOIN
           (SELECT * FROM dbo.tbl_Papers WHERE Papersource = 0) c 
           ON wfce.PaperId = c.PaperId
        INNER JOIN tbl_Providers p ON p.ProviderID = c.ProviderID
        INNER JOIN tbl_Sites s ON s.SiteID = p.SiteID
        INNER JOIN tbl_Projects pr ON s.ProjectId=pr.ProjectId
        INNER JOIN tbl_Users u ON wfce.ActionUserId=u.UserId

而不是

    FROM 
        dbo.tbl_WFPaperEventActions wfce
        INNER JOIN dbo.tbl_Papers c ON wfce.PaperId = c.PaperId
        INNER JOIN tbl_Providers p ON p.ProviderID = c.ProviderID
        INNER JOIN tbl_Sites s ON s.SiteID = p.SiteID
        INNER JOIN tbl_Projects pr ON s.ProjectId=pr.ProjectId
        INNER JOIN tbl_Users u ON wfce.ActionUserId=u.UserId
    WHERE
        wfce.EventId = 204 
        AND c.Papersource =0

或(如果我正确理解了这个想法(:

and ev.EventCompletionDate BETWEEN (
    dateadd(day, -1, GETDATE()) and dateadd(ns, -1, GETDATE())

而不是:

and ev.EventCompletionDate >= dateadd(day,datediff(day,1,GETDATE()),0) 
and ev.EventCompletionDate < dateadd(day,datediff(day,0,GETDATE()),0)

一般来说:问问自己,你到底想通过这个查询实现什么,数据的哪些部分与它相关,有多少源表可以用它们的代码片段替换(这可以使 JOIN 更快地工作(,并尝试在使用 JOIN 和 WHERE 子句方面保持一致。

最新更新