How to optimize the SQL query shown below



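This query is written to find the users who did not log in to the system between 1 July and 31 July.

However, when we run the query in Query Analyzer it takes more than 2 minutes, and on the application side it fails with the error "Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding."

The query below starts at 1 July 2022, fetches all matching users for that date, adds them to a table variable named @TABLE_TEMP, and then advances to the next date.

The WHILE loop then runs again for 2 July and fetches users, and so on until 31 July.

Can anyone help optimize this query using a CTE or any other mechanism? How can we avoid the WHILE loop to get better performance?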

DECLARE @TABLE_TEMP TABLE
(
Row int IDENTITY(1,1),
[UserId] int,
[UserName] nvarchar(100),       
[StartDate] nvarchar(20),
[FirstLogin] nvarchar(20),
[LastLogout] nvarchar(20)       
)
DECLARE @START_DATE datetime = '2022-07-01';
DECLARE @END_DATE   datetime = '2022-07-31';
DECLARE @USER_ID nvarchar(max) = '1,2,3,4,5,6,7,8,9';
DECLARE @QUERY nvarchar(max) = '';
WHILE(@START_DATE < @END_DATE OR @START_DATE = @END_DATE)
BEGIN               
SET @QUERY = 'SELECT 
s.userid AS [UserId], 
s.username AS [UserName],
''' + CAST(@START_DATE as nvarchar)  + ''' AS [StartDate],
MAX(h.START_TIME) as [FirstLogin],
MAX(ISNULL(h.END_TIME, s.LAST_SEEN_TIME)) as [LastLogout]                  
FROM USER s 
LEFT JOIN USER_LOGIN_HISTORY h ON h.userid = s.userid                                                         
LEFT JOIN TEMP_USER_INACTIVATION TUI ON TUI.userid = s.userid AND ('''+ CAST(@START_DATE as nvarchar)  +''' BETWEEN ACTIVATED_DATE AND DEACTIVATD_DATE)
WHERE s.userid IN (' + @USER_ID + ') 
AND h.userid  NOT IN (SELECT userid FROM USER_LOGIN_HISTORY WHERE CAST(START_TIME AS DATE)  = '''+ CONVERT(nvarchar,(CAST(@START_DATE AS DATE))) +''')
AND ACTIVATED_DATE IS NOT NULL 
GROUP BY s.userid, h.userid, s.username, s.last_seen_time
HAVING CAST(MAX(ISNULL(h.END_TIME, s.LAST_SEEN_TIME)) AS DATE) <>  '''+ CONVERT(nvarchar,(CAST(@START_DATE AS DATE)))  + '''
ORDER BY [UserName]'
INSERT INTO @TABLE_TEMP
EXEC(@QUERY)   
SET @START_DATE = DATEADD(DD, 1, @START_DATE)           
END

Without the query plan it is hard to say for certain.

But there are some obvious inefficiencies that can be improved.

Firstly, the WHILE loop is unnecessary. Create a Dates table that contains one row per date. Then you can simply join to it.
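One way to fill such a table without a loop is a recursive CTE. This is only a minimal sketch: the table name dbo.Dates and its [Date] column are assumptions chosen to match the query further down, not something taken from the question's schema.

```sql
-- Hypothetical calendar table; the name dbo.Dates and its column are assumptions.
CREATE TABLE dbo.Dates ([Date] date NOT NULL PRIMARY KEY);

DECLARE @START_DATE date = '2022-07-01';
DECLARE @END_DATE   date = '2022-07-31';

-- Recursive CTE: start at @START_DATE and add one day at a time until @END_DATE.
WITH DateRange AS (
    SELECT @START_DATE AS [Date]
    UNION ALL
    SELECT DATEADD(day, 1, [Date])
    FROM DateRange
    WHERE [Date] < @END_DATE
)
INSERT INTO dbo.Dates ([Date])
SELECT [Date]
FROM DateRange
OPTION (MAXRECURSION 0);  -- removes the default limit of 100 recursion levels
```

For a single month the default recursion limit would be enough; the hint only matters for ranges longer than 100 days. A calendar table like this is usually filled once for many years and then reused.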

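Also, do not inject the @USER_ID value into the query text. Instead, pass the IDs in as a Table Valued Parameter. At the very least, split what you have now into a temp table or table variable.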
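A Table Valued Parameter could look roughly like the sketch below. The type name dbo.UserIdList and the procedure name dbo.GetUsersByIds are made up for illustration, and the SELECT inside the procedure is a placeholder for the real query.

```sql
-- Hypothetical table type and procedure; both names are assumptions for illustration.
CREATE TYPE dbo.UserIdList AS TABLE (userId int NOT NULL PRIMARY KEY);
GO

CREATE PROCEDURE dbo.GetUsersByIds
    @userIds dbo.UserIdList READONLY   -- table-valued parameters must be declared READONLY
AS
BEGIN
    SET NOCOUNT ON;

    -- Placeholder query: filter by the passed-in IDs instead of a concatenated string.
    SELECT s.userid, s.username
    FROM [USER] s
    WHERE s.userid IN (SELECT u.userId FROM @userIds u);
END;
GO

-- Calling it from T-SQL: fill a variable of the table type and pass it in.
DECLARE @ids dbo.UserIdList;
INSERT INTO @ids (userId) VALUES (1), (2), (3);
EXEC dbo.GetUsersByIds @userIds = @ids;
```

GO is the batch separator used by SSMS and sqlcmd, not a T-SQL statement. From application code, the same parameter is normally supplied as a structured parameter rather than as a comma-separated string.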

Do not cast the columns you are joining or filtering on. For example, to check whether START_TIME falls on a particular date, you can use WHERE START_TIME >= BeginningOfDate AND START_TIME < BeginningOfNextDate
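To make the difference concrete, here are the two forms side by side. The sketch assumes START_TIME is a datetime column and that an index on it exists; the question does not say either way.

```sql
DECLARE @day date = '2022-07-01';

-- Column wrapped in a conversion: the expression is evaluated against the column,
-- which can prevent or degrade index seeks and tends to hurt row estimates.
SELECT h.userid
FROM USER_LOGIN_HISTORY h
WHERE CAST(h.START_TIME AS date) = @day;

-- Half-open range: the column is left untouched, so an index on START_TIME
-- (if there is one) can be used for a straightforward range seek.
SELECT h.userid
FROM USER_LOGIN_HISTORY h
WHERE h.START_TIME >= @day
  AND h.START_TIME <  DATEADD(day, 1, @day);
```

The half-open range (>= start, < next day) also avoids the end-of-day problem that BETWEEN has with datetime values.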

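The LEFT JOINs are suspicious, especially given that you are filtering on those tables in the WHERE clause, which effectively turns them into inner joins.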

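Use NOT EXISTS rather than NOT IN, otherwise you may get incorrect results: if the subquery returns even one NULL, NOT IN returns no rows at all.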
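The NULL pitfall is easy to reproduce with two small table variables. The names @users and @logins are invented for this demonstration only.

```sql
DECLARE @users  TABLE (userid int NOT NULL);
DECLARE @logins TABLE (userid int NULL);

INSERT INTO @users  (userid) VALUES (1), (2), (3);
INSERT INTO @logins (userid) VALUES (1), (NULL);

-- NOT IN: returns no rows, because "userid <> NULL" evaluates to UNKNOWN
-- for every candidate, so no row can pass the filter.
SELECT u.userid
FROM @users u
WHERE u.userid NOT IN (SELECT l.userid FROM @logins l);

-- NOT EXISTS: returns 2 and 3, the users with no matching login row.
SELECT u.userid
FROM @users u
WHERE NOT EXISTS (SELECT 1 FROM @logins l WHERE l.userid = u.userid);
```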

DECLARE @START_DATE date = '2022-07-01';
DECLARE @END_DATE   date = '2022-07-31';
DECLARE @USER_ID nvarchar(max) = '1,2,3,4,5,6,7,8,9';

-- Split the ID list into a table variable (STRING_SPLIT needs SQL Server 2016 or later).
DECLARE @userIds TABLE (userId int PRIMARY KEY);
INSERT @userIds (userId)
SELECT CAST(value AS int)
FROM STRING_SPLIT(@USER_ID, ',');

SELECT 
    s.userid as [UserId],
    s.username as [UserName],
    d.Date as [StartDate],
    MAX(h.START_TIME) as [FirstLogin],
    MAX(ISNULL(h.END_TIME, s.LAST_SEEN_TIME)) as [LastLogout]
FROM Dates d
JOIN [USER] s   -- USER is a reserved word in T-SQL, so it is bracketed here
    LEFT JOIN USER_LOGIN_HISTORY h ON h.userid = s.userid
    LEFT JOIN TEMP_USER_INACTIVATION TUI
        ON TUI.userid = s.userid
  -- this ON belongs to "Dates d JOIN ..." (nested join syntax)
  ON d.Date BETWEEN ACTIVATED_DATE AND DEACTIVATD_DATE  -- specify table alias (don't know which?)
WHERE s.userid in (SELECT u.userId FROM @userIds u) 
  AND NOT EXISTS (SELECT 1
      FROM USER_LOGIN_HISTORY ulh
      WHERE ulh.START_TIME >= CAST(d.date AS datetime)
        AND ulh.START_TIME < CAST(DATEADD(day, 1, d.date) AS datetime)
        AND ulh.userid = h.userid
  )
  AND ACTIVATED_DATE IS NOT NULL
  AND d.Date BETWEEN @START_DATE AND @END_DATE
GROUP BY
    d.Date,
    s.userid,
    s.username,
    s.last_seen_time
HAVING CAST(MAX(ISNULL(h.END_TIME, s.LAST_SEEN_TIME)) AS DATE) <> d.date
ORDER BY   -- do you need this? remove if possible.
    s.username;

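It is better to collect the dates in a table than to run the query in a loop. Use the following query to collect the dates in the given range: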

DECLARE @day INT = 1
DECLARE @dates TABLE (datDate DATE)
--creates dates table first and then create dates for the given month.
WHILE ISDATE('2022-8-' + CAST(@day AS VARCHAR)) = 1
BEGIN
    INSERT INTO @dates
    VALUES (DATEFROMPARTS(2022, 8, @day))
    SET @day = @day + 1
END

Then, to get all the dates on which a user did not log in, you have to use a Cartesian join and a left join, as shown:

SELECT    allDates.userID, 
          allDates.userName, 
          allDates.datDate notLoggedOn
FROM
(
    --This will return all users for all dates in a month i.e. 31 rows for august for every user
    SELECT u.userID, u.userName, d.datDate
    FROM   Users u
           CROSS JOIN @dates d
) allDates
LEFT JOIN
(
    --now get every day on which each user logged in within the given date range
    SELECT DISTINCT userID, 
           CAST(login_date AS DATE) login_day
    FROM   USER_LOGIN_HISTORY
    WHERE  login_date >= '2022-08-01' AND login_date < '2022-09-01'
) loggedDates ON loggedDates.userID = allDates.userID      --match the user as well as the date
             AND loggedDates.login_day = allDates.datDate
WHERE loggedDates.userID IS NULL --keep only the user/date pairs with no login
ORDER BY allDates.userID, 
         allDates.datDate

From this query you will get every day in the month on which a user did not log in.

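If you do not need to list every date on which a user did not log in, you can omit the Cartesian join. This will improve performance further.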
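Without the Cartesian join, the query reduces to a single NOT EXISTS check per user. This is a sketch under the same assumed names as above (Users, USER_LOGIN_HISTORY, login_date); it returns users with no login at all in the range rather than one row per missed day.

```sql
DECLARE @START_DATE date = '2022-08-01';
DECLARE @END_DATE   date = '2022-08-31';

-- One row per user who has no login anywhere in the range.
SELECT u.userID, u.userName
FROM Users u
WHERE NOT EXISTS (
    SELECT 1
    FROM USER_LOGIN_HISTORY h
    WHERE h.userID = u.userID
      AND h.login_date >= @START_DATE
      AND h.login_date <  DATEADD(day, 1, @END_DATE)  -- half-open range covers all of the last day
)
ORDER BY u.userID;
```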

I hope this helps.
