在SQL Server中加快划分日期间隔



我需要将某些日期间隔除以几分钟。(例如,2012-01-01 10:00-2012-01-01 10:00间隔应分为2012-01-01 10:01,2012-12-01-01 10:02,... 2012-01-01 10:10)。例如,有一个表

CREATE TABLE [dbo].[Events](
    [ID] [int] IDENTITY(1,1) NOT NULL,
    [EventStart] [datetime] NOT NULL,
    [EventEnd] [datetime] NOT NULL,
    [Amount] [float] NOT NULL,
 CONSTRAINT [PK_Events] PRIMARY KEY CLUSTERED 
(
    [ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

这个表填充了

DECLARE @i integer = 0;
DECLARE @initial_date datetime = '2012-01-01';
WHILE @i < 50000
BEGIN
    INSERT INTO [Events] (EventStart, EventEnd, Amount) VALUES (DATEADD(MINUTE, 10*@i, @initial_date), DATEADD(MINUTE, 10*(@i + 1), @initial_date), @i);
    SET @i = @i + 1;
END

因此,我们有很多10分钟的间隔。

将其除以几分钟,我使用以下递归CTE:

DECLARE @start_date datetime = '2012-01-01';
DECLARE @end_date datetime = '2013-01-02';

WITH Date_Ranges (StatDate, Amount, IntervalStart, CurrentMinute) AS (
  SELECT 
    DATEADD(MINUTE, 0,  ev.EventStart) AS StatDate, ev.Amount, ev.EventStart AS IntervalStart, 1 AS CurrentMinute
  FROM [Events] ev
  WHERE ev.EventStart BETWEEN @start_date AND @end_date
  UNION ALL
  SELECT 
    DATEADD(MINUTE, CurrentMinute, ev.EventStart), ev.Amount, ev.EventStart AS IntervalStart, CurrentMinute + 1
  FROM [Events] ev
  INNER JOIN Date_Ranges ranges ON (ranges.IntervalStart = ev.EventStart AND 
    ranges.StatDate >= ev.EventStart AND 
    ranges.StatDate < ev.EventEnd)
    WHERE DATEADD(MINUTE, CurrentMinute, ev.EventStart) BETWEEN @start_date AND @end_date AND
        ev.EventStart BETWEEN @start_date AND @end_date
) 
SELECT *
FROM Date_Ranges --ORDER BY StatDate

主要问题是该递归CTE的执行太慢。

那么,我该如何加快速度?

这大约是递归cte的时间的所有550,000行。

DECLARE @start_date datetime = '2012-01-01'; 
DECLARE @end_date datetime = '2013-01-02';
SELECT  DATEADD(MINUTE, x.number, ev.EventStart) AS StartDate, 
        ev.Amount, 
        ev.EventStart as IntervalStart, 
        x.number as CurrentMinute
FROM    master.dbo.spt_values x
CROSS JOIN Events ev
WHERE   x.type = 'P'        
AND     x.number <= DATEDIFF(MINUTE, ev.EventStart, ev.EventEnd)
AND     ev.EventStart BETWEEN @start_date and @end_date

我认为最快的存储桶将是一个表。创建一个10分钟的桶,填充它,然后在其上加入。这完全避免了递归,并利用了SQL DBM的一件事,真的很擅长 - Joins。

10年的10分钟桶仅为50万行。

当您处理诸如材料清单之类的东西时,

cte中的递归是一件好事。但这并不总是适合桌子的替代品。


我创建了一张覆盖10年的10分钟水桶的桌子。(这是大约4兆字节的数据。我没有尝试计算多少磁盘索引和行架空开销。)然后,我创建了一个包含2000万个随机时间戳的测试数据表,所有这些都与桶的表相同的10年内。

添加适合该问题的索引后,测试系统在大约100ms中"随机数据"随机数据"。(PostgreSQL DBMS不进行调整,在一台5岁的Dell计算机上运行,带有1次RAM。我在这里的Linux系统,所以我无法测试SQL Server本身。但是,我希望结果类似。)

最新更新