带有开始和结束日期的 SQL 查询 - 最佳选择是什么 - SQL query with start and end dates

我在工作中使用MS SQL Server 2005来构建数据库。有人告诉我，大多数表在构建后不久将容纳 1,000,000 到 500,000,000 行数据......我没有处理过这么大的数据集。大多数时候，我什至不知道我应该考虑什么来找出设置模式、查询等方法的最佳答案。

所以。。。我需要知道某事的开始和结束日期以及与该时间范围内 ID 关联的值。所以。。。我们可以用两种不同的方式将表格整理出来：

create table xxx_test2 (id int identity(1,1), groupid int, dt datetime, i int) 
create table xxx_test2 (id int identity(1,1), groupid int, start_dt datetime, end_dt datetime, i int)

哪个更好？如何更好地定义？我用大约 100,000 行数据填充了第一个表，根据查询的不同，以第二个表的格式设置大约需要 10-12 秒......

    select  y.groupid,
            y.dt as [start], 
            z.dt as [end],   
            (case when z.dt is null then 1 else 0 end) as latest, 
            y.i 
    from    #x as y 
            outer apply (select top 1 * 
                            from    #x as x 
                            where   x.groupid = y.groupid and 
                                    x.dt > y.dt 
                            order by x.dt asc) as z

或
http://consultingblogs.emc.com/jamiethomson/archive/2005/01/10/t-sql-deriving-start-and-end-date-from-a-single-effective-date.aspx

呜...与第二张桌子....要插入新行，我必须去查看是否有前一行，如果是，请更新其结束日期。所以。。。检索数据与插入/更新内容时是性能问题吗？两次存储结束日期似乎很愚蠢，但也许......不？我应该看什么东西？

这就是我用来生成假数据的方法...如果你出于某种原因想玩它（如果你将随机数的最大值更改为更高的值，它会更快地生成假东西）：

declare @dt datetime
declare @i int
declare @id int
set @id = 1
declare @rowcount int
set @rowcount = 0
declare @numrows int 
while (@rowcount<100000)
begin
set @i = 1
set @dt = getdate()
set @numrows = Cast(((5 + 1) - 1) * 
                Rand() + 1 As tinyint)
while @i<=@numrows
    begin
    insert into #x values (@id, dateadd(d,@i,@dt), @i)
    set @i = @i + 1
    end 
set @rowcount = @rowcount + @numrows
set @id = @id + 1
print @rowcount
end

出于您的目的，我认为选项 2 是表格设计的方法。这为您提供了灵活性，并为您节省了大量工作。

具有生效日期和结束日期将允许您拥有一个查询，该查询将仅通过在 where 子句中包含以下内容来返回当前有效的数据：

where sysdate between effectivedate and enddate

然后，您还可以使用它以时间敏感的方式与其他表联接。

只要正确设置了密钥并提供正确的索引，性能（至少在此表上）应该不是问题。

对于可以使用SQL Server 2012（或Oracle，DB2等）的LEAD分析功能的任何人来说，从第一个表（仅使用1个日期列）检索数据将比没有此功能快得多：

select
  groupid,
  dt "start",
  lead(dt) over (partition by groupid order by dt) "end",
  case when lead(dt) over (partition by groupid order by dt) is null
       then 1 else 0 end "latest",
  i
from x

带有开始和结束日期的 SQL 查询 - 最佳选择是什么

相关内容

最新更新

热门标签：