SQL Server: "Rank" or "Left join with own table" to filter the latest records



I have the following table in my database, with about 10 million records (this may double within a year):

create table PropertyOwners (
    [Key] int not null primary key,
    PropertyKey int not null, 
    BoughtDate DateTime, 
    OwnerKey int not null
)
go

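The table above records every property that an owner has held at some point in time. I want to find the owners who currently hold more than a certain number of properties, say more than 1000 at once. I wrote two different queries, one using "Rank" and the other using "Left join with own table".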

Using Rank (takes about 4 seconds):

select OwnerKey, COUNT(1) PropertyCount 
from (
    select PropertyKey, OwnerKey, BoughtDate,
        RANK() over (partition by PropertyKey order by BoughtDate desc) as [Rank]
    from dbo.PropertyOwners 
) result
where [Rank]=1
group by OwnerKey
having COUNT(1)>1000

Using a left join on the same table (takes about 10 seconds):

select OwnerKey, COUNT(1) PropertyCount 
from (
    select po.PropertyKey, po.OwnerKey, po.BoughtDate
    from dbo.PropertyOwners po
    left join dbo.PropertyOwners lo on lo.PropertyKey = po.PropertyKey
    and lo.BoughtDate > po.BoughtDate
    where lo.PropertyKey is null
) result
group by OwnerKey
having COUNT(1)>1000

Both query times are unacceptably slow. Can anyone help me rewrite the query? My table has the following index:

CREATE NONCLUSTERED INDEX [IX_PropertyKey_BounghtDate] ON [dbo].[PropertyOwners] 
(
    [PropertyKey] ASC,
    [BoughtDate] DESC
)
INCLUDE ( [OwnerKey]) WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO

You could rewrite it as follows (this may improve performance):

select OwnerKey, COUNT(1) PropertyCount 
from (
    select PropertyKey, MAX( BoughtDate) BoughtDate
    from dbo.PropertyOwners 
    Group by PropertyKey
) result INNER JOIN dbo.PropertyOwners po ON po.PropertyKey=result.PropertyKey and PO.boughtDate=result.boughtdate
group by OwnerKey
having COUNT(1)>1000

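You have a fair amount of data, and much of it has to be processed to get the result. Aaron Bertrand's analysis does not cover exactly your problem, but it may help you.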

With your supporting index in place, I suggest you try the not exists approach:

select OwnerKey, count(*) as PropertyCount
from PropertyOwners po
where not exists (select 1
                  from PropertyOwners po2
                  where po2.PropertyKey = po.PropertyKey and
                        po2.BoughtDate > po.BoughtDate
                 )
group by OwnerKey
having count(*) > 1000;

If you cannot get the query to run fast enough, you may need to upgrade your hardware or use triggers to keep a summary table up to date.
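The summary-table idea could be sketched roughly as below. This is only an illustration under stated assumptions: the table dbo.CurrentPropertyOwner and the trigger dbo.trg_PropertyOwners_Insert are hypothetical names that do not appear in the question, and the trigger handles inserts only (updates and deletes on PropertyOwners would need their own handling).

```sql
-- Hypothetical summary table: one row per property, holding its latest owner.
create table dbo.CurrentPropertyOwner (
    PropertyKey int not null primary key,
    OwnerKey int not null,
    BoughtDate datetime null
);
go

-- One-time backfill from the existing rows (latest purchase per property).
insert into dbo.CurrentPropertyOwner (PropertyKey, OwnerKey, BoughtDate)
select PropertyKey, OwnerKey, BoughtDate
from (
    select PropertyKey, OwnerKey, BoughtDate,
        ROW_NUMBER() over (partition by PropertyKey order by BoughtDate desc) as rn
    from dbo.PropertyOwners
) ranked
where rn = 1;
go

-- Hypothetical trigger: keeps the summary table current on inserts only.
create trigger dbo.trg_PropertyOwners_Insert
on dbo.PropertyOwners
after insert
as
begin
    set nocount on;

    -- A multi-row insert may touch the same property more than once,
    -- so keep only the newest inserted row per property.
    with latest as (
        select PropertyKey, OwnerKey, BoughtDate,
            ROW_NUMBER() over (partition by PropertyKey order by BoughtDate desc) as rn
        from inserted
    )
    merge dbo.CurrentPropertyOwner as tgt
    using (select PropertyKey, OwnerKey, BoughtDate from latest where rn = 1) as src
        on tgt.PropertyKey = src.PropertyKey
    -- Rows with a NULL BoughtDate never replace an existing row here.
    when matched and src.BoughtDate > tgt.BoughtDate then
        update set OwnerKey = src.OwnerKey, BoughtDate = src.BoughtDate
    when not matched by target then
        insert (PropertyKey, OwnerKey, BoughtDate)
        values (src.PropertyKey, src.OwnerKey, src.BoughtDate);
end
go

-- The report then only has to group one row per property.
select OwnerKey, count(*) as PropertyCount
from dbo.CurrentPropertyOwner
group by OwnerKey
having count(*) > 1000;
```

The trade-off is that every insert into PropertyOwners pays a small extra cost so that the report no longer has to find the latest row per property at query time.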

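Grouping is never fast. If you run the query often enough, you can look at the indexes SQL Server would suggest; search for diagnostic queries that involve sys.dm_db_index_usage_stats, as they will be of some help.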
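As a starting point, a minimal query against that DMV might look like the sketch below. It only reports how the existing indexes on dbo.PropertyOwners have been used since the counters were last reset (for example by a server restart); it is not one of the published diagnostic scripts mentioned above.

```sql
-- Usage counters for every index on dbo.PropertyOwners in the current database.
-- Indexes that have not been used since the counters were reset show NULLs.
select i.name as IndexName,
       s.user_seeks,
       s.user_scans,
       s.user_lookups,
       s.user_updates
from sys.indexes i
left join sys.dm_db_index_usage_stats s
    on s.object_id = i.object_id
   and s.index_id = i.index_id
   and s.database_id = DB_ID()
where i.object_id = OBJECT_ID('dbo.PropertyOwners');
```

A high user_scans count with few user_seeks on IX_PropertyKey_BounghtDate would suggest the queries read the whole index rather than seeking into it, which is expected for a report that has to visit every property.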

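Another option, already suggested above, is to build a summary table. A somewhat lighter-weight solution is an indexed view, but you must understand the impact that creating it will have.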
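One caveat worth keeping in mind: indexed views do not allow MAX, window functions, subqueries, self-joins or outer joins, so the "latest row per property" logic cannot live inside the view itself. The sketch below therefore assumes a hypothetical table dbo.CurrentPropertyOwner that already holds one row per property with its current owner, and indexes only the per-owner count; the view and index names are likewise made up for illustration.

```sql
-- Assumption: a hypothetical table with one row per property and its current owner.
if OBJECT_ID('dbo.CurrentPropertyOwner') is null
    create table dbo.CurrentPropertyOwner (
        PropertyKey int not null primary key,
        OwnerKey int not null
    );
go

-- Indexed views require SCHEMABINDING, two-part names and COUNT_BIG(*) with GROUP BY.
create view dbo.OwnerPropertyCount
with schemabinding
as
select OwnerKey, COUNT_BIG(*) as PropertyCount
from dbo.CurrentPropertyOwner
group by OwnerKey;
go

-- The unique clustered index is what materializes the view.
create unique clustered index IX_OwnerPropertyCount
    on dbo.OwnerPropertyCount (OwnerKey);
go

-- NOEXPAND makes the query read the materialized view directly.
select OwnerKey, PropertyCount
from dbo.OwnerPropertyCount with (noexpand)
where PropertyCount > 1000;
```

The impact mentioned above is that every change to the underlying table also has to maintain the view's index, which adds write overhead and can increase contention on rows for owners with many properties.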
