给定一个表Orders
,其列为:
id | revision | insertedAt
1 0 2016-01-01 00:00.000
1 1 2016-01-01 02:00.000
2 0 2016-01-01 02:00.000
其中id、版本组合是唯一的
我怎样才能最好地迁移到这个:
id | revision | applyFrom | applyTo
1 0 2016-01-01 00:00.000 2016-01-01 01:99.999
1 1 2016-01-01 02:00.000 9999-31-12 00:00.000
2 0 2016-01-01 02:00.000 9999-31-12 00:00.000
我已经尝试过迭代CURSOR
并在我进行的过程中更新。
UPDATE orders SET applyFrom = @newApplyFrom, applyTo = @newApplyTo
WHERE id = @id AND revision = @revision;
但是对于2.26亿行,即使达到索引,估计运行时间也接近60小时。
是否有更快的方法达到相同的结果?我可以根据需要添加索引。目前,在(id, revision)上有一个聚类索引
你可以这样更新:我使用铅和显示与选择
;with cte as (
select *, lead(insertedAt,1,'9999-12-31 00:00.000') over(order by id) migdate from Orders
)
select *, case when insertedAt = migdate then '9999-12-31 00:00.000' else DATEADD(S, -1, migdate) end as applyto from cte
这是一个包含LEAD
和自连接的版本。不确定在大数据集上的性能,但我已经包含了批处理,以防万一。
WITH cte AS (
SELECT
id,
revision,
insertedAt,
applyFrom,
applyTo,
LEAD(insertedAt) OVER (PARTITION BY id ORDER BY id, revision) AS newApplyTo
FROM orders
)
UPDATE TOP (@BatchSize) o SET
applyFrom = o.insertedAt,
applyTo = ISNULL(DATEADD(s, -1, o.newApplyTo), '9999-12-31')
FROM cte o
WHERE
o.applyFrom IS NULL AND
o.applyTo IS NULL;
我使用的数据集(带结果)是:
Id revision insertedAt applyFrom applyTo
----------- ----------- --------------------------- --------------------------- ---------------------------
1 0 2016-01-01 00:00:00.0000000 2016-01-01 00:00:00.0000000 2016-01-01 01:59:59.0000000
1 1 2016-01-01 02:00:00.0000000 2016-01-01 02:00:00.0000000 9999-12-31 00:00:00.0000000
2 0 2016-01-01 02:00:00.0000000 2016-01-01 02:00:00.0000000 9999-12-31 00:00:00.0000000
3 0 2016-01-01 00:00:00.0000000 2016-01-01 00:00:00.0000000 2016-10-31 23:59:59.0000000
3 1 2016-11-01 00:00:00.0000000 2016-11-01 00:00:00.0000000 2016-11-30 23:59:59.0000000
3 2 2016-12-01 00:00:00.0000000 2016-12-01 00:00:00.0000000 9999-12-31 00:00:00.0000000