我有一个表,其中包含Age
、Period
和Year
列。列Age
始终以 0 开头,并且没有固定的最大值(我在此示例中使用了"Age"0 到 30,但范围也可能是 0 到 100 等),值Period
和Year
仅在特定年龄出现在某些行中。
但是,Period
和Year
的值Age
出现时,变化和解决方案应该是动态的。用正确的Period
和Year
填充NULL
值的最佳方法是什么?
我正在使用 SQL Server。
Age Period Year
-----------------
0 NULL NULL
1 NULL NULL
2 NULL NULL
3 NULL NULL
4 NULL NULL
5 NULL NULL
6 NULL NULL
7 NULL NULL
8 NULL NULL
9 NULL NULL
10 NULL NULL
11 NULL NULL
12 NULL NULL
13 NULL NULL
14 NULL NULL
15 NULL NULL
16 NULL NULL
17 NULL NULL
18 NULL NULL
19 NULL NULL
20 NULL NULL
21 46 2065
22 NULL NULL
23 NULL NULL
24 NULL NULL
25 NULL NULL
26 51 2070
27 NULL NULL
28 NULL NULL
29 NULL NULL
30 NULL NULL
结果应如下所示,Period
和Year
的数字应从Period
和Year
的最后一个已知值增加和/或减少。
Age Period Year
-----------------
0 25 2044
1 26 2045
2 27 2046
3 28 2047
4 29 2048
5 30 2049
6 31 2050
7 32 2051
8 33 2052
9 34 2053
10 35 2054
11 36 2055
12 37 2056
13 38 2057
14 39 2058
15 40 2059
16 41 2060
17 42 2061
18 43 2062
19 44 2063
20 45 2064
21 46 2065
22 47 2066
23 48 2067
24 49 2068
25 50 2069
26 51 2070
27 52 2071
28 53 2072
29 54 2073
30 55 2074
这是我问题的更新,因为我没有足够详细地指定我的要求: 该解决方案应该能够处理Age
、Period
和Year
的不同组合。我的起点永远是一个已知的Age
、Period
和Year
的组合。但是,在我的示例中,组合Age
= 21、Period
= 46 和Year
= 2065(或 26|51|2070 作为第二个组合)不是静态的。Age
= 21 处的值可以是任何值,例如Period
= 2,Year
= 2021。无论组合是什么(Age
、Period
、Year
),解都应该填补空白,并完成序列从已知值上下计数Period
和Year
。如果Period
值序列变为负数,则解决方案应返回NULL
值(如果可能)。
似乎你的年龄和年份总是有相同的增量 所以
select age, isnull(period,age +25) Period, isnull(year,age+44) year
from yourtable
或标准函数合并(如戈登·林诺夫所建议的那样)
select age, coalesce(period,age +25) Period, coalesce(year,age+44) year
from yourtable
Tabel 创建代码
create table yourtable ( AGE int , Period int, Year int )
insert into yourtable
Select 0 AS AGE , null As Period , null As Year UNION all
Select 1 AS AGE , null As Period , null As Year UNION all
Select 2 AS AGE , null As Period , null As Year UNION all
Select 3 AS AGE , null As Period , null As Year UNION all
Select 4 AS AGE , null As Period , null As Year UNION all
Select 5 AS AGE , null As Period , null As Year UNION all
Select 6 AS AGE , null As Period , null As Year UNION all
Select 7 AS AGE , null As Period , null As Year UNION all
Select 8 AS AGE , null As Period , null As Year UNION all
Select 9 AS AGE , null As Period , null As Year UNION all
Select 10 AS AGE , null As Period , null As Year UNION all
Select 11 AS AGE , null As Period , null As Year UNION all
Select 12 AS AGE , null As Period , null As Year UNION all
Select 13 AS AGE , null As Period , null As Year UNION all
Select 14 AS AGE , null As Period , null As Year UNION all
Select 15 AS AGE , null As Period , null As Year UNION all
Select 16 AS AGE , null As Period , null As Year UNION all
Select 17 AS AGE , null As Period , null As Year UNION all
Select 18 AS AGE , null As Period , null As Year UNION all
Select 19 AS AGE , null As Period , null As Year UNION all
Select 20 AS AGE , null As Period , null As Year UNION all
Select 21 AS AGE ,46 As Period ,2065 As Year UNION all
Select 22 AS AGE , null As Period , null As Year UNION all
Select 23 AS AGE , null As Period , null As Year UNION all
Select 24 AS AGE , null As Period , null As Year UNION all
Select 25 AS AGE , 51 As Period ,2070 As Year UNION all
Select 26 AS AGE , null As Period , null As Year UNION all
Select 27 AS AGE , null As Period , null As Year UNION all
Select 28 AS AGE , null As Period , null As Year UNION all
Select 29 AS AGE , null As Period , null As Year UNION all
Select 30 AS AGE , null As Period , null As Year
**步骤**
- 我们需要为期间和年份获取一行具有非空值的行。
- 使用年龄获取两个列的第一个值。
- 现在只需添加相应的年龄列值并填充整个表。
修复串行的代码
;with tmp as
(select top 1 * from yourtable where Period is not null and year is not null)
update yourtable
set Period = (tmp.Period - tmp.age) + yourtable.age
, year = (tmp.year - tmp.age) + yourtable.age
from yourtable , tmp
或
Declare @age int ,@Year int ,@Period int
select @age = age , @Year = year - (age +1) ,@Period = Period- (AGE +1)
from yourtable where Period is not null and year is not null
update yourtable
set Period =@Period + age
,Year =@year + age
from yourtable
您最终需要三个具有不同起始值的序列。然后,您只需计算偏移量并将其添加到age
:
with cte as
(
select age
,max(period - age) over () + age as period -- adjusted period
,max(yr - age) over () + age as yr -- adjusted yr
from #yourtable
)
select age
-- If a Period value sequence becomes negative the solutions should return NULL
,case when period >0 then period end as period
,yr
from cte
见小提琴
-- 希望你能管理语法错误。 但是在这种情况下,下面给出的一些逻辑应该有效,我们可以将周期作为原点来计算其他缺失值。 祝你好运!
declare @knownperiod int;
declare @knownperiodage int;
declare @agetop int;
declare @agebottom int;
@knownperiod = select top 1 period from table1 where period is not null
@knownperiodage = select top 1 age from table1 where period is not null
while(@knownperiodage >= 0)
begin
@knownperiod = @knownperiod -1 ;
@knownperiodage = @knownperiodage -1;
update table1 set period = @knownperiod, year = YEAR(GetDate())+@knownperiod-1 where age = @knownperiodage
end
-- now for bottom age
@knownperiod = select top 1 period from table1 where period is null or year is null
@knownperiodage = select top 1 age from table1 where period is null or year is null
while(@knownperiodage <= (Select max(age) from table1))
begin
@knownperiod = @knownperiod +1 ;
@knownperiodage = @knownperiodage +1;
update table1 set period = @knownperiod, year = YEAR(GetDate())+@knownperiod-1 where age = @knownperiodage
end
过程是否首先计算增量(年龄 -> 周期和年龄 -> 年),然后简单地将这些增量添加到年龄值? 这假设年龄和时期以及年龄和年份之间的差异在行之间是一致的(只是有时没有填写)。
因此,您可以使用以下内容首先计算增量(PeriodInc,YrInc),然后选择添加增量的值(请注意,如果周期变为负数,则为NULL)。
; WITH PeriodInc AS (SELECT TOP 1 Period - Age AS PeriodInc FROM #yourtable WHERE Period IS NOT NULL),
YrInc AS (SELECT TOP 1 Yr - Age AS YrInc FROM #yourtable WHERE Yr IS NOT NULL)
SELECT Age,
CASE WHEN (Age + PeriodInc) >= 0 THEN (Age + PeriodInc) ELSE NULL END AS Period,
Age + YrInc AS Yr
FROM #yourtable
CROSS JOIN PeriodInc
CROSS JOIN YrInc
这是带有代码的DB_Fiddle
此解决方案需要 4 个输入:
- @list_length -- (整数)要生成的行数(最多 12^5=248,832)
- @start_age -- (整数)起始年龄
- @start_period -- (整数)开始周期
- @start_year -- (整数)起始年份
对于输入的任意组合,此代码生成请求的输出。 如果年龄或年份计算为负数,则将其转换为 NULL。 目前对列表长度的限制可以增加到必要的任何限制。 众所周知,在生成大序列时,使用交叉应用行创建row_number的技术非常快。 在大约 500 行以上,它总是比基于递归的 CTE 更快。 在较小的行数下,这两种技术之间的性能差异很小甚至没有差异。
下面是与示例数据匹配的代码和输出。
输入
declare
@list_length int=31,
@start_age int=21,
@start_period int=46,
@start_year int=2065;
法典
with
n(n) as (select * from (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) v(n)),
tally_cte(n) as (
select row_number() over (order by (select null))
from n n1 cross join n n2 cross join n n3 cross join n n4 cross join n n5)
select p.Age,
case when p.[Period]<0 then null else p.[Period] end [Period],
case when p.[Year]<0 then null else p.[Year] end [Year]
from tally_cte t
cross apply
(select (t.n-1) [Age], (t.n-1)+(@start_period-@start_age) [Period],
(t.n-1)+(@start_year-@start_age) [Year]) p
where n<=@list_length;
输出
Age Period Year
0 25 2044
1 26 2045
2 27 2046
3 28 2047
4 29 2048
5 30 2049
6 31 2050
7 32 2051
8 33 2052
9 34 2053
10 35 2054
11 36 2055
12 37 2056
13 38 2057
14 39 2058
15 40 2059
16 41 2060
17 42 2061
18 43 2062
19 44 2063
20 45 2064
21 46 2065
22 47 2066
23 48 2067
24 49 2068
25 50 2069
26 51 2070
27 52 2071
28 53 2072
29 54 2073
30 55 2074
假设期间和年份都小于起始年龄。 当计算值为负数时,该值将替换为 NULL。
输入
declare
@list_length int=100,
@start_age int=10,
@start_period int=5,
@start_year int=8;
输出
Age Period Year
0 NULL NULL
1 NULL NULL
2 NULL 0
3 NULL 1
4 NULL 2
5 0 3
6 1 4
7 2 5
8 3 6
9 4 7
10 5 8
11 6 9
12 7 10
...
99 94 97
Imo 这是满足所有要求的灵活有效的方法。 如果有任何问题,请告诉我。
这读起来就像一个间隙和孤岛问题,其中"空"行是间隙,非空行是孤岛。
你想填补空白。您的问题有点棘手,因为您没有清楚地描述当间隙行同时具有前后岛时如何进行 - 以及如果它们不一致该怎么办。
让我假设您希望从下一个岛(如果有可用)派生值,并回退到前面的岛。
下面是使用横向连接来检索下一个和前面的非空行的方法:
select t.age,
coalesce(t.period, n.period - n.diff, p.period - p.diff) period,
coalesce(t.year, n.year - n.diff, p.year - p.diff) year
from mytable t
outer apply (
select top (1) t1.*, t1.age - t.age diff
from mytable t1
where t1.age > t.age and t1.period is not null and t1.year is not null
order by t1.age
) n
outer apply (
select top (1) t1.*, t1.age - t.age diff
from mytable t1
where t1.age < t.age and t1.period is not null and t1.year is not null
order by t1.age desc
) p
order by t.age
实际上,使用窗口函数可能会更有效地执行此操作。我们可以通过构建具有窗口计数的记录组来实现相同的逻辑,然后在组中进行计算:
select
age,
coalesce(
period,
max(period) over(partition by grp2) - max(age) over(partition by grp2) + age,
max(period) over(partition by grp1) - min(age) over(partition by grp1) + age
) period,
coalesce(
year,
max(year) over(partition by grp2) - max(age) over(partition by grp2) + age,
max(year) over(partition by grp1) - min(age) over(partition by grp1) + age
) year
from (
select t.*,
count(period) over(order by age) grp1,
count(period) over(order by age desc) grp2
from mytable t
) t
order by age
DB Fiddle 上的演示- 两个查询都会产生:
年龄 | 期间 | 年份 --: |-----: |---: 0 | 25 |2044 1 | 26 |2045 2 | 27 |2046 3 | 28 |2047 4 | 29 |2048 5 | 30 |2049 6 | 31 |2050 7 | 32 |2051 8 | 33 |2052 9 | 34 |2053 10 | 35 |2054 11 | 36 |2055 12 | 37 |2056 13 | 38 |2057 14 | 39 |2058 15 | 40 |2059 16 | 41 |2060 17 | 42 |2061 18 | 43 |2062 19 | 44 |2063 20 | 45 |2064 21 | 46 |2065 22 | 47 |2066 23 | 48 |2067 24 | 49 |2068 25 | 50 |2069 26 | 51 |2070 27 | 52 |2071 28 | 53 |2072 29 | 54 |2073 30 | 55 |2074
您也可以使用递归 CTE(它可以处理表中数据的任何变体,除了一个根本没有填充周期和年份的数据):
WITH cte AS ( -- get any filled period and year
SELECT TOP 1 period - age delta,
[year]-period start_year
FROM tablename
WHERE period is not null and [year] is not null
), seq AS ( --get min and max age values
SELECT MIN(age) as min_age, MAX(age) as max_age
FROM tablename
), go_recursive AS (
SELECT min_age age,
min_age+delta period ,
start_year+min_age+delta year,
max_age
FROM seq
CROSS JOIN cte --That will generate the initial first row
UNION ALL
SELECT age + 1,
period +1,
year + 1,
max_age
FROM go_recursive
WHERE age < max_age --This part increments the data from first row
)
SELECT age,
period,
[year]
FROM go_recursive
OPTION (MAXRECURSION 0)
-- If you know there are some limit of rows in that kind of tables
--use this row count instead 0