在 SQL Server 中按年份顺序填充空白



我有一个表,其中包含AgePeriodYear列。列Age始终以 0 开头,并且没有固定的最大值(我在此示例中使用了"Age"0 到 30,但范围也可能是 0 到 100 等),值PeriodYear仅在特定年龄出现在某些行中。

但是,PeriodYear的值Age出现时,变化和解决方案应该是动态的。用正确的PeriodYear填充NULL值的最佳方法是什么?

我正在使用 SQL Server。

Age Period  Year
-----------------
0   NULL    NULL
1   NULL    NULL
2   NULL    NULL
3   NULL    NULL
4   NULL    NULL
5   NULL    NULL
6   NULL    NULL
7   NULL    NULL
8   NULL    NULL
9   NULL    NULL
10  NULL    NULL
11  NULL    NULL
12  NULL    NULL
13  NULL    NULL
14  NULL    NULL
15  NULL    NULL
16  NULL    NULL
17  NULL    NULL
18  NULL    NULL
19  NULL    NULL
20  NULL    NULL
21  46      2065
22  NULL    NULL
23  NULL    NULL
24  NULL    NULL
25  NULL    NULL
26  51      2070
27  NULL    NULL
28  NULL    NULL
29  NULL    NULL
30  NULL    NULL

结果应如下所示,PeriodYear的数字应从PeriodYear的最后一个已知值增加和/或减少。

Age Period  Year      
-----------------
0   25      2044
1   26      2045
2   27      2046
3   28      2047
4   29      2048
5   30      2049
6   31      2050
7   32      2051
8   33      2052
9   34      2053
10  35      2054
11  36      2055
12  37      2056
13  38      2057
14  39      2058
15  40      2059
16  41      2060
17  42      2061
18  43      2062
19  44      2063
20  45      2064
21  46      2065
22  47      2066
23  48      2067
24  49      2068
25  50      2069
26  51      2070
27  52      2071
28  53      2072
29  54      2073
30  55      2074

这是我问题的更新,因为我没有足够详细地指定我的要求: 该解决方案应该能够处理AgePeriodYear的不同组合。我的起点永远是一个已知的AgePeriodYear的组合。但是,在我的示例中,组合Age= 21、Period= 46 和Year= 2065(或 26|51|2070 作为第二个组合)不是静态的。Age= 21 处的值可以是任何值,例如Period= 2,Year= 2021。无论组合是什么(AgePeriodYear),解都应该填补空白,并完成序列从已知值上下计数PeriodYear。如果Period值序列变为负数,则解决方案应返回NULL值(如果可能)。

似乎你的年龄和年份总是有相同的增量 所以

select age, isnull(period,age +25) Period,  isnull(year,age+44) year  
from yourtable 

或标准函数合并(如戈登·林诺夫所建议的那样)

select age, coalesce(period,age +25) Period,  coalesce(year,age+44) year  
from yourtable 

Tabel 创建代码

create table yourtable ( AGE int ,  Period  int, Year int )
insert into  yourtable
Select  0    AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  1    AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  2    AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  3    AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  4    AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  5    AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  6    AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  7    AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  8    AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  9    AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  10   AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  11   AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  12   AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  13   AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  14   AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  15   AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  16   AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  17   AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  18   AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  19   AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  20   AS AGE , null  As Period   ,  null  As Year       UNION all 
Select  21   AS AGE ,46 As Period   ,2065    As Year       UNION all 
Select  22   AS AGE ,  null As Period   ,  null  As Year       UNION all 
Select  23   AS AGE ,  null As Period   ,  null  As Year       UNION all 
Select  24   AS AGE ,  null As Period   ,  null  As Year       UNION all 
Select  25   AS AGE ,   51  As Period   ,2070    As Year       UNION all 
Select  26   AS AGE ,  null As Period   ,  null  As Year       UNION all 
Select  27   AS AGE ,  null As Period   ,  null  As Year       UNION all 
Select  28   AS AGE ,  null As Period   ,  null  As Year       UNION all 
Select  29   AS AGE ,  null As Period   ,  null  As Year       UNION all 
Select  30   AS AGE ,  null As Period   ,  null  As Year      

**步骤**

  • 我们需要为期间和年份获取一行具有非空值的行。
  • 使用年龄获取两个列的第一个值。
  • 现在只需添加相应的年龄列值并填充整个表。

修复串行的代码

;with tmp as 
(select top 1 * from yourtable  where  Period is not null and  year is not null)
update yourtable
set Period =  (tmp.Period - tmp.age) + yourtable.age
, year =  (tmp.year - tmp.age) + yourtable.age
from yourtable , tmp

Declare @age int  ,@Year int  ,@Period int 
select @age = age , @Year = year - (age +1)  ,@Period  = Period- (AGE +1)
from yourtable where  Period is not null and  year is not null
update  yourtable
set Period =@Period + age
,Year =@year + age
from yourtable 

您最终需要三个具有不同起始值的序列。然后,您只需计算偏移量并将其添加到age

with cte as
(
select age
,max(period - age) over () + age as period -- adjusted period
,max(yr - age)     over () + age as yr     -- adjusted yr
from #yourtable
)
select age
-- If a Period value sequence becomes negative the solutions should return NULL
,case when period >0 then period end as period
,yr
from cte

见小提琴

-- 希望你能管理语法错误。 但是在这种情况下,下面给出的一些逻辑应该有效,我们可以将周期作为原点来计算其他缺失值。 祝你好运!

declare @knownperiod int;
declare @knownperiodage int;
declare @agetop int;
declare @agebottom int;
@knownperiod = select top 1 period from table1 where period is not null
@knownperiodage = select top 1 age from table1 where period is not null

while(@knownperiodage >= 0)
begin 
@knownperiod = @knownperiod -1 ;
@knownperiodage = @knownperiodage -1;
update table1 set period = @knownperiod, year = YEAR(GetDate())+@knownperiod-1  where age = @knownperiodage
end
-- now for bottom age 
@knownperiod = select top 1 period from table1 where period is null or year is null
@knownperiodage = select top 1 age from table1 where period is null or year is null
while(@knownperiodage <= (Select max(age) from table1))
begin 
@knownperiod = @knownperiod +1 ;
@knownperiodage = @knownperiodage +1;
update table1 set period = @knownperiod, year = YEAR(GetDate())+@knownperiod-1  where age = @knownperiodage
end

过程是否首先计算增量(年龄 -> 周期和年龄 -> 年),然后简单地将这些增量添加到年龄值? 这假设年龄和时期以及年龄和年份之间的差异在行之间是一致的(只是有时没有填写)。

因此,您可以使用以下内容首先计算增量(PeriodInc,YrInc),然后选择添加增量的值(请注意,如果周期变为负数,则为NULL)。

; WITH  PeriodInc AS (SELECT TOP 1 Period - Age AS PeriodInc FROM #yourtable WHERE Period IS NOT NULL),
YrInc AS (SELECT TOP 1 Yr - Age AS YrInc FROM #yourtable WHERE Yr IS NOT NULL)
SELECT      Age, 
CASE WHEN (Age + PeriodInc) >= 0 THEN (Age + PeriodInc) ELSE NULL END AS Period, 
Age + YrInc AS Yr
FROM    #yourtable
CROSS JOIN PeriodInc
CROSS JOIN YrInc

这是带有代码的DB_Fiddle

此解决方案需要 4 个输入:

  1. @list_length -- (整数)要生成的行数(最多 12^5=248,832)
  2. @start_age -- (整数)起始年龄
  3. @start_period -- (整数)开始周期
  4. @start_year -- (整数)起始年份

对于输入的任意组合,此代码生成请求的输出。 如果年龄或年份计算为负数,则将其转换为 NULL。 目前对列表长度的限制可以增加到必要的任何限制。 众所周知,在生成大序列时,使用交叉应用行创建row_number的技术非常快。 在大约 500 行以上,它总是比基于递归的 CTE 更快。 在较小的行数下,这两种技术之间的性能差异很小甚至没有差异。

下面是与示例数据匹配的代码和输出。

输入

declare
@list_length            int=31,
@start_age              int=21,
@start_period           int=46,
@start_year             int=2065;

法典

with
n(n) as (select * from (values (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) v(n)),
tally_cte(n) as (
select row_number() over (order by (select null)) 
from n n1 cross join n n2 cross join n n3 cross join n n4 cross join n n5)
select p.Age, 
case when p.[Period]<0 then null else p.[Period] end [Period], 
case when p.[Year]<0 then null else p.[Year] end [Year]
from tally_cte t
cross apply
(select (t.n-1) [Age], (t.n-1)+(@start_period-@start_age) [Period],
(t.n-1)+(@start_year-@start_age) [Year]) p
where n<=@list_length;

输出

Age Period  Year
0   25  2044
1   26  2045
2   27  2046
3   28  2047
4   29  2048
5   30  2049
6   31  2050
7   32  2051
8   33  2052
9   34  2053
10  35  2054
11  36  2055
12  37  2056
13  38  2057
14  39  2058
15  40  2059
16  41  2060
17  42  2061
18  43  2062
19  44  2063
20  45  2064
21  46  2065
22  47  2066
23  48  2067
24  49  2068
25  50  2069
26  51  2070
27  52  2071
28  53  2072
29  54  2073
30  55  2074

假设期间和年份都小于起始年龄。 当计算值为负数时,该值将替换为 NULL。

输入

declare
@list_length            int=100,
@start_age              int=10,
@start_period           int=5,
@start_year             int=8;

输出

Age Period  Year
0   NULL    NULL
1   NULL    NULL
2   NULL    0
3   NULL    1
4   NULL    2
5   0       3
6   1       4
7   2       5
8   3       6
9   4       7
10  5       8
11  6       9
12  7       10
...
99  94      97

Imo 这是满足所有要求的灵活有效的方法。 如果有任何问题,请告诉我。

这读起来就像一个间隙和孤岛问题,其中"空"行是间隙,非空行是孤岛。

你想填补空白。您的问题有点棘手,因为您没有清楚地描述当间隙行同时具有前后岛时如何进行 - 以及如果它们不一致该怎么办。

让我假设您希望从下一个岛(如果有可用)派生值,并回退到前面的岛。

下面是使用横向连接来检索下一个和前面的非空行的方法:

select t.age, 
coalesce(t.period, n.period - n.diff, p.period - p.diff) period,
coalesce(t.year,   n.year   - n.diff, p.year   - p.diff) year
from mytable t
outer apply (
select top (1) t1.*, t1.age - t.age diff
from mytable t1 
where t1.age > t.age and t1.period is not null and t1.year is not null
order by t1.age 
) n
outer apply (
select top (1) t1.*, t1.age - t.age diff
from mytable t1 
where t1.age < t.age and t1.period is not null and t1.year is not null
order by t1.age desc
) p
order by t.age

实际上,使用窗口函数可能会更有效地执行此操作。我们可以通过构建具有窗口计数的记录组来实现相同的逻辑,然后在组中进行计算:

select
age,
coalesce(
period,
max(period) over(partition by grp2) - max(age) over(partition by grp2) + age,
max(period) over(partition by grp1) - min(age) over(partition by grp1) + age
) period,
coalesce(
year,
max(year) over(partition by grp2) - max(age) over(partition by grp2) + age,
max(year) over(partition by grp1) - min(age) over(partition by grp1) + age
) year
from (
select t.*, 
count(period) over(order by age) grp1, 
count(period) over(order by age desc) grp2
from mytable t
) t
order by age

DB Fiddle 上的演示- 两个查询都会产生:

年龄 | 期间 | 年份 --: |-----: |---: 0 |    25 |2044 1 |    26 |2045 2 |    27 |2046 3 |    28 |2047 4 |    29 |2048 5 |    30 |2049 6 |    31 |2050 7 |    32 |2051 8 |    33 |2052 9 |    34 |2053  10 |    35 |2054  11 |    36 |2055  12 |    37 |2056  13 |    38 |2057  14 |    39 |2058  15 |    40 |2059  16 |    41 |2060  17 |    42 |2061  18 |    43 |2062  19 |    44 |2063  20 |    45 |2064  21 |    46 |2065  22 |    47 |2066  23 |    48 |2067  24 |    49 |2068  25 |    50 |2069  26 |    51 |2070  27 |    52 |2071  28 |    53 |2072  29 |    54 |2073  30 |    55 |2074

您也可以使用递归 CTE(它可以处理表中数据的任何变体,除了一个根本没有填充周期和年份的数据):

WITH cte AS ( -- get any filled period and year
SELECT TOP 1 period - age delta,
[year]-period start_year
FROM tablename
WHERE period is not null and [year] is not null
), seq AS ( --get min and max age values
SELECT MIN(age) as min_age, MAX(age) as max_age
FROM tablename
), go_recursive AS (
SELECT min_age age,
min_age+delta period ,
start_year+min_age+delta year,
max_age
FROM seq
CROSS JOIN cte --That will generate the initial first row
UNION ALL
SELECT age + 1,
period +1,
year + 1,
max_age
FROM go_recursive 
WHERE age < max_age --This part increments the data from first row
)

SELECT age,
period,
[year] 
FROM go_recursive 
OPTION (MAXRECURSION 0) 
-- If you know there are some limit of rows in that kind of tables 
--use this row count instead 0

相关内容

  • 没有找到相关文章

最新更新