如何在OVER()窗口中获取相邻值



我有以下数据和查询,以获得从MAX(wins)到当前季节的季节:

WITH results as (
SELECT 'DAL' as team, 2010 as season, 6 as wins union
SELECT 'DET' as team, 2010 as season, 6 as wins union
SELECT 'DET' as team, 2011 as season, 10 as wins union
SELECT 'DET' as team, 2012 as season, 4 as wins union
SELECT 'DET' as team, 2013 as season, 7 as wins union
SELECT 'DET' as team, 2014 as season, 11 as wins
) SELECT team, season, wins
,MAX(wins) OVER (PARTITION BY team ORDER BY season ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) max_wins_thus_far
FROM results;
# team, season, wins, max_wins_thus_far
DAL, 2010, 6, 6
DET, 2010, 6, 6
DET, 2011, 10, 10
DET, 2012, 4, 10
DET, 2013, 7, 10
DET, 2014, 11, 11

这里我们可以看到,例如,对于DET,2011具有10的最大胜利,因此";max_ wins";从2011年到2014年,列为10,此时它取11的较大值。然而,我想在赛季取得最高的总胜利。例如,以下是结果的样子:

# team, season, wins, max_wins_thus_far, season_with_max_wins_thus_far
DAL, 2010, 6, 6, 2010
DET, 2010, 6, 6, 2010
DET, 2011, 10, 10, 2011 <-- 2011 has the most wins for DET
DET, 2012, 4, 10, 2011
DET, 2013, 7, 10, 2011
DET, 2014, 11, 11, 2014 <-- now 2014 is the season with the most wins...

在分析函数中如何做到这一点?我所能做的最好的事情就是用数据构建一个对象,但不确定从哪里开始:

# team, season, wins, max_wins_thus_far
DAL, 2010, 6, {"2010": 6}
DET, 2010, 6, {"2010": 6}
DET, 2011, 10, {"2010": 6, "2011": 10}
DET, 2012, 4, {"2010": 6, "2011": 10, "2012": 4}
DET, 2013, 7, {"2010": 6, "2011": 10, "2012": 4, "2013": 7}
DET, 2014, 11, {"2010": 6, "2011": 10, "2012": 4, "2013": 7, "2014": 11}

您可以使用第二级窗口函数。只要抓住最近的一个赛季,其中的胜利是最大的胜利:

SELECT r.*,
MAX(CASE WHEN wins = max_wins_thus_far THEN season END) OVER (PARTITION BY team ORDER BY season) as max_season
FROM (SELECT team, season, wins,
MAX(wins) OVER (PARTITION BY team ORDER BY season) as max_wins_thus_far
FROM results
) r;

这里有一个db<gt;不停摆弄

我们可以使用一些间隙和孤岛技术:其思想是建立一组"相邻的";具有窗口和的记录,该窗口和在每次满足大于所有先前值的win时递增。然后,我们可以使用窗口min()来恢复相应的季节(基本上,这是每个岛屿的开始(。

select team, season, wins, 
greatest(wins, max_wins_1) max_wins_thus_far,
min(season) over(partition by team, grp order by season) as season_with_max_wins_thus_far
from (
select r.*,
sum(case when wins > max_wins_1 then 1 else 0 end) 
over(partition by team order by season) as grp
from (
select r.*,
max(wins) over (
partition by team 
order by season 
rows between unbounded preceding and 1 preceding
) as max_wins_1
from results r
) r
) r

另一种选择是相应的子查询:

select team, season, wins, 
max(wins) over(partition by team order by season) as max_wins_thus_far,
(
select r1.season
from results r1 
where r1.team = r.team and r1.season <= r.season
order by r1.wins desc, r1.season
limit 1
) as season_with_max_wins_thus_far
from results r

DB Fiddlde上的演示-两个查询都产生:

team|season|wins|max_wins_thus_far|season_with_max_wins_hus_far:---|-----:|----:|------------------:|--------------------:DAL|2010|6|6|20102010年6月6日2011年10月10日DET | 2012 | 4 | 10 | 20112013年11月7日| 2011年10月10日DET | 2014 | 11 | 11 | 2014

这当然是最有效的方法,但考虑到赛季和胜利都是数字,我们可以将它们相加并获得它们的最大值(类似于2024,将201014相加(,然后通过减去到该点的最大值来检索赛季。这里有一个例子:

WITH results as (
SELECT 'DAL' as team, 2010 as season, 6 as wins union
SELECT 'DET' as team, 2010 as season, 6 as wins union
SELECT 'DET' as team, 2011 as season, 10 as wins union
SELECT 'DET' as team, 2012 as season, 4 as wins union
SELECT 'DET' as team, 2013 as season, 7 as wins union
SELECT 'DET' as team, 2014 as season, 11 as wins
) 
SELECT team, season, wins,
max(wins) OVER through_current AS max_wins_thus_far
,max(wins + season) OVER through_current - max(wins) OVER through_current AS season_with_max_wins_thus_far
FROM results
WINDOW through_current AS (PARTITION BY team ORDER BY season ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
# team, season, wins, max_wins_thus_far, season_with_max_wins_thus_far
DAL, 2010, 6, 6, 2010
DET, 2010, 6, 6, 2010
DET, 2011, 10, 10, 2011
DET, 2012, 4, 10, 2011
DET, 2013, 7, 10, 2011
DET, 2014, 11, 11, 2014

另一种方法是按季节&ltcurrent_season and team=团队。例如:

) SELECT *,
(SELECT season FROM results AS r_inner
WHERE r_inner.season <= results.season AND r_inner.team = results.team
ORDER BY WINS DESC LIMIT 1) best_season
FROM results;

最新更新