想要编写一个SQL Server查询来查找给定数字频率的中值。
表:
+----------+-------------+
| number | frequency |
+----------+-------------|
| 2 | 7 |
| 3 | 1 |
| 5 | 3 |
| 7 | 1 |
+----------+-------------+
在该表中,数字为2, 2, 2, 2, 2, 2, 2, 3, 5, 5, 5, 7
,因此中值为(2 + 2) / 2 = 2
我已经使用递归CTE sqlfiddle创建了以下查询,有没有更好的方法可以更高效地编写这个查询?
我的解决方案:
/* recursive CTE to generate numbers with given frequency */
with nums as
(
select
number,
frequency
from numbers
union all
select
number,
frequency - 1
from nums
where frequency > 1
)
/* find median */
select
round(avg(number * 1.0), 2) as median
from
(
select
number,
count(*) over () as ttl,
row_number() over (order by number) as rnk
from nums
) t
where rnk = (case when ttl%2 != 0 then (ttl/2) else (ttl/2)+1 end)
or rnk = (case when ttl%2 = 0 then (ttl/2)+1 end)
只需进行累积求和并取中间值。我认为这就是逻辑:
select avg(number)
from (select t.*,
sum(freq) over (order by number asc) as sum_freq,
sum(freq) over () as cnt
from t
) t
where cnt <= 2 * sum_freq and
cnt >= 2 * (sum_freq - freq);
这是一个数据库<>不停摆弄
尽管如此,现在回复可能为时已晚。以下是我从给定频率表中查找中值的方法。这涵盖了奇数和偶数场景。
`-- 1. Create a frequency table witth the Price and its frequency
IF OBJECT_ID('tempdb.dbo.#MedianEX', 'U') IS NOT NULL
DROP TABLE #MedianEX;
create table #MedianEX (Price int,Frequency int)
insert into #MedianEX values(2 ,3)
insert into #MedianEX values(3,7)
insert into #MedianEX values(4,2)
insert into #MedianEX values(6,5 )
insert into #MedianEX values(5,6 )
insert into #MedianEX values(7,2 )
--uncomment the below statement to make total number of element = 24(even number scenario)
update #MedianEx set Frequency=5 where Price=5
--calculate (n-1)/2 th position and (n+1)/2 th position if odd value and n/2 th position if even
IF OBJECT_ID('tempdb.dbo.#temp', 'U') IS NOT NULL
DROP TABLE #temp;
select *,SUM(Frequency)over (order by Price asc)as cmfreq,ceiling(SUM(Frequency) over()/2.0) as UCNT,
floor(SUM(Frequency) over()/2.0) as LCNT into #temp from #MedianEX
--select price value at the required positions and calculate their average
select cast(avg(Price * 1.0) as decimal(10,2)) from (
select top 1 Price from #temp where UCNT<=cmfreq
union
select top 1 Price from #temp where LCNT<=cmfreq) t
--select * from #MedianEX`