我正在尝试获取表中一组值的平均值、中位数、模式和范围。我能够得到平均值,但中位数,范围和模式我得到了一个错误的。
下面是我为上述概念尝试的代码。
Select
CDS.[Commodity_SourceSeriesID_LongDesc] AS 'Description',
TD.TimeDimension_Year AS 'Year',
AVG(DV.DataValues_AttributeValue) AS 'Average/Mean',
MAX(dv.DataValues_AttributeValue) AS 'Maximum value for the Year',
MIN(dv.DataValues_AttributeValue) AS 'Minimum value for the Year',
((MAX(dv.DataValues_AttributeValue) + MIN(dv.DataValues_AttributeValue)) / 2) AS 'Median',
--,(SELECT TOP 1 with ties DataValues_AttributeValue
--FROM [CoSD].[DataValues]
--WHERE DataValues_AttributeValue IS Not NULL AND DataValues_ERSCommodity_ID = 157 and DataValues_DataRowLifecyclePhaseID = 1
--GROUP BY DataValues_AttributeValue
--ORDER BY COUNT(*) DESC) AS Mode
(MAX(dv.DataValues_AttributeValue) - MIN(dv.DataValues_AttributeValue)) AS 'Range'
FROM
[CoSD].[DataValues] DV
INNER JOIN
[CoSD].[CommodityDataSeries] CDS ON CDS.Commodity_ID = DV.DataValues_Commodity_ID
INNER JOIN
[CoSD].[TimeDimension_LU] TD ON TD.TimeDimension_ID = DV.DataValues_TimeDimension_ID
WHERE
DataValues_Commodity_ID = 157
AND DataValues_DataRowLifecyclePhaseID IN (1, 4)
GROUP BY
DV.DataValues_TimeDimension_ID,
CDS.Commodity_SourceSeriesID_LongDesc,
TD.TimeDimension_Year
有没有办法实现这一目标?
谢谢
在 SQL 2012 或更高版本中,使用 percentile_cont
函数计算中位数通常更容易。 看起来您的其余问题已经得到解决,但我想您也想知道此选项。
https://msdn.microsoft.com/en-us/library/hh231473.aspx
不确定这是否有帮助,但这里有一些 sql,它允许我在组中生成一些统计数据(...,平均值、中位数、模式,..)
- cteBase 将是您的核心数据(非聚合或分组)
- cteMedian 将生成 cteBase 的中位数
- cteMode 将计算 cteBase 的模式
我只计算了一个度量,但我怀疑它可以很容易地扩展在我有"GrpByYear"的地方,这必须扩展到您的复合字段中。
;with cteBase as (
Select RowNr=Row_Number() over (Partition By Year(TR_Date) Order By Year(TR_Date),TR_Y10)
,GrpByYear = Year(TR_Date)
,Measure = TR_Y10
From [Chinrus-Series].[dbo].[DS_Treasury_Rates]
Where Year(TR_Date)>2014
)
,cteMedian as (Select A.GrpByYear,Measure From cteBase A Join (Select GrpByYear,RowNr=Max(RowNr)/2 from cteBase Group by GrpByYear) B on (A.GrpByYear=B.GrpByYear and A.RowNr=B.RowNr))
,cteMode as (Select * from (Select RowNr=Row_Number() over (Partition By GrpByYear Order by Count(*) Desc),GrpByYear,Measure,Hits=count(*) From cteBase Group by GrpByYear,Measure) A Where RowNr=1)
Select A.GrpByYear
,RecordCount = Count(*)
,DistinctCount = Count(Distinct A.Measure)
,SumTotal = Sum(A.Measure)
,Minimum = Min(A.Measure)
,Maximum = Max(A.Measure)
,Mean = Avg(A.Measure)
,Median = Max(B.Measure)
,Mode = Max(C.Measure)
,StdDev = STDEV(A.Measure)
From cteBase A
Join cteMedian B on A.GrpByYear=B.GrpByYear
Join cteMode C on A.GrpByYear=C.GrpByYear
Group By A.GrpByYear
Order By A.GrpByYear
Year RecordCount DistinctCount SumTotal Minimum Maximum Mean Median Mode StdDev
2016 110 43 204.82 1.63 2.25 1.862 1.84 1.83 0.128568690811108
2015 251 69 536.71 1.68 2.50 2.1382 2.16 2.20 0.1662836533952
你可能更想做这样的事情:
select dbo.Median(DataValues_AttributeValue)
from ...
没有巧妙的方法可以像使用本机聚合(如 avg、max、min、max 等)那样获取中位数或模式。但是,您可能希望尝试使用 .NET CLR 聚合实现,例如,如果您想要一些优雅的东西,例如,例如,在 C# 中实现中位数和模式,例如上面的代码片段。
这是我过去所做的。