How to include a PERCENTILE_CONT column in a SELECT statement without getting errors about the ORDER BY clause or aggregate functions



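I need to generate a specific report from some data, and I'm having a lot of trouble figuring out how to use PERCENTILE_CONT correctly to get the results I need. I want to include a column in my query results that shows the 95th percentile value of a range of values.

I have a table like this: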

customer_id  sale_amount  sale_date
1            265.75       2019-09-11 00:00:04.000
1            45.75        2019-09-10 01:00:04.000
1            2124.77      2019-09-10 04:00:04.000
1            66.99        2019-09-10 04:20:04.000
1            266.49       2019-09-09 11:20:04.000
1            3266.49      2019-09-08 11:20:04.000

Simple enough.

I can run the following query with no problem:

select
min(sale_amount) as minimum_sale,
max(sale_amount) as maximum_sale,
avg(sale_amount) as average_sale
from
sales
where
customer_id = 1;

This produces the following output:

minimum_sale    maximum_sale    average_sale
45.75           3266.49         1006.040000

What I want is a fourth column, perc_95, that holds the 95th percentile value of sale_amount.

This gets me the value:

select distinct
customer_id,
percentile_cont(0.95) WITHIN GROUP (order by sale_amount) OVER (partition by customer_id) as perc_95
from
sales;

Output:

customer_id perc_95
1            2981.06
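For reference, the 2981.06 figure can be checked by hand. PERCENTILE_CONT interpolates linearly between the two rows that straddle the requested position in the sorted values. The sketch below is only an arithmetic check against the six sample rows, not part of the report query; the column alias is made up for illustration.

```sql
-- Sorted sale_amount values for customer 1 (N = 6):
--   45.75, 66.99, 265.75, 266.49, 2124.77, 3266.49
-- Target row position: 1 + 0.95 * (6 - 1) = 5.75,
-- i.e. three quarters of the way from row 5 (2124.77) to row 6 (3266.49).
SELECT 2124.77 + 0.75 * (3266.49 - 2124.77) AS perc_95_by_hand;
```

The expression works out to 2981.06, which matches the perc_95 value above.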

But I can't seem to combine the two. This fails:

select distinct
customer_id,
min(sale_amount) as minimum_sale,
max(sale_amount) as maximum_sale,
avg(sale_amount) as average_sale,
percentile_cont(0.95) WITHIN GROUP (order by sale_amount) OVER (partition by customer_id) as perc_95
from
sales
where
customer_id = 1;

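Output:

Column 'sales.customer_id' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

I roughly understand what this error means, but I can't figure out how to deal with it in this case.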

The output I want:

customer_id  minimum_sale  maximum_sale  average_sale  perc_95
1            45.75         3266.49       1006.040000   2981.06

Use window functions:

select distinct customer_id,
min(sale_amount) over (partition by customer_id) as minimum_sale, 
max(sale_amount) over (partition by customer_id) as maximum_sale,
avg(sale_amount) over (partition by customer_id) as average_sale,
percentile_cont(0.95) within group (order by sale_amount)  over (partition by customer_id) as perc_95
from sales
where customer_id = 1;

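SQL Server does not support percentile_cont() as an aggregate function, only as a window function. That forces people to use select distinct to collapse the rows, which is really awkward.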
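To try the window-function query against the sample data from the question, something like the following setup should do. It is a sketch under assumptions: the temp table name #sales and the column types are guesses based on the sample rows, not the asker's actual schema.

```sql
-- Hypothetical temp table; the name and column types are assumptions
-- inferred from the sample rows in the question.
CREATE TABLE #sales (
    customer_id int           NOT NULL,
    sale_amount decimal(10,2) NOT NULL,
    sale_date   datetime2(3)  NOT NULL
);

INSERT INTO #sales (customer_id, sale_amount, sale_date) VALUES
    (1,  265.75, '2019-09-11 00:00:04.000'),
    (1,   45.75, '2019-09-10 01:00:04.000'),
    (1, 2124.77, '2019-09-10 04:00:04.000'),
    (1,   66.99, '2019-09-10 04:20:04.000'),
    (1,  266.49, '2019-09-09 11:20:04.000'),
    (1, 3266.49, '2019-09-08 11:20:04.000');

-- Every aggregate is written as a window function over the same partition,
-- so each input row carries identical values and DISTINCT collapses them
-- to one row per customer.
SELECT DISTINCT
    customer_id,
    MIN(sale_amount) OVER (PARTITION BY customer_id) AS minimum_sale,
    MAX(sale_amount) OVER (PARTITION BY customer_id) AS maximum_sale,
    AVG(sale_amount) OVER (PARTITION BY customer_id) AS average_sale,
    PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY sale_amount)
        OVER (PARTITION BY customer_id) AS perc_95
FROM #sales
WHERE customer_id = 1;

DROP TABLE #sales;
```

Because all five expressions share the same PARTITION BY, the same query without the WHERE clause should return one row per customer.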

Don't use DISTINCT.

I would try this first:

select 
min(customer_id) AS CustomerID, 
min(sale_amount) as minimum_sale, 
max(sale_amount) as maximum_sale,
avg(sale_amount) as average_sale, 
percentile_cont(0.95) WITHIN GROUP (order by sale_amount) OVER (partition by customer_id) as perc_95
from sales 
where customer_id = 1;

If you get the same error message, but this time about percentile_cont, wrap it in a min function:

select 
min(customer_id) AS CustomerID, 
min(sale_amount) as minimum_sale, 
max(sale_amount) as maximum_sale,
avg(sale_amount) as average_sale, 
min(percentile_cont(0.95) WITHIN GROUP (order by sale_amount) OVER (partition by customer_id)) as perc_95
from sales 
where customer_id = 1;
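One caveat: SQL Server does not allow a windowed function directly inside an aggregate, so wrapping the OVER expression in MIN in a single SELECT is likely to be rejected as written. A variant of the same idea that avoids DISTINCT is to compute the percentile in a derived table first and aggregate in the outer query. This is a sketch assuming the sales table from the question; the alias s is only illustrative.

```sql
SELECT
    s.customer_id,
    MIN(s.sale_amount) AS minimum_sale,
    MAX(s.sale_amount) AS maximum_sale,
    AVG(s.sale_amount) AS average_sale,
    -- perc_95 is identical on every row of a partition, so MIN just picks it
    MIN(s.perc_95)     AS perc_95
FROM (
    -- Inner query: attach the windowed percentile to every sales row
    SELECT
        customer_id,
        sale_amount,
        PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY sale_amount)
            OVER (PARTITION BY customer_id) AS perc_95
    FROM sales
    WHERE customer_id = 1
) AS s
GROUP BY s.customer_id;
```

Grouping by s.customer_id also resolves the original "not contained in either an aggregate function or the GROUP BY clause" error, since every selected column is now either grouped or aggregated.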

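Another approach is to split the query into two separate CTE steps: one using SELECT DISTINCT with window functions, and the other using GROUP BY. The final query then JOINs the two together, like this: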

DECLARE @customerId int = 1;
WITH aggs AS (
SELECT
customer_id,
MIN( sale_amount ) AS minimum_sale,
AVG( sale_amount ) AS mean_sale,
MAX( sale_amount ) AS maximum_sale
FROM
sales
WHERE
customer_id = @customerId
GROUP BY
customer_id
),
wnds AS (
SELECT DISTINCT
customer_id,
percentile_cont(0.05) WITHIN GROUP (order by sale_amount) OVER (partition by customer_id) as perc_05,
percentile_cont(0.50) WITHIN GROUP (order by sale_amount) OVER (partition by customer_id) as median,
percentile_cont(0.95) WITHIN GROUP (order by sale_amount) OVER (partition by customer_id) as perc_95
FROM
sales
WHERE
customer_id = @customerId
)
SELECT
ISNULL( a.customer_id, w.customer_id ) AS customer_id,
a.minimum_sale,
w.perc_05,
w.median,
a.mean_sale,
w.perc_95,
a.maximum_sale
FROM
aggs AS a
INNER JOIN wnds AS w ON a.customer_id = w.customer_id;
