Calculating the top 20th percentile in Redshift/SQL



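I'm trying to extract the top 20th percentile of sold products based on page views. Does Redshift have a built-in function for this?

Any suggestions would be appreciated.

Sales query: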

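SELECT product_category,
       o.product_id,  -- qualified, since product_id exists in all three tables
       sum(page_views) AS page_views,
       sum(units) AS total_units
FROM "order" o  -- ORDER is a reserved word, so the table name is quoted
JOIN product p
  ON p.product_id = o.product_id
JOIN page_views f
  ON f.product_id = o.product_id
WHERE extract(year from order_date) = 2019
  AND extract(month from order_date) = 12
  AND product_category = 'electronics'
-- group only by the two non-aggregated columns
GROUP BY 1, 2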

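There is a PERCENT_RANK window function that you can run over sum(page_views); it tells you each row's rank as a fraction between 0 and 1. You can then filter on pct_rnk <= 0.2. I think you need a CTE to do the filtering, because you can't filter on the result of a window function in the WHERE clause of the same SELECT. Something like this (untested):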

WITH ranked_products AS (
    SELECT product_category,
           o.product_id,  -- qualified, since product_id exists in all three tables
           -- descending order: the most-viewed product gets pct_rnk = 0
           PERCENT_RANK() OVER (ORDER BY sum(page_views) DESC) AS pct_rnk
    FROM "order" o  -- ORDER is a reserved word, so the table name is quoted
    JOIN product p
      ON p.product_id = o.product_id
    JOIN page_views f
      ON f.product_id = o.product_id
    WHERE extract(year from order_date) = 2019
      AND extract(month from order_date) = 12
      AND product_category = 'electronics'
    GROUP BY 1, 2
)
SELECT *
FROM ranked_products
WHERE pct_rnk <= 0.2
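
To see how the PERCENT_RANK cutoff behaves on a small data set, here is a minimal sketch that runs the same pattern against an in-memory SQLite database through Python's sqlite3 module. It is not Redshift, and it assumes a SQLite build with window-function support (3.25 or later). The table product_views, the sample rows, and the helper top_fraction are all hypothetical and exist only for illustration. The aggregation is also split into its own CTE here, which is a choice made for readability rather than something the query above requires.

```python
import sqlite3

# Hypothetical sample data: (product_id, page_views). p02 has two rows so
# that the SUM in the query actually has something to add up.
SAMPLE_ROWS = [
    ("p01", 1000), ("p02", 500), ("p02", 400), ("p03", 800),
    ("p04", 700), ("p05", 600), ("p06", 500), ("p07", 400),
    ("p08", 300), ("p09", 200), ("p10", 100),
]


def top_fraction(rows, cutoff=0.2):
    """Return (product_id, pct_rnk) pairs whose PERCENT_RANK is <= cutoff."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE product_views (product_id TEXT, page_views INTEGER)"
        )
        conn.executemany("INSERT INTO product_views VALUES (?, ?)", rows)
        query = """
            WITH totals AS (
                -- one row per product with its total page views
                SELECT product_id, SUM(page_views) AS page_views
                FROM product_views
                GROUP BY product_id
            ),
            ranked_products AS (
                -- PERCENT_RANK = (rank - 1) / (row count - 1)
                SELECT product_id,
                       PERCENT_RANK() OVER (ORDER BY page_views DESC) AS pct_rnk
                FROM totals
            )
            SELECT product_id, pct_rnk
            FROM ranked_products
            WHERE pct_rnk <= ?
            ORDER BY pct_rnk
        """
        return conn.execute(query, (cutoff,)).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for product_id, pct_rnk in top_fraction(SAMPLE_ROWS):
        print(product_id, round(pct_rnk, 3))
```

With ten distinct products, PERCENT_RANK takes the values 0, 1/9, 2/9, and so on, so a cutoff of 0.2 keeps the two most-viewed products. Because the denominator is the row count minus one, the number of rows that pass the filter will not always be exactly 20% of the total.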
