Exclude rows from agg() (rows x preceding) if they are related to the current row?



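I'm calculating a moving average over the last 100 sales of a particular item. I want to know whether user X has spent 100x or more on that item within the past 5-sale window.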

--how much has the current row user spent on this item over the last 100 sales?
SUM(saleprice) OVER(PARTITION BY item, user ORDER BY saledate ROWS BETWEEN 100 PRECEDING AND CURRENT ROW)
--pseudocode: how much has everyone else, excluding this user, spent on that item over the last 100 sales?
SUM(saleprice) OVER(PARTITION BY item ORDER BY saledate ROWS BETWEEN 100 PRECEDING AND CURRENT ROW WHERE preceding_row.user <> current_row.user)

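Ultimately, I don't want my big spenders' purchases counted in the total spend of the smaller spenders. Is there a technique for excluding rows from the window if they fail some comparison against the current row? (In my case, don't sum the preceding row's saleprice if that row has the same user as the current row.)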

The first one looks fine to me, except that you are counting 101 sales (the 100 preceding rows plus the current row).

--how much has the current row user spent on this item over the last 100 sales?
SUM(saleprice)
    OVER (
        PARTITION BY item, user
        ORDER BY saledate
        ROWS BETWEEN 100 PRECEDING AND 1 PRECEDING   -- 100 excluding this sale
        ROWS BETWEEN  99 PRECEDING AND CURRENT ROW   -- 100 including this sale
    )

(Use only one of the two suggested ROWS BETWEEN clauses.)
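To make the off-by-one concrete, here is a minimal sketch that counts the rows each frame covers. The `demo` data set, its integer `saledate` values, and the small frame sizes (3 and 2 instead of 100 and 99) are assumptions chosen purely for illustration.

```sql
-- Hypothetical sample data: five sales of one item by one user.
WITH demo (saledate, saleprice) AS (
    SELECT saledate, saleprice
    FROM (VALUES (1, 10), (2, 20), (3, 30), (4, 40), (5, 50))
         AS v (saledate, saleprice)
)
SELECT
    saledate,
    -- N PRECEDING AND CURRENT ROW covers up to N + 1 rows
    COUNT(*) OVER (ORDER BY saledate
                   ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) AS rows_3_prec_to_current,
    -- (N - 1) PRECEDING AND CURRENT ROW covers up to N rows, current row included
    COUNT(*) OVER (ORDER BY saledate
                   ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS rows_2_prec_to_current,
    -- N PRECEDING AND 1 PRECEDING covers up to N rows, current row excluded
    COUNT(*) OVER (ORDER BY saledate
                   ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING) AS rows_3_prec_to_1_prec
FROM demo
ORDER BY saledate;
```

On the last row the three counts come out as 4, 3 and 3, which is the same pattern as 101, 100 and 100 with the frames from the question and the answer.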


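In the second expression, you can't add a WHERE clause. You can change the aggregate, the partitioning and the ordering, but I can't see how that would help you. I think you need a correlated subquery and/or OUTER APPLY...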

SELECT
    sales_data.*,
    SUM(sales_data.saleprice)
        OVER (
            PARTITION BY item, user
            ORDER BY saledate
            ROWS BETWEEN 99 PRECEDING AND CURRENT ROW   -- 100 including this sale
        )
        AS user_total_100_purchases_to_date,
    others_sales_top_100_total.saleprice
        AS others_total_100_sales_to_date
FROM
    sales_data
OUTER APPLY
(
    SELECT
        SUM(saleprice) AS saleprice
    FROM
    (
        -- the 100 most recent sales of the same item made by other users
        SELECT TOP(100) saleprice
        FROM sales_data AS others_sales
        WHERE others_sales.user     <> sales_data.user
          AND others_sales.item      = sales_data.item
          AND others_sales.saledate <= sales_data.saledate
        ORDER BY others_sales.saledate DESC
    )
        AS others_sales_top_100
)
    AS others_sales_top_100_total
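The same lookup can also be written as a correlated scalar subquery instead of OUTER APPLY. What follows is only a sketch, assuming SQL Server: the four sample rows, the integer `saledate`, and `TOP (2)` standing in for `TOP (100)` are illustrative assumptions, and `[user]` is bracketed because USER is a reserved word in T-SQL.

```sql
-- Hypothetical sample data standing in for the real sales_data table.
WITH sales_data (item, [user], saledate, saleprice) AS (
    SELECT item, [user], saledate, saleprice
    FROM (VALUES
        ('widget', 'alice', 1, 100),
        ('widget', 'bob',   2,   5),
        ('widget', 'alice', 3, 200),
        ('widget', 'carol', 4,   7)
    ) AS v (item, [user], saledate, saleprice)
)
SELECT
    s.*,
    (
        SELECT SUM(t.saleprice)
        FROM (
            -- most recent sales of the same item by OTHER users, up to this sale
            SELECT TOP (2) o.saleprice          -- TOP (100) in the real query
            FROM sales_data AS o
            WHERE o.[user]   <> s.[user]
              AND o.item      = s.item
              AND o.saledate <= s.saledate
            ORDER BY o.saledate DESC
        ) AS t
    ) AS others_last_n_saletotal
FROM sales_data AS s
ORDER BY s.saledate;
```

With this sample data, carol's row sums the two most recent sales by other users (200 and 5), while the very first sale has no earlier sales by anyone else and therefore gets NULL. Wrap the subquery in COALESCE(..., 0) if a zero is preferred.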


Edit: another way of looking at it, which keeps things consistent

SELECT
    sales_data.*,
    usr_last100_saletotal,
    all_last100_saletotal,
    CASE WHEN usr_last100_saletotal > all_last100_saletotal * 0.8
         THEN 'user spent 80%, or more, of last 100 sales'
         ELSE 'user spent under 80% of last 100 sales'
    END
        AS usr_share_of_last100   -- placeholder alias; the original name is missing
FROM
    sales_data
OUTER APPLY
(
    SELECT
        SUM(CASE top100.user WHEN sales_data.user THEN top100.saleprice END)   AS usr_last100_saletotal,
        SUM(                                           top100.saleprice    )   AS all_last100_saletotal
    FROM
    (
        -- the 100 most recent sales of the same item, by any user
        SELECT TOP(100) user, saleprice
        FROM sales_data AS others_sales
        WHERE others_sales.item      = sales_data.item
          AND others_sales.saledate <= sales_data.saledate
        ORDER BY others_sales.saledate DESC
    )
        AS top100
)
    AS top100_summary

