我有一个付款表,其中有正值和负值(即捕获和贷记)。我需要找出从上次净额为正以来,我们收到净额为正的点。例如,如果客户支付了这些款项并收到了这些信用:
01/01 $100 <-
02/01 -$100
03/01 -$100
04/01 $100
05/01 $100
06/01 $100 <-
…然后点将是01/01和06/01:从02/01到04/01,他们有一个负的余额,到05/01,他们有一个零余额。
我目前的方法首先从所有捕获的日期中构建结束日期列表,然后计算每个日期的开始日期,最后计算这些时期的净捕获:
Start End NetCaptures
1900/01/01 2011/01/01 $100
2011/01/02 2011/04/01 -$100
2011/04/02 2011/05/01 $100
2011/05/02 2011/06/01 $100
然后丢弃netcapture值为0或更少的记录,重新计算开始日期,重新计算net capture,并重复直到没有记录要删除,留下这个。
Start End NetCaptures
1900/01/01 2011/01/01 $100
2011/01/02 2011/06/01 $100
有更好的方法吗?一些分析表达式的巧妙使用?这就接近RBAR了。实际上,它的运行速度是可以接受的(500K条记录10分钟,而在我开始以这种方式计算信用之前是1.5分钟)。
* RESULT *
虽然Microsoft确实支持优雅的滚动总数函数,但使用该思想,我最终编写了这样的代码:计算所有捕获,计算每次捕获之前的运行总数,并丢弃那些具有相等或更大运行总数的较早记录。
CREATE TABLE #Sequences
(
OrderID INT NOT NULL,
Sequence INT NOT NULL,
PRIMARY KEY (OrderID, Sequence),
StartDate DATE NOT NULL DEFAULT '1900-01-01',
EndDate DATE NOT NULL,
CapturesThisPeriod DECIMAL(18, 2) NOT NULL DEFAULT 0.00,
)
INSERT INTO #Sequences (OrderID, Sequence, EndDate)
SELECT OrderID, ROW_NUMBER() OVER (PARTITION BY OrderID ORDER BY DateReceived), DateReceived
FROM Receipts
WHERE Amount > 0.00
/* Calculate the start date for each period */
UPDATE S
SET StartDate = DATEADD(D, 1, Prev.EndDate)
FROM
#Sequences AS S
INNER JOIN #Sequences AS Prev ON S.OrderID = Prev.OrderID AND Prev.Sequence = S.Sequence - 1
/* Calculate the cumulative total for each period */
UPDATE M
SET CumulativeReceipts = R.Receipts
FROM
#Sequences AS M
INNER JOIN
(
SELECT
M.OrderID, M.Sequence, SUM(R.Amount) AS Receipts
FROM
#Sequences AS M
INNER JOIN Receipts AS R ON M.OrderID = R.OrderID AND R.DateReceived <= M.EndDate
GROUP BY
M.OrderID, M.Sequence
) AS R ON M.OrderID = R.OrderID AND M.Sequence = R.Sequence
/* Delete sequences with do not represent net positive receipts */
DELETE FROM M
FROM #Sequences AS M
WHERE EXISTS (SELECT * FROM #Sequences AS Prev WHERE M.OrderID = Prev.OrderID AND Prev.Sequence < M.Sequence AND Prev.CumulativeReceipts >= M.CumulativeReceipts)
/* Recalculate sequence numbers and dates */
UPDATE S SET Sequence = NewSequence FROM (SELECT Sequence, ROW_NUMBER() OVER (PARTITION BY OrderID ORDER BY Sequence) AS NewSequence FROM #Sequences) AS S
UPDATE S
SET StartDate = DATEADD(D, 1, Prev.EndDate)
FROM
#Sequences AS S
INNER JOIN #Sequences AS Prev ON S.OrderID = Prev.OrderID AND Prev.Sequence = S.Sequence - 1
END
/* Calculate net captures per period, and continue with analysis */
查找"running sum";例如:http://explainextended.com/2010/01/22/sql-server-running-totals