根据一列的前一行计算另一列



我有一个商店和一个产品的以下数据,我需要根据另一列计算一列。

初始数据集:

store product tran_date  audit_date audit_bal inv_value
10001 323232  2020-01-01 null       null      5
10001 323232  2020-01-02 2020-01-02 20        31
10001 323232  2020-01-03 null       null      13
10001 323232  2020-01-04 null       null      6
10001 323232  2020-01-05 null       null      21
10001 323232  2020-01-06 null       null      17
10001 323232  2020-01-07 null       null      6
10001 323232  2020-01-08 null       null      34
10001 323232  2020-01-09 null       null      35
10001 323232  2020-01-10 2020-01-10 60        17
10001 323232  2020-01-12 null       null      6
10001 323232  2020-01-13 null       null      9
10001 323232  2020-01-14 null       null      5
10001 323232  2020-01-15 null       null      29

逻辑:audit_date的下一天的start_stock应为audit_dateaudit_bal对于剩余天数,应为前几天end_stock

audit_date的下一天的end_stock应为audit_date + inv_valueaudit_bal对于剩余天数,应为start_stock(前几天end_stock(+ inv_value

最终数据集应为

store product tran_date  audit_date audit_bal inv_value start_stock end_stock
10001 323232  2020-01-01 null       null      5         6           11  
10001 323232  2020-01-02 2020-01-02 20        31        11          42
10001 323232  2020-01-03 null       null      13        20          33
10001 323232  2020-01-04 null       null      6         33          39
10001 323232  2020-01-05 null       null      21        39          60
10001 323232  2020-01-06 null       null      17        60          77
10001 323232  2020-01-07 null       null      6         77          83 
10001 323232  2020-01-08 null       null      34        83          117
10001 323232  2020-01-09 null       null      35        117         152
10001 323232  2020-01-10 2020-01-10 120       17        152         169
10001 323232  2020-01-12 null       null      6         120         126
10001 323232  2020-01-13 null       null      9         126         135 
10001 323232  2020-01-14 null       null      5         135         140
10001 323232  2020-01-15 null       null      29        140         169

我使用了下面的查询,但没有得到正确的结果

WITH Inv AS (
SELECT      *,case when tran_date = date_add(lag_audit_date,1)   then lag_audit_bal + inv_value  
when date_add(lag_audit_date,1) != tran_date  then lag_audit_bal + inv_value
else SUM(inv_value) OVER (partition by store,product ORDER BY tran_date ASC ROWS UNBOUNDED PRECEDING) end as end_stock 
FROM        
basedata
)
SELECT          tran_date,audit_date,audit_bal,lag_audit_date,lag_audit_bal,
case when tran_date = date_add(lag_audit_date,1)  then lag_audit_bal 
when date_add(lag_audit_date,1) != tran_date  then lag_audit_bal 
else LAG(end_stock,1,0) OVER (partition by store,product ORDER BY transaction_date ASC) end as start_stock, 
inv_val,
end_stock
FROM            Inv;

有人能帮帮我吗?

。。您的查询,略有调整。。

declare @t table (store int, product int, tran_date date, audit_date date, audit_bal int, inv_value int);
insert into @t
values
--store product tran_date  audit_date audit_bal inv_value
(10001, 323232,  '20200101', null,       null,      5),
(10001, 323232,  '20200102', '20200102', 20,        31),
(10001, 323232,  '20200103', null,       null,      13),
(10001, 323232,  '20200104', null,       null,      6),
(10001, 323232,  '20200105', null,       null,      21),
(10001, 323232,  '20200106', null,       null,      17),
(10001, 323232,  '20200107', null,       null,      6),
(10001, 323232,  '20200108', null,       null,      34),
(10001, 323232,  '20200109', null,       null,      35),
(10001, 323232,  '20200110', '20200110', 120,       17),
(10001, 323232,  '20200112', null,       null,      6),
(10001, 323232,  '20200113', null,       null,      9),
(10001, 323232,  '20200114', null,       null,      5),
(10001, 323232,  '20200115', null,       null,      29);
select 
store, product, tran_date, audit_date, audit_bal, inv_value,
--start_stock = end_stock-inv_value 
end_stock-inv_value as start_stock, end_stock
from
(
--calculate end_stock
select *,
max(grp_audit_bal) over(partition by store, product, grp_audit_date order by tran_date)
+ 
sum(inv_value) over(partition by store, product, grp_audit_date order by tran_date) as end_stock
from
(
--groups are defined by latest audit_date 
--also get the audit_bal per group (audit_bal is assigned only to the first member of the group, lag() is used)
select *, 
max(audit_date) over(partition by store, product order by tran_date ROWS between UNBOUNDED PRECEDING and 1 PRECEDING) as grp_audit_date,
lag(audit_bal) over(partition by store, product order by tran_date) as grp_audit_bal
from @t
) as xyz
) as src;

在这个dba问题中,提出了以下逻辑:

https://dba.stackexchange.com/questions/94545/calculate-row-value-based-on-previous-and-actual-row-values

SELECT 
s.stmnt_date, s.debit, s.credit,
SUM(s.debit - s.credit) OVER (ORDER BY s.stmnt_date
ROWS BETWEEN UNBOUNDED PRECEDING
AND CURRENT ROW)
AS balance
FROM
statements AS s
ORDER BY
stmnt_date ;

还要注意关于MySQL分析函数限制(仍然可能?(的额外讨论(如果适用于您的情况(。

该帖子还建议了一个可能的(效率较低的(解决方案,如果您的数据库不支持上述语法(注意:我还没有测试这两种解决方案(:

SELECT 
s.stmnt_date, s.debit, s.credit,
@b := @b + s.debit - s.credit AS balance
FROM
(SELECT @b := 0.0) AS dummy 
CROSS JOIN
statements AS s
ORDER BY
stmnt_date ;

如果你使用MySQL,你可能还会发现感兴趣的LAG((函数

注意本教程中的示例:

https://www.mysqltutorial.org/mysql-window-functions/mysql-lag-function/

https://dev.mysql.com/doc/refman/8.0/en/window-function-descriptions.html#function_lag

SELECT
t, val,
LAG(val)        OVER w AS 'lag',
LEAD(val)       OVER w AS 'lead',
val - LAG(val)  OVER w AS 'lag diff',
val - LEAD(val) OVER w AS 'lead diff'
FROM series
WINDOW w AS (ORDER BY t);

LAG(expr[,N[,default]]([null_ttreatment]over_clause

返回在当前行的分区内落后(先于(当前行N行的行的expr值。如果没有这样的行,则返回值为默认值。例如,如果N为3,则前两行的返回值为默认值。如果缺少N或default,则默认值分别为1和NULL。

N必须是文字上的非负整数。如果N为0,则计算当前行的expr。

从MySQL 8.0.22开始,N不能为NULL。此外,它现在必须是1到263(包括1到263(范围内的整数,具有以下任何形式:

an unsigned integer constant literal
a positional parameter marker (?)
a user-defined variable
a local variable in a stored routine 

相关内容

  • 没有找到相关文章

最新更新