Running total with multiple conditions in BigQuery



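I need to calculate a running total that resets under certain conditions: when Expected Reached = 0, or when product_group or product changes. I already got help for the case without the two extra fields in "Calculating a running total with a condition in BigQuery". I have the table below; product_group and product can be either integers or strings.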

Date, Product_group, Product, Registrations, Expected Registrations, Expected Reached, Running Total
2020-03-01, A, Bikes, 5, 4, 1, 1
2020-03-02, A, Bikes, 7, 5, 1, 2
2020-03-03, A, Bikes, 8, 6, 1, 3
2020-03-04, A, Bikes, 2, 5, 0, 0
2020-03-05, A, Bikes, 5, 4, 1, 1
2020-03-06, A, Bikes, 7, 5, 1, 2
2020-03-04, B, Cars, 2, 5, 0, 0
2020-03-05, B, Cars, 5, 4, 1, 1
2020-03-06, B, Cars, 7, 5, 1, 2
2020-03-07, B, Cars, 8, 6, 1, 3
2020-03-08, C, Plane, 2, 5, 0, 0

Any suggestions on how to adjust this query? It is the answer from the other post and works fine without the two extra fields:

#standardSQL
SELECT * EXCEPT(grp), 
SUM(Expected_reached) OVER(PARTITION BY grp ORDER BY `date`) Running_Total
FROM (
SELECT *, COUNTIF(Expected_reached = 0) OVER(ORDER BY `date`) grp 
FROM `project.dataset.table`
)

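The problem is that COUNTIF(Expected_reached = 0) OVER(ORDER BY `date`) grp needs to start over whenever product_group or product changes. As written it does not, so I get non-unique groups, and the running total SUM(Expected_reached) OVER(PARTITION BY grp ORDER BY `date`) Running_Total is not calculated correctly.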
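To make the collision visible, the sketch below runs the unpartitioned COUNTIF over a handful of rows taken from the table above and exposes grp instead of hiding it. The CTE name sample_rows is hypothetical, and the inline rows stand in for `project.dataset.table`.

```sql
#standardSQL
-- Hypothetical inline data: a subset of the rows from the question.
WITH sample_rows AS (
  SELECT DATE '2020-03-03' AS `date`, 'A' AS Product_group, 'Bikes' AS Product, 1 AS Expected_reached UNION ALL
  SELECT DATE '2020-03-04', 'A', 'Bikes', 0 UNION ALL
  SELECT DATE '2020-03-05', 'A', 'Bikes', 1 UNION ALL
  SELECT DATE '2020-03-04', 'B', 'Cars', 0 UNION ALL
  SELECT DATE '2020-03-05', 'B', 'Cars', 1
)
SELECT
  *,
  -- No PARTITION BY: zeros are counted across all products together.
  COUNTIF(Expected_reached = 0) OVER(ORDER BY `date`) AS grp
FROM sample_rows
ORDER BY `date`, Product_group, Product
```

Because the window is ordered by date only, rows from different products that share a date are counted together, so the Bikes and Cars rows from 2020-03-04 onward should receive the same grp value. A SUM partitioned by grp alone then mixes their Expected_reached values.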

You just need to add PARTITION BY Product_group, Product to both analytic functions:

#standardSQL
SELECT * EXCEPT(grp), 
SUM(Expected_reached) OVER(PARTITION BY Product_group, Product, grp ORDER BY `date`) Running_Total
FROM (
SELECT *, COUNTIF(Expected_reached = 0) OVER(PARTITION BY Product_group, Product ORDER BY `date`) grp 
FROM `project.dataset.table`
)
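As a quick check, the same query can be run against the sample rows from the question without touching a real table. This is a minimal sketch: the CTE name sample_rows is hypothetical, and only the columns the query actually uses are reproduced.

```sql
#standardSQL
-- Hypothetical inline copy of the question's data (date, product keys, flag).
WITH sample_rows AS (
  SELECT DATE '2020-03-01' AS `date`, 'A' AS Product_group, 'Bikes' AS Product, 1 AS Expected_reached UNION ALL
  SELECT DATE '2020-03-02', 'A', 'Bikes', 1 UNION ALL
  SELECT DATE '2020-03-03', 'A', 'Bikes', 1 UNION ALL
  SELECT DATE '2020-03-04', 'A', 'Bikes', 0 UNION ALL
  SELECT DATE '2020-03-05', 'A', 'Bikes', 1 UNION ALL
  SELECT DATE '2020-03-06', 'A', 'Bikes', 1 UNION ALL
  SELECT DATE '2020-03-04', 'B', 'Cars', 0 UNION ALL
  SELECT DATE '2020-03-05', 'B', 'Cars', 1 UNION ALL
  SELECT DATE '2020-03-06', 'B', 'Cars', 1 UNION ALL
  SELECT DATE '2020-03-07', 'B', 'Cars', 1 UNION ALL
  SELECT DATE '2020-03-08', 'C', 'Plane', 0
)
SELECT * EXCEPT(grp),
  -- Sum the flag within each product and within each "island" between zeros.
  SUM(Expected_reached) OVER(PARTITION BY Product_group, Product, grp ORDER BY `date`) AS Running_Total
FROM (
  SELECT *,
    -- grp increases by one at every zero, counted separately per product.
    COUNTIF(Expected_reached = 0) OVER(PARTITION BY Product_group, Product ORDER BY `date`) AS grp
  FROM sample_rows
)
ORDER BY Product_group, Product, `date`
```

Tracing it by hand for A/Bikes, the flags 1, 1, 1, 0, 1, 1 give grp values 0, 0, 0, 1, 1, 1 and running totals 1, 2, 3, 0, 1, 2, which matches the Running Total column in the question.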

As far as the question goes, you just need to add the two other columns, product_group and product, to the partition clause of the window functions:

select
    * except(grp),
    sum(expected_reached)
        over(partition by grp, product_group, product order by `date`) running_total
from (
    select
        *,
        countif(expected_reached = 0)
            over(partition by product_group, product order by `date`) grp
    from `project.dataset.table`
)
