bigquery中其他字段按月份列出的前n个销售供应商的销售额



我在bigquery中有一个这样的表(260000行(:

vendor  date                  item_price    discount_price
x       2021-07-08 23:41:10   451,5         0  
y       2021-06-14 10:22:10   41,7          0
z       2020-01-03 13:41:12   74            4
s       2020-04-12 01:14:58   88            12
....

我想要的正是按月份对这些数据进行分组,然后只找到当月排名前20的供应商的销售额总和。预期输出:

month     vendor_name(top20)  sum_of_vendor's_sales  sum_of_vendor's_discount   item_count(sold)
2020-01   x1                  10857                  250                        150
2020-01   x2                  9685                   410                        50
2020-01   x3                  3574                   140                        45
....
2021 01   x20                 700                    15                         20
2020-02   y1                  7421                   280                        120
2020-02   y2                  6500                   250                        40
2020-02   y3                  4500                   200                        70
.....
2020-02   y20                 900                    70                         30

我试过这个(这里有消息来源(。但无法获得所需的输出。

select month, 
(select sum(sum) from t.top_20_vendors) as sum_of_only_top20_vendor_sales
from (
select 
format_datetime('%Y%m', date) month, 
approx_top_sum(vendor, item_price, 20) top_20_vendors,count(item_price) as count_of_items,sum(discount_price)
from my_table
group by month
) t

考虑以下方法

select 
format_datetime('%Y%m', date) month, 
vendor as vendor_name_top20, 
sum(item_price) as sum_of_vendor_sales,
sum(discount_price) as sum_of_vendor_discount,
count(*) as item_count_sold
from your_table
group by vendor, month
qualify row_number() over(partition by month order by sum_of_vendor_sales desc) <= 20

最新更新