I have a dataset that is simply a list of customer orders per day.
order_date | month | week | customer
---|---|---|---
2022-10-06 | 10 | 40 | Paul
2022-10-06 | 10 | 40 | Edward
2022-10-01 | 10 | 39 | Erick
2022-09-26 | 9 | 39 | Divine
2022-09-23 | 9 | 38 | Alice
2022-09-21 | 9 | 38 | Evelyn
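I think the simplest approach is to build your own grouper in a subquery and then use it to get the count. COUNT(DISTINCT ...) combined with ORDER BY in a window function is currently not supported, so that approach would not work.
A possible query could be: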
WITH
  week_before AS (
    SELECT
      EXTRACT(WEEK FROM order_date) AS week,  -- to be sure this is the same week format
      month,
      CONCAT(week, '-', EXTRACT(WEEK FROM DATE_SUB(order_date, INTERVAL 7 DAY))) AS two_weeks,
      customer
    FROM
      `test`.`Basic`)
SELECT
  two_weeks,
  COUNT(DISTINCT customer) AS unique_customer
FROM
  week_before
GROUP BY
  two_weeks
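Note that `two_weeks` in the query above is computed from a single order's week, so grouping by it produces the same groups as grouping by `week` alone. If the goal is a rolling count (each week combined with the week before it), one possible sketch is to assign every order to both windows it falls into and then count distinct customers per window. This assumes that reading of the question. The names `tbl`, `expanded`, and `window_week` are illustrative, and the sample rows are inlined so the query runs on its own in BigQuery.

```sql
WITH tbl AS (
  SELECT DATE '2022-10-06' AS order_date, 'Paul' AS customer
  UNION ALL SELECT DATE '2022-10-06', 'Edward'
  UNION ALL SELECT DATE '2022-10-01', 'Erick'
  UNION ALL SELECT DATE '2022-09-26', 'Divine'
  UNION ALL SELECT DATE '2022-09-23', 'Alice'
  UNION ALL SELECT DATE '2022-09-21', 'Evelyn'
),
expanded AS (
  -- Each order belongs to two rolling windows: the one whose later week is
  -- the order's own week, and the one whose later week is the following week.
  SELECT
    customer,
    window_week
  FROM
    tbl,
    UNNEST([
      DATE_TRUNC(order_date, WEEK),
      DATE_ADD(DATE_TRUNC(order_date, WEEK), INTERVAL 1 WEEK)
    ]) AS window_week
)
SELECT
  window_week,  -- Sunday that starts the later week of the two-week window
  COUNT(DISTINCT customer) AS unique_customers_2weeks
FROM
  expanded
GROUP BY
  window_week
ORDER BY
  window_week
```

Windows at the edges of the data only contain one week of orders, so they may need to be filtered out depending on how the result is used.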
Window functions are the right tool here. To get the 2-week buckets, we first extract the week number of the year:
mod(extract(week from order_date),2)
If the week number is odd (modulo 2), we add one week. Then we truncate to the start of that (even) week.
date_trunc(date_add(order_date,interval mod(extract(week from order_date),2) week),week )
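To make the arithmetic concrete, here is a small sketch that exposes the intermediate values for three of the sample dates. It assumes BigQuery's default `WEEK` part (weeks start on Sunday, and days before the first Sunday of the year are week 0). The column names `week_no`, `weeks_to_add`, and `bucket_start` are only illustrative.

```sql
SELECT
  order_date,
  EXTRACT(WEEK FROM order_date) AS week_no,
  MOD(EXTRACT(WEEK FROM order_date), 2) AS weeks_to_add,  -- 1 for odd weeks, 0 for even weeks
  DATE_TRUNC(
    DATE_ADD(order_date, INTERVAL MOD(EXTRACT(WEEK FROM order_date), 2) WEEK),
    WEEK
  ) AS bucket_start
FROM
  UNNEST([DATE '2022-10-06', DATE '2022-10-01', DATE '2022-09-23']) AS order_date
ORDER BY
  order_date DESC
-- 2022-10-06: week 40, add 0 weeks, bucket_start 2022-10-02
-- 2022-10-01: week 39, add 1 week,  bucket_start 2022-10-02
-- 2022-09-23: week 38, add 0 weeks, bucket_start 2022-09-18
```

Each bucket is therefore labelled by the Sunday that starts the even week and covers the odd week before it plus that even week.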
with tbl as
(Select date("2022-10-06") as order_date, "Paul" as customer
union all select date("2022-10-06"),"Edward"
union all select date("2022-10-01"),"Erick"
union all select date("2022-09-26"),"Divine"
union all select date("2022-09-23"),"Alice"
union all select date("2022-09-21"),"Evelyn"
)
select *,
date_trunc(order_date,month) as month,
date_trunc(order_date,week) as week,
COUNT(DISTINCT customer) OVER week2 as customer_2weeks,
string_agg(cast(order_date as string)) over week2 as list_2weeks,
from tbl
window week2 as (partition by date_trunc(date_add(order_date,interval mod(extract(week from order_date),2) week),week ))
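If one row per 2-week bucket is enough, rather than the count repeated on every order row, the same bucket expression can be used in a plain `GROUP BY`. This is a sketch of that variant, not part of the original answer. The names `bucketed` and `bucket_start` are illustrative.

```sql
WITH tbl AS (
  SELECT DATE '2022-10-06' AS order_date, 'Paul' AS customer
  UNION ALL SELECT DATE '2022-10-06', 'Edward'
  UNION ALL SELECT DATE '2022-10-01', 'Erick'
  UNION ALL SELECT DATE '2022-09-26', 'Divine'
  UNION ALL SELECT DATE '2022-09-23', 'Alice'
  UNION ALL SELECT DATE '2022-09-21', 'Evelyn'
),
bucketed AS (
  SELECT
    customer,
    -- Odd weeks are shifted forward one week, then truncated to the Sunday
    -- that starts the even week of the pair.
    DATE_TRUNC(
      DATE_ADD(order_date, INTERVAL MOD(EXTRACT(WEEK FROM order_date), 2) WEEK),
      WEEK
    ) AS bucket_start
  FROM
    tbl
)
SELECT
  bucket_start,
  COUNT(DISTINCT customer) AS customer_2weeks
FROM
  bucketed
GROUP BY
  bucket_start
ORDER BY
  bucket_start
-- 2022-09-18: 2 customers (Alice, Evelyn)
-- 2022-10-02: 4 customers (Paul, Edward, Erick, Divine)
```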
The first days of a year are counted toward the last week of the previous year:
select order_date,
extract(isoweek from order_date),
date_trunc(date_add(order_date,interval mod(extract(week from order_date),2) week),week)
from
unnest(generate_date_array(date("2021-12-01"),date("2023-01-14"))) order_date
order by 1
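For a quicker spot check than scanning the full date range, the following sketch looks only at the 2021/2022 boundary. It again assumes BigQuery's default Sunday-based `WEEK` part. The column names are illustrative.

```sql
SELECT
  order_date,
  EXTRACT(WEEK FROM order_date) AS week_no,
  DATE_TRUNC(
    DATE_ADD(order_date, INTERVAL MOD(EXTRACT(WEEK FROM order_date), 2) WEEK),
    WEEK
  ) AS bucket_start
FROM
  UNNEST([
    DATE '2021-12-25', DATE '2021-12-31', DATE '2022-01-01', DATE '2022-01-02'
  ]) AS order_date
ORDER BY
  order_date
-- 2021-12-25: week 51, bucket_start 2021-12-26
-- 2021-12-31: week 52, bucket_start 2021-12-26
-- 2022-01-01: week 0,  bucket_start 2021-12-26
-- 2022-01-02: week 1,  bucket_start 2022-01-09
```

Here 2022-01-01 falls in week 0 and lands in the same bucket as the last days of 2021, while the first Sunday of 2022 opens the next bucket.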