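I have a table containing our customers' orders, with two columns:
order_date: the order date (not unique, because each order may contain multiple products)
customer_id: the customer (not unique)
I want to calculate, for each customer, the number of orders up to the current order_date, but because order_date contains duplicates, the result is wrong.
I am using window functions on Postgres 11.2.
CREATE TABLE "public"."orders" (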
"order_date" timestamp,
"customer_id" integer
);
Insert data:
INSERT INTO "public"."orders"("order_date", "customer_id") VALUES('2018-12-13 20:45:24.571964', 402) RETURNING "order_date", "customer_id";
INSERT INTO "public"."orders"("order_date", "customer_id") VALUES('2018-12-13 20:45:24.571964', 402) RETURNING "order_date", "customer_id";
INSERT INTO "public"."orders"("order_date", "customer_id") VALUES('2018-10-12 20:08:39.635959', 466) RETURNING "order_date", "customer_id";
INSERT INTO "public"."orders"("order_date", "customer_id") VALUES('2018-11-04 22:15:14.905851', 483) RETURNING "order_date", "customer_id";
INSERT INTO "public"."orders"("order_date", "customer_id") VALUES('2018-11-04 22:15:14.905851', 483) RETURNING "order_date", "customer_id";
INSERT INTO "public"."orders"("order_date", "customer_id")
I used this query to try to produce what I want, but it does not work:
select *,COALESCE(COUNT(*) OVER (partition by orders.customer_id order by orders.order_date range between interval '100 years' PRECEDING AND
CURRENT ROW EXCLUDE CURRENT ROW),0) AS
customer_orders_count_up_to_now,
COALESCE(COUNT(*) OVER (partition by orders.customer_id order by
orders.order_date asc range BETWEEN interval '7 days' PRECEDING
AND CURRENT ROW EXCLUDE CURRENT ROW),0) AS
customer_orders_last_seven_days
from orders
I expect the output columns customer_orders_count_up_to_now and customer_orders_last_seven_days to be 0 for every row, but they are not, because of the duplicates in order_date.
If I understand correctly, you basically want count(distinct) as a window function. Postgres does not (yet) support that, but you can use select distinct in a subquery:
select o.*,
COALESCE(COUNT(*) OVER (partition by o.customer_id
order by o.order_date
range between interval '100 years' PRECEDING AND CURRENT ROW EXCLUDE CURRENT ROW),
0) AS customer_orders_count_up_to_now,
COALESCE(COUNT(*) OVER (partition by o.customer_id
order by o.order_date asc
range BETWEEN interval '7 days' PRECEDING AND CURRENT ROW EXCLUDE CURRENT ROW),
0) AS customer_orders_last_seven_days
from (SELECT DISTINCT o.customer_id, o.order_date from orders o) o
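As a quick check, the sketch below runs the same DISTINCT approach on the five complete sample rows from the question. The rows are supplied inline through a CTE (named sample_orders here purely for illustration), so it does not depend on the orders table. Note that the result has one row per distinct (customer_id, order_date) pair rather than one per original row.

```sql
-- Illustrative only: sample_orders is a hypothetical inline copy of the
-- five complete INSERT rows from the question.
WITH sample_orders (order_date, customer_id) AS (
    VALUES
        (timestamp '2018-12-13 20:45:24.571964', 402),
        (timestamp '2018-12-13 20:45:24.571964', 402),
        (timestamp '2018-10-12 20:08:39.635959', 466),
        (timestamp '2018-11-04 22:15:14.905851', 483),
        (timestamp '2018-11-04 22:15:14.905851', 483)
)
SELECT o.customer_id,
       o.order_date,
       -- other distinct orders of the same customer in the preceding 7 days
       COUNT(*) OVER (PARTITION BY o.customer_id
                      ORDER BY o.order_date
                      RANGE BETWEEN interval '7 days' PRECEDING AND CURRENT ROW
                      EXCLUDE CURRENT ROW) AS customer_orders_last_seven_days
FROM (SELECT DISTINCT customer_id, order_date FROM sample_orders) o
ORDER BY o.customer_id, o.order_date;
```

With this data each customer has a single distinct order_date, so every count should come out as 0.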
I found a solution, and I am sharing it here in case anyone else has the same problem:
select *,
       COALESCE(COUNT(*) OVER (partition by orders.customer_id
                               order by orders.order_date
                               range between interval '100 years' PRECEDING AND CURRENT ROW EXCLUDE CURRENT ROW), 0)
       -- subtract the other rows that share the current row's order_date
       - COALESCE(COUNT(*) OVER (partition by orders.customer_id, orders.order_date
                                 order by orders.order_date
                                 range between interval '100 years' PRECEDING AND CURRENT ROW EXCLUDE CURRENT ROW), 0)
       AS customer_orders_count_up_to_now,
       COALESCE(COUNT(*) OVER (partition by orders.customer_id
                               order by orders.order_date asc
                               range between interval '7 days' PRECEDING AND CURRENT ROW EXCLUDE CURRENT ROW), 0)
       - COALESCE(COUNT(*) OVER (partition by orders.customer_id, orders.order_date
                                 order by orders.order_date asc
                                 range between interval '7 days' PRECEDING AND CURRENT ROW EXCLUDE CURRENT ROW), 0)
       AS customer_orders_last_seven_days
from orders
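The idea is that, to remove the duplicates from the rolling count, we subtract from the computed rolling count the number of other records that share the same order_date.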
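The same subtraction can probably be expressed more directly with frame exclusion. In RANGE mode, rows with the same order_date are peers of the current row, and EXCLUDE CURRENT ROW removes only the row itself, which is why the duplicates were still counted. Here is a minimal sketch, assuming PostgreSQL 11 or later (where EXCLUDE GROUP is available), that drops the current row together with its peers:

```sql
-- Sketch: EXCLUDE GROUP removes the current row and every row with the same
-- order_date in the partition, so only strictly earlier rows are counted.
SELECT o.*,
       COUNT(*) OVER (PARTITION BY o.customer_id
                      ORDER BY o.order_date
                      RANGE BETWEEN interval '100 years' PRECEDING AND CURRENT ROW
                      EXCLUDE GROUP) AS customer_orders_count_up_to_now,
       COUNT(*) OVER (PARTITION BY o.customer_id
                      ORDER BY o.order_date
                      RANGE BETWEEN interval '7 days' PRECEDING AND CURRENT ROW
                      EXCLUDE GROUP) AS customer_orders_last_seven_days
FROM orders o;
```

Like the subtraction, this counts rows rather than distinct orders, so an earlier order with several products still contributes several to the count. COUNT(*) returns 0 rather than NULL for an empty frame, so the COALESCE wrapper is not needed here.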