I have a fairly complex problem, and I'm not even sure it can be done with Presto/SQL alone, so I'd appreciate any input.
Setup: I have an orders table (table1):
order_id | customer_id | order_date | blahblahblah....
--------------------------------------------------
11111 | 5432567 | 2018-12-16 | ..........
10002 | 6543212 | 2019-01-21 | ..........
22222 | 3456788 | 2018-11-09 | ..........
I have another table (table2):
customer_id | customer_rating | as_of_date |
--------------------------------------------
5432567 | A- | 2019-02-04 |
6543212 | B+ | 2019-02-04 |
It is updated daily, and I have to query it by specifying as_of_date, like this:
select * from table2
where customer_id = '6543212'
and as_of_date = '2019-02-04' -- or whatever date
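Now here's the problem. I want to create a table that, for each order_id in table1, picks the customer_rating for that customer_id both before the order_date in table1 (for example, when as_of_date = order_date - 1) and after the order date (say, as_of_date = today).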
To make it a bit clearer, here's my attempt at a mock-up of the table:
customer_id | order_id | order_date | customer_rating_before | customer_rating_today
------------------------------------------------------------------------------------
5432567 | 11111 | 2018-12-16 | A+ | A-
6543212 | 10002 | 2019-01-21 | B+ | B+
3456788 | 22222 | 2018-11-09 | C | B
Any ideas?!
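You can do this with a trick. Use union all to combine the two tables. Then use window functions to get the previous and next rating dates for every row, using a cumulative max() and min().
With this information, you can use another window function to get the ratings, and finally filter the rows to keep only those that were originally in orders: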
select ot.*
from (select ot.*,
             -- rating recorded on the previous / next rating date for this customer
             max(rating) over (partition by customer_id, prev_rating_date) as prev_rating,
             max(rating) over (partition by customer_id, next_rating_date) as next_rating
      from (select ot.*,
                   -- cumulative max(): latest rating date at or before this row
                   max(case when rating is not null then order_date end) over (partition by customer_id order by order_date asc) as prev_rating_date,
                   -- cumulative min(): earliest rating date at or after this row
                   min(case when rating is not null then order_date end) over (partition by customer_id order by order_date desc) as next_rating_date
            from ((select order_id, customer_id, order_date, NULL as rating
                   from orders  -- table1 in the question
                  ) union all
                  (select NULL, customer_id, as_of_date, customer_rating
                   from table2
                  )
                 ) ot
           ) ot
     ) ot
where rating is null;
To tidy up the combined result, drop the * and list the column names explicitly.
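To see what the cumulative max()/min() step does on its own, here is a small self-contained sketch in Presto SQL. The inline VALUES rows are hypothetical stand-ins for the unioned orders and ratings of a single customer (the 2018-12-15 rating in particular is made up for illustration), so treat it as a demonstration of the window logic rather than the final query.

```sql
-- Hypothetical unioned rows for one customer: rating rows have no order_id,
-- order rows have no rating.
with ot as (
    select *
    from (values
        (null,  5432567, date '2018-12-15', 'A+'),   -- rating row (made up)
        (11111, 5432567, date '2018-12-16', null),   -- order row
        (null,  5432567, date '2019-02-04', 'A-')    -- rating row
    ) as t (order_id, customer_id, order_date, rating)
)
select order_id,
       order_date,
       rating,
       -- latest rating date at or before this row
       max(case when rating is not null then order_date end)
           over (partition by customer_id order by order_date asc)  as prev_rating_date,
       -- earliest rating date at or after this row
       min(case when rating is not null then order_date end)
           over (partition by customer_id order by order_date desc) as next_rating_date
from ot
order by order_date;
```

For the order row (order_id 11111) this should give prev_rating_date = 2018-12-15 and next_rating_date = 2019-02-04, which are then the keys the outer max(rating) windows use to pick up the ratings. Note that a rating dated the same day as an order counts as both "previous" and "next" under this ordering.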
Try this:
SELECT
table1.customer_id,
table1.order_id,
table1.order_date,
table1.customer_rating as customer_rating_before,
table2.customer_rating as customer_rating_after
FROM
table1, table2
WHERE
table1.customer_id = table2.customer_id
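This combines the two tables, the old one (table1) and the new one (table2). It produces two customer_rating columns: one from the old table (table1) and one from the new table (table2).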
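If the rule from the question is taken literally (as_of_date = order_date - 1 for the earlier rating, and the latest as_of_date for the current one), a join against table2 done twice may be enough. The sketch below is only that, a sketch: it assumes order_date and as_of_date are DATE columns and that table2 holds a snapshot for every day needed. The inline VALUES rows (including the 2018-12-15 rating) are made up for illustration, and a fixed date stands in for current_date so the example is reproducible.

```sql
-- Hypothetical inline copies of the two tables; drop these CTEs to run
-- against the real table1 / table2.
with table1 as (
    select *
    from (values
        (11111, 5432567, date '2018-12-16')
    ) as t (order_id, customer_id, order_date)
),
table2 as (
    select *
    from (values
        (5432567, 'A+', date '2018-12-15'),   -- made-up snapshot, day before the order
        (5432567, 'A-', date '2019-02-04')    -- snapshot for "today"
    ) as t (customer_id, customer_rating, as_of_date)
)
select o.customer_id,
       o.order_id,
       o.order_date,
       rb.customer_rating as customer_rating_before,
       rt.customer_rating as customer_rating_today
from table1 o
-- rating as of the day before the order
left join table2 rb
       on rb.customer_id = o.customer_id
      and rb.as_of_date = o.order_date - interval '1' day
-- rating as of "today"; replace the literal with current_date in practice
left join table2 rt
       on rt.customer_id = o.customer_id
      and rt.as_of_date = date '2019-02-04';
```

The left joins keep every order even when one of the two snapshots is missing, in which case the corresponding rating column comes back as NULL.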