我们正在跟踪我们的运输合作伙伴通过他们触发的webhook发送的订单状态。webhook每次触发时都会添加一行,因此每个订单都有多行关联
表格结构在此处输入图像描述
我们正在尝试创建一个sql查询来获得以下
查找"awb"的最后一行。获取该行中的current_status。如果current_status
是"PICKUP EXCEPTION
"、"OUT FOR PICKUP
"one_answers"PICKUP RESCHEDULED
"中的任何一个,则为该特定"awb"查找第一次出现这些状态的行检查awb的这些状态第一次出现和最后一次出现之间的天数并输出相差2天以上的awb。
这是我能够创建的查询。
WITH ranked_order_status AS (
SELECT os.*,
datediff(
now() ,
first_value(recived_at) over (partition by awb order by recived_at asc)
) as diff,
ROW_NUMBER() OVER (PARTITION BY awb ORDER BY recived_at desc) AS rn
FROM order_status AS os where current_status in ('PICKUP EXCEPTION', 'OUT FOR PICKUP', 'PICKUP RESCHEDULED')
)
SELECT * FROM ranked_order_status WHERE rn = 1 and diff > 2
不幸的是,这向我显示了所有具有这些状态的行的awb,而不仅仅是最后一次接收到的当前状态为"PICKUP EXCEPTION
"、"OUT FOR PICKUP
"one_answers"PICKUP RESCHEDULED
"的awb
知道我该怎么编辑吗?
因此,如果我理解正确,这应该是使用RANK((的分析函数的一个明显例子。
这将是我关于你提到的限制的方法:
WITH t1 AS (
SELECT os.*,
FIRST_VALUE(os.received_at) OVER(PARTITION BY os.awb
ORDER BY os.received_at) AS first_received_at
FROM order_status AS os
WHERE os.current_status IN ('PICKUP EXCEPTION', 'OUT FOR PICKUP', 'PICKUP RESCHEDULED')
),
t2 AS (SELECT RANK() OVER (PARTITION BY t1.awb
ORDER BY t1.received_at DESC) AS reverse_event_sequence,
DATE_DIFF(t1.received_at, t1.first_received_at, DAY) AS day_diff
t1.*
FROM t1
),
final AS (
SELECT *
FROM t2
WHERE t2.day_diff > 2
AND t2.reverse_event_sequence = 1
)
SELECT *
FROM final
基本上,你想首先获取每行received_at的第一个值,然后你想对每个awb的所有事件进行排序,并按降序排列,使最后一个事件始终为rank=1,然后你对日期差应用所需的约束:(
不过,我必须指出,没有数据样本也无济于事。如果对我的方法有任何反馈,我将不胜感激:(
您可以通过两种方式从最近的行中枚举行:
- 按awb划分,按received降序排列
- 按awb和status划分,按received降序排列
对于每个awb的最后一个状态,它们之间的差异将为零。您可以选择此项,然后聚合:
select awb, current_status,
min(received_at), max(received_at)
from (select os.*,
row_number() over (partition by awb order by received_at desc) as seqnum,
row_number() over (partition by awb, current_status order by received_at desc) as seqnum_2
from order_status os
) os
where seqnum = seqnum_2 and
current_status in ('PICKUP EXCEPTION', 'OUT FOR PICKUP', 'PICKUP RESCHEDULED')
group by awb, current_status
having max(received_at) > min(received_at) + interval 2 day;