Max stage by timestamp in SQL (Redshift)



I want to find the record associated with the maximum exited_on for each application_id (sample table below).

I started with the SQL below, but I get an error message telling me that my subquery has too many columns.

SELECT *
FROM application_stages
where application_stages.application_id = '91649746' and 
(application_stages.application_id, max(exited_on) in (select application_stages.application_id, max(exited_on) from application_stages group by application_stages.application_id))

Table 1

+----------------+-------+--------------------+----------------+------------------+------------------+
| requisition_id | order |     stage_name     | application_id |    entered_on    |    exited_on     |
+----------------+-------+--------------------+----------------+------------------+------------------+
| a              |     0 | Application Review |       91649746 | 6/8/2018 18:27   | 8/28/2018 22:04  |
| a              |     1 | Recruiter Screen   |       91649746 | 6/8/2018 18:27   | 6/21/2018 0:17   |
| a              |     2 | Phone Interview    |       91649746 | 6/21/2018 0:17   | 7/18/2018 12:17  |
| a              |     3 | Assessment         |       91649746 |                  |                  |
| a              |     4 | Interview          |       91649746 |                  |                  |
| a              |     5 | Interview 2        |       91649746 |                  |                  |
| a              |     6 | Interview 3        |       91649746 |                  |                  |
| a              |     7 | Offer              |       91649746 |                  |                  |
| a              |     0 | Application Review |       91991364 | 6/13/2018 14:21  | 6/19/2018 23:56  |
| a              |     1 | Recruiter Screen   |       91991364 | 6/19/2018 23:56  | 9/4/2018 14:01   |
| a              |     2 | Phone Interview    |       91991364 |                  |                  |
| a              |     3 | Assessment         |       91991364 |                  |                  |
| a              |     4 | Interview          |       91991364 |                  |                  |
| a              |     5 | Interview 2        |       91991364 |                  |                  |
| a              |     6 | Interview 3        |       91991364 |                  |                  |
| a              |     7 | Offer              |       91991364 |                  |                  |
| b              |     0 | Application Review |       96444221 | 8/8/2018 16:59   | 8/14/2018 5:42   |
| b              |     1 | Recruiter Screen   |       96444221 | 8/14/2018 5:42   | 10/16/2018 20:02 |
| b              |     2 | Phone Interview    |       96444221 |                  |                  |
| b              |     3 | Interview          |       96444221 | 10/16/2018 20:02 | 10/24/2018 4:27  |
| b              |     4 | Interview 2        |       96444221 | 10/24/2018 4:27  | 11/5/2018 22:38  |
| b              |     5 | Offer              |       96444221 |                  |                  |
+----------------+-------+--------------------+----------------+------------------+------------------+

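As already mentioned, IN only works with a single-column subquery. Note also that in the query as written, the row's closing parenthesis comes after the subquery rather than before `in`, so the left operand of IN is max(exited_on) alone, compared against a two-column subquery. To get what you want, you can use a join: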

SELECT t1.*
FROM application_stages t1
JOIN (
    SELECT application_id, MAX(exited_on) AS exited_on
    FROM application_stages
    GROUP BY application_id
) t2
USING (application_id, exited_on)
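
The join can be tried locally without a Redshift cluster. The following is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in, which is an assumption: SQLite is not Redshift and may differ in details. The rows are a subset of the sample table, the timestamps are rewritten as ISO-8601 text so that MAX compares them chronologically, and `latest_exits` is a hypothetical helper name.

```python
import sqlite3

# Subset of the sample table. Timestamps are rewritten as ISO-8601 text so
# that MAX() orders them chronologically in SQLite (assumption of this sketch).
ROWS = [
    ("a", 0, "Application Review", 91649746, "2018-06-08 18:27", "2018-08-28 22:04"),
    ("a", 1, "Recruiter Screen", 91649746, "2018-06-08 18:27", "2018-06-21 00:17"),
    ("a", 2, "Phone Interview", 91649746, "2018-06-21 00:17", "2018-07-18 12:17"),
    ("a", 3, "Assessment", 91649746, None, None),
    ("a", 0, "Application Review", 91991364, "2018-06-13 14:21", "2018-06-19 23:56"),
    ("a", 1, "Recruiter Screen", 91991364, "2018-06-19 23:56", "2018-09-04 14:01"),
    ("a", 2, "Phone Interview", 91991364, None, None),
]

# Same join as in the answer, with an explicit column list and ORDER BY
# so the output is deterministic.
QUERY = """
SELECT t1.application_id, t1.stage_name, t1.exited_on
FROM application_stages t1
JOIN (
    SELECT application_id, MAX(exited_on) AS exited_on
    FROM application_stages
    GROUP BY application_id
) t2 USING (application_id, exited_on)
ORDER BY t1.application_id
"""


def latest_exits():
    """Return one (application_id, stage_name, exited_on) row per application."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE application_stages (requisition_id TEXT, "order" INTEGER, '
        "stage_name TEXT, application_id INTEGER, entered_on TEXT, exited_on TEXT)"
    )
    conn.executemany("INSERT INTO application_stages VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in latest_exits():
        print(row)
```

Stages with an empty exited_on never match, because MAX ignores NULLs and a NULL does not compare equal in the join. If two stages of one application share the same maximum exited_on, both rows are returned.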

You can't use multiple columns with the "in" operator. More info: https://www.w3schools.com/sql/sql_in.asp

If you are looking for the most recently exited record for each application, you can use this.

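SELECT *
FROM
(
    SELECT a.*
    , RANK() OVER (PARTITION BY a.application_id ORDER BY a.exited_on DESC) AS max_exited
    FROM application_stages a
    -- Skip stages that have not been exited yet: with ORDER BY ... DESC,
    -- NULL exited_on values sort first and would otherwise get rank 1.
    WHERE a.exited_on IS NOT NULL
) ranked
WHERE max_exited = 1;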

This uses a window function. More info: https://docs.aws.amazon.com/redshift/latest/dg/r_Examples_of_rank_WF.html

If you want to handle null values when using window functions, you can refer to the following: https://stackoverflow.com/a/22308104/3294216
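
One portable way to deal with the NULLs is to filter them out before ranking, since the WHERE clause is applied before the window function is evaluated. The sketch below runs such a query with Python's built-in sqlite3 module as a local stand-in for Redshift (an assumption; it also needs SQLite 3.25 or newer for window functions). The rows are a reduced version of the sample table with ISO-8601 timestamps, and `latest_exit_ranked` is a hypothetical helper name.

```python
import sqlite3

# Reduced sample data: (application_id, stage_name, exited_on).
# None stands for a stage that has not been exited yet (NULL).
ROWS = [
    (91649746, "Application Review", "2018-08-28 22:04"),
    (91649746, "Recruiter Screen", "2018-06-21 00:17"),
    (91649746, "Assessment", None),
    (96444221, "Recruiter Screen", "2018-10-16 20:02"),
    (96444221, "Interview 2", "2018-11-05 22:38"),
    (96444221, "Offer", None),
]

# NULL exited_on rows are removed before RANK() is computed, so the
# result does not depend on where the engine sorts NULLs.
QUERY = """
SELECT application_id, stage_name, exited_on
FROM (
    SELECT a.*,
           RANK() OVER (PARTITION BY a.application_id
                        ORDER BY a.exited_on DESC) AS max_exited
    FROM application_stages a
    WHERE a.exited_on IS NOT NULL
) ranked
WHERE max_exited = 1
ORDER BY application_id
"""


def latest_exit_ranked():
    """Return the rank-1 (latest exited) stage for each application."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE application_stages "
        "(application_id INTEGER, stage_name TEXT, exited_on TEXT)"
    )
    conn.executemany("INSERT INTO application_stages VALUES (?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in latest_exit_ranked():
        print(row)
```

Because RANK gives tied rows the same rank, an application whose latest exited_on is shared by two stages would still return both of them.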
