BigQuery row_number删除重复项



我想只保留表中具有最新时间戳的ID,是否有更优化和有效的方法来解决这个问题

a query that I try

SELECT * except(row_number)
FROM (
SELECT
*,
ROW_NUMBER()
OVER (PARTITION BY ID)
row_number
FROM employees 
)
WHERE row_number = 1

员工表:

ID    NAME  DEPARTMENT   UPDATED_AT
1     James IT           2019-05-21 12:13:14
1     James IT           2019-05-21 12:14:14 
1     James IT           2019-05-21 12:18:14
2     Pam   HR           2019-05-26 13:18:14
2     Pam   HR           2019-05-26 14:18:14
3     David IT           2019-06-22 14:18:14
3     David IT           2019-06-23 12:18:14
结果:

ID    NAME  DEPARTMENT   UPDATED_AT
1     James IT           2019-05-21 12:18:14
2     Pam   HR           2019-05-26 14:18:14
3     David IT           2019-06-23 12:18:14

您的子查询语句中缺少ORDER BY子句。

WITH
DATA AS (
SELECT
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY UPDATED_AT DESC) AS _row,
*
FROM
employees )
SELECT
* EXCEPT(_row)
FROM
DATA
WHERE
_row = 1
SELECT *
FROM employees 
WHERE TRUE
QUALIFY ROW_NUMBER() OVER (PARTITION BY ID ORDER BY UPDATED_AT DESC) = 1

最新更新