How to select only records that exist on every date in a given date range



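I want to select a user's data only if rows exist on every date in the range, together with the number of (duplicate) rows on each date.

For example, my table data is: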

user_id row_created
8SRWS3hMR 2020-12-14 00:13:31
8SRWS3hMR 2020-12-14 00:35:06
8SRWS3hMR 2020-12-14 12:11:37
8SRWS3hMR 2020-12-14 13:16:27
8SRWS3hMR 2020-12-14 16:30:00
8SRWS3hMR 2020-12-14 19:25:11
8SRWS3hMR 2020-12-14 19:27:07
8SRWS3hMR 2020-12-15 17:14:06
8SRWS3hMR 2020-12-16 14:53:54

You can use the HAVING clause as follows:

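SELECT USER_ID,
       COUNT(*) AS TOTAL,
       DATE(ROW_CREATED) AS ROW_DATE
FROM REWARD
WHERE USER_ID = '8SRWS3hMR'
  AND DATE(ROW_CREATED) BETWEEN '2020-12-14' AND '2020-12-17'
GROUP BY USER_ID, DATE(ROW_CREATED)
-- HAVING is evaluated once per group, so the distinct-day count must come
-- from a subquery over the whole range rather than from the current group.
HAVING (SELECT COUNT(DISTINCT DATE(R2.ROW_CREATED))
        FROM REWARD R2
        WHERE R2.USER_ID = '8SRWS3hMR'
          AND DATE(R2.ROW_CREATED) BETWEEN '2020-12-14' AND '2020-12-17')
       = DATEDIFF('2020-12-17', '2020-12-14') + 1;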

Update:

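SELECT *
FROM (
    SELECT USER_ID,
           COUNT(*) AS TOTAL,
           DATE(ROW_CREATED) AS ROW_DATE,
           -- After GROUP BY there is one row per (user, day), so counting
           -- rows in the window gives the number of distinct days. MySQL
           -- does not allow DISTINCT inside a window aggregate.
           COUNT(*) OVER (PARTITION BY USER_ID) AS CNT
    FROM REWARD
    WHERE USER_ID = '8SRWS3hMR'
      AND DATE(ROW_CREATED) BETWEEN '2020-12-14' AND '2020-12-17'
    GROUP BY USER_ID, DATE(ROW_CREATED)
) T
WHERE CNT = DATEDIFF('2020-12-17', '2020-12-14') + 1;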
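
To check the coverage idea against the sample rows from the question, here is a minimal sketch that runs the same kind of query through Python's built-in sqlite3 module. SQLite is an assumption made only so the example is self-contained (the queries above target MySQL). SQLite has no DATEDIFF, so the day span is computed with julianday() instead. The helper names make_db and daily_counts are hypothetical.

```python
import sqlite3

# Sample rows from the question (user_id, row_created).
ROWS = [
    ("8SRWS3hMR", "2020-12-14 00:13:31"),
    ("8SRWS3hMR", "2020-12-14 00:35:06"),
    ("8SRWS3hMR", "2020-12-14 12:11:37"),
    ("8SRWS3hMR", "2020-12-14 13:16:27"),
    ("8SRWS3hMR", "2020-12-14 16:30:00"),
    ("8SRWS3hMR", "2020-12-14 19:25:11"),
    ("8SRWS3hMR", "2020-12-14 19:27:07"),
    ("8SRWS3hMR", "2020-12-15 17:14:06"),
    ("8SRWS3hMR", "2020-12-16 14:53:54"),
]

# Per-day counts, returned only when the number of distinct days with data
# equals the number of days in the requested range (inclusive).
QUERY = """
SELECT user_id, date(row_created) AS day, COUNT(*) AS total
FROM reward
WHERE user_id = :uid
  AND date(row_created) BETWEEN :start AND :end
  AND (SELECT COUNT(DISTINCT date(row_created))
       FROM reward
       WHERE user_id = :uid
         AND date(row_created) BETWEEN :start AND :end)
      = CAST(julianday(:end) - julianday(:start) AS INTEGER) + 1
GROUP BY user_id, date(row_created)
ORDER BY day
"""


def make_db():
    """Create an in-memory table holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE reward (user_id TEXT, row_created TEXT)")
    conn.executemany("INSERT INTO reward VALUES (?, ?)", ROWS)
    return conn


def daily_counts(conn, uid, start, end):
    """Return per-day counts, or [] unless every day in [start, end] has a row."""
    params = {"uid": uid, "start": start, "end": end}
    return conn.execute(QUERY, params).fetchall()


if __name__ == "__main__":
    conn = make_db()
    print(daily_counts(conn, "8SRWS3hMR", "2020-12-14", "2020-12-16"))
    print(daily_counts(conn, "8SRWS3hMR", "2020-12-14", "2020-12-17"))
```

With the sample data, the range 2020-12-14 to 2020-12-16 returns three per-day rows (7, 1 and 1 rows), while 2020-12-14 to 2020-12-17 returns nothing because the user has no row on 2020-12-17.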

This is a bit tricky because you need to summarize by date. You can use window functions:

SELECT r.*
FROM (SELECT rw.USER_ID, DATE(rw.ROW_CREATED) AS date, COUNT(*) AS TOTAL,
             MIN(rw.ROW_CREATED) AS ROW_CREATED,
             -- one grouped row per day, so this counts the days with data
             COUNT(*) OVER (PARTITION BY rw.USER_ID) AS num_days,
             DATEDIFF(params.END_DATE, params.START_DATE) + 1 AS total_days
      FROM REWARD rw CROSS JOIN
           (SELECT DATE('2020-12-14') AS START_DATE, DATE('2020-12-17') AS END_DATE
           ) params
      WHERE rw.USER_ID = '8SRWS3hMR' AND
            rw.ROW_CREATED >= params.START_DATE AND
            -- strict "<" so midnight of the following day is excluded
            rw.ROW_CREATED < params.END_DATE + INTERVAL 1 DAY
      GROUP BY rw.USER_ID, DATE(rw.ROW_CREATED), params.START_DATE, params.END_DATE
     ) r
WHERE num_days = total_days;

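The subquery summarizes by day. It includes two counts:

  • the number of distinct days for the USER_ID
  • the total number of days in the range

To avoid typing the same dates more than once, the dates are defined in a subquery. Also note that the date comparison has been rearranged: instead of wrapping the column in DATE(), it uses inequalities on the column itself, which lets the expression make better use of an index.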
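
The rearranged comparison can be sketched as follows, again using Python's built-in sqlite3 module as an assumed stand-in for MySQL. The row stamped 2020-12-18 00:00:00 is hypothetical and is there only to show why the upper bound should be a strict "less than the day after the end date". The index name and the count_rows helper are also made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reward (user_id TEXT, row_created TEXT)")
# An index on the raw column can only help when the column is compared
# directly, not when it is wrapped in a function such as date().
conn.execute("CREATE INDEX idx_reward_created ON reward (row_created)")
conn.executemany(
    "INSERT INTO reward VALUES (?, ?)",
    [
        ("8SRWS3hMR", "2020-12-14 00:13:31"),
        ("8SRWS3hMR", "2020-12-16 14:53:54"),
        # Hypothetical row at exactly midnight, one day after the range ends.
        ("8SRWS3hMR", "2020-12-18 00:00:00"),
    ],
)


def count_rows(where_clause, params):
    """Count rows matching a WHERE clause written with named parameters."""
    sql = "SELECT COUNT(*) FROM reward WHERE " + where_clause
    return conn.execute(sql, params).fetchone()[0]


params = {"start": "2020-12-14", "end": "2020-12-17"}

# Function applied to the column: correct, but not index-friendly.
wrapped = count_rows("date(row_created) BETWEEN :start AND :end", params)

# Half-open range on the bare column: same rows, index-friendly.
half_open = count_rows(
    "row_created >= datetime(:start) AND row_created < datetime(:end, '+1 day')",
    params,
)

# Closed upper bound: also picks up the midnight row of the next day.
closed = count_rows(
    "row_created >= datetime(:start) AND row_created <= datetime(:end, '+1 day')",
    params,
)

print(wrapped, half_open, closed)
```

Here the wrapped and half-open forms both match two rows, while the closed upper bound matches three because it also includes the midnight row on 2020-12-18.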

SELECT user_id, COUNT(user_id) AS total, row_created
FROM reward
WHERE user_id = '8SRWS3hMR'
  AND DATE(row_created) >= '2020-12-14'
  AND DATE(row_created) <= '2020-12-17'
GROUP BY DATE(row_created)
HAVING COUNT(user_id) >= DATEDIFF(row_created, "2020-12-17")
