Finding the time difference between distinct IDs in MySQL



For each user, I want to find the difference in days between the latest and the second-latest distinct order_id.

The expected output is:

user_id | order_diff
1    |     1
3    |     7
8    |     1

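order_diff is the difference in days between the two distinct order_ids. If a user does not have two distinct order_ids (as is the case for user_id 9), no row is returned for that user.

In this case, the order_diff for user_id 1 is 1, because the day difference between his two most recent distinct order_ids is 1. However, user_id 9 has no order_diff, because he does not have 2 distinct order_ids.

Here is the dataset: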

user_id order_id    order_time
1       208965785   2016-12-15 17:14:13
1       201765785   2016-12-14 17:19:05
1       203932785   2016-12-13 20:41:30
1       209612785   2016-12-14 20:14:32
1       208112785   2016-12-14 20:27:08
1       205525785   2016-12-14 17:01:26
1       208812785   2016-12-14 20:18:23
1       206432785   2016-12-11 20:32:20
1       206698785   2016-12-14 10:50:15
2       209524795   2016-11-26 18:06:21
3       206529925   2016-10-01 10:43:57
3       203729925   2016-10-08 10:43:11
4       204876145   2016-09-24 10:23:49
5       203363157   2016-07-13 23:56:43
6       207784875   2017-01-04 12:21:21
7       206437177   2016-06-25 02:40:33
8       202819645   2016-09-09 11:47:27
8       202819645   2016-09-09 11:47:27
8       202819646   2016-09-08 11:47:27
9       205127187   2016-06-05 22:21:18
9       205127187   2016-06-05 22:21:18
11      207874877   2016-06-17 16:49:44
12      204927595   2016-11-28 23:05:40

This is the code I am currently using:

SELECT e1.user_id,datediff(e1.order_time,e2.time), e1.order_id FROM
sales e1
JOIN
sales e2
ON
e1.user_id=e2.user_id
AND
e1.order_id = (SELECT distinct order_id FROM sales temp1 WHERE temp1.order_id =e1.order_id ORDER BY order_time DESC LIMIT 1)
AND
e2.order_id = (SELECT distinct order_id FROM sales temp2 WHERE temp2.order_id=e2.order_id ORDER BY order_time DESC LIMIT 1 OFFSET 1)

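My query does not produce the desired output, and it also ignores the case where the order_ids are the same.

Edit: I would also like the query to scale to a larger dataset, where the second most recent order_time may not be min(order_time).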
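As a reference point for checking any query against the expected output, the rule described in the question can be sketched outside SQL. This is only an illustration: the helper name `expected_order_diff` is hypothetical, the rows are a subset copied from the dataset above, and the day difference compares calendar dates only, which is how MySQL's DATEDIFF behaves.

```python
from datetime import datetime

# A subset of the question's rows: (user_id, order_id, order_time)
ROWS = [
    (1, 208965785, "2016-12-15 17:14:13"),
    (1, 208112785, "2016-12-14 20:27:08"),
    (1, 206432785, "2016-12-11 20:32:20"),
    (3, 206529925, "2016-10-01 10:43:57"),
    (3, 203729925, "2016-10-08 10:43:11"),
    (8, 202819645, "2016-09-09 11:47:27"),
    (8, 202819645, "2016-09-09 11:47:27"),
    (8, 202819646, "2016-09-08 11:47:27"),
    (9, 205127187, "2016-06-05 22:21:18"),
    (9, 205127187, "2016-06-05 22:21:18"),
]


def expected_order_diff(rows):
    """Hypothetical helper: day gap between the two most recent distinct order_ids per user."""
    latest = {}  # (user_id, order_id) -> most recent order_time of that order
    for user_id, order_id, ts in rows:
        t = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        key = (user_id, order_id)
        if key not in latest or t > latest[key]:
            latest[key] = t

    per_user = {}  # user_id -> one order_time per distinct order_id
    for (user_id, _), t in latest.items():
        per_user.setdefault(user_id, []).append(t)

    result = {}
    for user_id, times in per_user.items():
        if len(times) < 2:
            continue  # fewer than two distinct order_ids: no row (like user_id 9)
        times.sort(reverse=True)
        # Compare calendar dates only, ignoring the time of day
        result[user_id] = (times[0].date() - times[1].date()).days
    return result


print(expected_order_diff(ROWS))
```

For these rows the sketch yields entries for user_id 1, 3 and 8 only; user_id 9 is skipped because both of its rows share one order_id.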

Based on the fiddle:

select user_id,
       datediff(max(order_time),
                ( -- Scalar Subquery to get the 2nd largest order_time
                  select max(order_time)
                  from orders as o2
                  where o2.user_id = o.user_id             -- same user
                    and o2.order_time < max(o.order_time)  -- but not the max time
                )
       ) as diff
from orders as o
group by user_id
having diff is not null -- if there's no 2nd largest time diff will be NULL

The following will work:

Schema (MySQL v5.7)

CREATE TABLE orders
(`user_id` int, `order_id` int, `order_time` datetime)
;
INSERT INTO orders
(`user_id`, `order_id`, `order_time`)
VALUES
(1,208965785,'2016-12-15 17:14:13'),
(1,201765785,'2016-12-14 17:19:05'),
(1,203932785,'2016-12-13 20:41:30'),
(1,209612785,'2016-12-14 20:14:32'),
(1,208112785,'2016-12-14 20:27:08'),
(1,205525785,'2016-12-14 17:01:26'),
(1,208812785,'2016-12-14 20:18:23'),
(1,206432785,'2016-12-11 20:32:20'),
(1,206698785,'2016-12-14 10:50:15'),
(2,209524795,'2016-11-26 18:06:21'),
(3,206529925,'2016-10-01 10:43:57'),
(3,203729925,'2016-10-08 10:43:11'),
(4,204876145,'2016-09-24 10:23:49'),
(5,203363157,'2016-07-13 23:56:43'),
(6,207784875,'2017-01-04 12:21:21'),
(7,206437177,'2016-06-25 02:40:33'),
(8,202819645,'2016-09-09 11:47:27'),
(8,202819645,'2016-09-09 11:47:27'),
(8,202819646,'2016-09-08 11:47:27'),
(9,205127187,'2016-06-05 22:21:18'),
(9,205127187,'2016-06-05 22:21:18'),
(11,207874877,'2016-06-17 16:49:44'),
(12,204927595,'2016-11-28 23:05:40');

Query #1

SELECT dt2.user_id,
       MIN(datediff(dt2.latest_order_time,
                    dt2.second_latest_order_time)) AS order_diff
FROM (
  SELECT o.user_id,
         o.order_time AS latest_order_time,
         (SELECT o2.order_time
          FROM orders AS o2
          WHERE o2.user_id = o.user_id AND
                o2.order_id <> o.order_id
          ORDER BY o2.order_time DESC LIMIT 1) AS second_latest_order_time
  FROM orders AS o
  JOIN (SELECT user_id, MAX(order_time) AS latest_order_time
        FROM orders
        GROUP BY user_id) AS dt
    ON dt.user_id = o.user_id AND
       dt.latest_order_time = o.order_time
) AS dt2
WHERE dt2.second_latest_order_time IS NOT NULL
GROUP BY dt2.user_id;

| user_id | order_diff |
| ------- | ---------- |
| 1       | 1          |
| 3       | 7          |
| 8       | 1          |

View on DB Fiddle


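Details:

  • In a sub-select query (derived table), we determine the maximum order_time for each user_id. We can alias it as latest_order_time.
  • We join this result set to the orders table. This lets us consider only the rows that hold the maximum order_time for each user_id.
  • Next, we use a correlated subquery to determine the maximum order_time for the same user among the remaining order_id values. We can alias it as second_latest_order_time.
  • Finally, we use this as a derived table again, remove every case where second_latest_order_time is NULL, and compute datediff() for the remaining cases.
  • A final GROUP BY is needed because your data contains multiple entries for the same latest order (for example, the latest order of user_id 8 appears twice).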
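The steps above can be traced in a short Python sketch. It is an illustration only, not the answer's code: the function name `query_1_steps` is hypothetical, and it uses just the rows of user_id 8 and 9 from the sample data, since those show both the duplicate latest order and the user with a single order_id.

```python
from datetime import datetime


def parse(ts):
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")


# Rows of user_id 8 and 9 from the sample data: (user_id, order_id, order_time)
ORDERS = [
    (8, 202819645, parse("2016-09-09 11:47:27")),
    (8, 202819645, parse("2016-09-09 11:47:27")),
    (8, 202819646, parse("2016-09-08 11:47:27")),
    (9, 205127187, parse("2016-06-05 22:21:18")),
    (9, 205127187, parse("2016-06-05 22:21:18")),
]


def query_1_steps(orders):
    # Step 1: latest order_time per user (the derived table dt)
    latest = {}
    for user_id, _, t in orders:
        latest[user_id] = max(latest.get(user_id, t), t)

    # Step 2: keep only the rows that sit at that latest time (the join)
    top_rows = [row for row in orders if row[2] == latest[row[0]]]

    # Step 3: for each such row, the latest time among the user's OTHER order_ids
    dt2 = []
    for user_id, order_id, t in top_rows:
        others = [t2 for u2, o2, t2 in orders if u2 == user_id and o2 != order_id]
        dt2.append((user_id, t, max(others) if others else None))

    # Step 4: drop rows with no second order, then take the calendar-day difference
    diffs = [(u, (t.date() - s.date()).days) for u, t, s in dt2 if s is not None]

    # Step 5: a duplicated latest order produces repeated rows; MIN per user collapses them
    grouped = {}
    for u, d in diffs:
        grouped[u] = min(grouped.get(u, d), d)
    return diffs, grouped


rows_before_group_by, order_diff = query_1_steps(ORDERS)
print(rows_before_group_by)
print(order_diff)
```

Before the grouping step, user_id 8 appears twice because its latest order is stored in two rows, which is why the last step is needed. user_id 9 drops out at step 4, because no other order_id exists for it.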

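Here is a simpler solution. Note that it matches the expected output only when a user has at most two distinct order times; for user_id 1 in the sample data it returns 4 instead of 1: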

SELECT user_id, 
DATEDIFF(MAX(order_time), MIN(order_time)) as order_diff
FROM orders
GROUP BY user_id
HAVING order_diff > 0;

Here is a link to test it.
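A quick check with the distinct order dates of user_id 1 shows how the MAX/MIN spread in the query above differs from the gap between the two most recent orders, which is the case raised in the question's edit. This is a sketch with the dates typed in by hand from the sample data; the variable names are illustrative.

```python
from datetime import date

# Distinct order dates of user_id 1 in the sample data
user_1_dates = [
    date(2016, 12, 15),
    date(2016, 12, 14),
    date(2016, 12, 13),
    date(2016, 12, 11),
]

# What DATEDIFF(MAX(order_time), MIN(order_time)) measures: newest vs. oldest
max_min_diff = (max(user_1_dates) - min(user_1_dates)).days

# What the question asks for: newest vs. second newest
newest, second_newest = sorted(user_1_dates, reverse=True)[:2]
latest_two_diff = (newest - second_newest).days

print(max_min_diff, latest_two_diff)
```

The two values agree only when the oldest order is also the second newest, that is, when the user has exactly two distinct order dates.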
