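I am trying to compute the difference between two rows in a MySQL database.
I have a table containing ID, kilometers, date, car_id, car_driver, etc.
Since I don't always enter the information into the table in the right order, I can end up with data like this: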
ID | Kilometers | date | car_id | car_driver | ...
1 | 100 | 2012-05-04 | 1 | 1
2 | 200 | 2012-05-08 | 1 | 1
3 | 1000 | 2012-05-25 | 1 | 1
4 | 600 | 2012-05-16 | 1 | 1
With a SELECT statement I can sort the table correctly:
SELECT * FROM mytable ORDER BY car_driver ASC, car_id ASC, date ASC
This gives me:
ID | Kilometers | date | car_id | car_driver | ...
1 | 100 | 2012-05-04 | 1 | 1
2 | 200 | 2012-05-08 | 1 | 1
4 | 600 | 2012-05-16 | 1 | 1
3 | 1000 | 2012-05-25 | 1 | 1
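Now I would like to create a view that adds one extra piece of information: the number of kilometers since the last date. I would like to get something like this: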
ID | Kilometers | date | car_id | car_driver | number_km_since_last_date
1 | 100 | 2012-05-04 | 1 | 1 | 0
2 | 200 | 2012-05-08 | 1 | 1 | 100
4 | 600 | 2012-05-16 | 1 | 1 | 400
3 | 1000 | 2012-05-25 | 1 | 1 | 400
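I was thinking of using an INNER JOIN to do this, but I feel I can't join on my IDs because they are not in the right order.
Is there a way to achieve what I want?
Should I create a view with some kind of row_number and then use that in my INNER JOIN?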
SELECT
mt1.ID,
mt1.Kilometers,
mt1.date,
mt1.Kilometers - IFNULL(mt2.Kilometers, 0) AS number_km_since_last_date
FROM
myTable mt1
LEFT JOIN myTable mt2
ON mt2.Date = (
SELECT MAX(Date)
FROM myTable mt3
WHERE mt3.Date < mt1.Date
)
ORDER BY mt1.date
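The join above compares dates across the whole table and returns the full odometer reading for the earliest row. A possible per-car variant is sketched below. It assumes each car has at most one reading per date; the extra car_id conditions and the fallback to mt1.Kilometers are assumptions added here, not part of the query above.

```sql
-- Sketch only: previous reading per car, assuming (car_id, date) is unique.
SELECT
    mt1.ID,
    mt1.Kilometers,
    mt1.date,
    mt1.car_id,
    -- When there is no earlier row, fall back to the row's own value so the result is 0.
    mt1.Kilometers - IFNULL(mt2.Kilometers, mt1.Kilometers) AS number_km_since_last_date
FROM
    myTable mt1
    LEFT JOIN myTable mt2
        ON  mt2.car_id = mt1.car_id
        AND mt2.date = (
            -- Latest date strictly before the current row, for the same car.
            SELECT MAX(mt3.date)
            FROM myTable mt3
            WHERE mt3.car_id = mt1.car_id
              AND mt3.date < mt1.date
        )
ORDER BY mt1.car_id, mt1.date;
```

With the sample rows from the question this should give 0, 100, 400, 400 for car 1.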
Or, by emulating a lag() function through MySQL hackiness...
SET @kilo=0;
SELECT
mt1.ID,
mt1.Kilometers - @kilo AS number_km_since_last_date,
@kilo := mt1.Kilometers Kilometers,
mt1.date
FROM myTable mt1
ORDER BY mt1.date
In Postgres, Oracle and SQL Server 2012 this is quite simple, using the LAG() function:
SELECT
id, kilometers, date,
kilometers
- COALESCE( LAG(kilometers) OVER (ORDER BY date ASC, car_driver ASC, id ASC)
, kilometers)
AS number_km_since_last_date
FROM
mytable ;
In MySQL, we have to use some ugly constructions. Either an inline subquery (probably not the best performance):
SELECT
id, kilometers, date,
kilometers - COALESCE(
( SELECT p.kilometers
FROM mytable AS p
WHERE ( p.date = m.date AND p.car_driver = m.car_driver
AND p.id < m.id
OR p.date = m.date AND p.car_driver < m.car_driver
OR p.date < m.date
)
ORDER BY p.date DESC, p.car_driver DESC, p.id DESC
LIMIT 1
), kilometers)
AS number_km_since_last_date
FROM
mytable AS m ;
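Or a self-join (already provided by @Michael Fredrickson), or MySQL variables (also already provided).
If you want the counter to restart from 0 for every car_id, this would be done with PARTITION BY in many other DBMSs: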
SELECT
id, kilometers, date,
kilometers
- COALESCE( LAG(kilometers) OVER (PARTITION BY car_id
ORDER BY date ASC, car_driver ASC, id ASC)
, kilometers)
AS number_km_since_last_date
FROM
mytable ;
In MySQL it can be done like this:
SELECT
id, kilometers, date,
kilometers - COALESCE(
( SELECT p.kilometers
FROM mytable AS p
WHERE p.car_id = m.car_id
AND ( p.date = m.date AND p.car_driver = m.car_driver
AND p.id < m.id
OR p.date = m.date AND p.car_driver < m.car_driver
OR p.date < m.date
)
ORDER BY p.date DESC, p.car_driver DESC, p.id DESC
LIMIT 1
), kilometers)
AS number_km_since_last_date
FROM
mytable AS m ;
Here is one way to solve this using MySQL's window functions (available from MySQL 8.0):
SELECT
id,
kilometers,
date,
car_id,
car_driver,
COALESCE(
    kilometers - LAST_VALUE(kilometers) OVER (
        PARTITION BY car_id, car_driver
        ORDER BY date
        ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
    ),
    0
) AS number_km_since_last_date
FROM mytable
ORDER BY car_driver, car_id, date
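MySQL 8.0 also provides LAG() itself, so the same result can probably be written more directly. The sketch below recreates the question's sample rows so it is self-contained; the column types in the CREATE TABLE are assumptions, since the question does not show the schema.

```sql
-- Assumed schema; the real table has more columns.
CREATE TABLE mytable (
    id         INT PRIMARY KEY,
    kilometers INT,
    date       DATE,
    car_id     INT,
    car_driver INT
);

INSERT INTO mytable (id, kilometers, date, car_id, car_driver) VALUES
    (1,  100, '2012-05-04', 1, 1),
    (2,  200, '2012-05-08', 1, 1),
    (3, 1000, '2012-05-25', 1, 1),
    (4,  600, '2012-05-16', 1, 1);

SELECT
    id, kilometers, date, car_id, car_driver,
    -- LAG() is NULL on the first row of each partition;
    -- COALESCE substitutes the row's own value so the difference is 0.
    kilometers - COALESCE(
        LAG(kilometers) OVER (PARTITION BY car_id, car_driver ORDER BY date),
        kilometers
    ) AS number_km_since_last_date
FROM mytable
ORDER BY car_driver, car_id, date;
```

For the four sample rows this should yield 0, 100, 400, 400, matching the output requested in the question.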
Since the data is not sorted, I can only think of an inline subquery (not a good idea on a large table):
select t1.*,
t1.Kilometers - (select top 1 kilometers from mytable t2 where t2.date < t1.date order by t2.date desc) as number_km_since_last_date
from mytable t1
If your data were sorted, you could use a left join:
select t1.*,
t1.Kilometers - t2.Kilometers as number_km_since_last_date
from mytable t1
left join mytable t2
on t1.id = t2.id + 1
As you can probably tell, I prefer T-SQL, so you may need to adjust the syntax for MySQL.
Here is an example using a CURSOR (T-SQL syntax):
CREATE TABLE TEMP1
(
MyDate DATETIME,
MyQty INT
)
INSERT INTO TEMP1 VALUES ('01/08/17', 100)
INSERT INTO TEMP1 VALUES ('01/09/17', 120)
INSERT INTO TEMP1 VALUES ('01/10/17', 180)
DECLARE @LastDate DATETIME = NULL
DECLARE @LastQty INT = NULL
DECLARE @MyDate DATETIME = NULL
DECLARE @MyQty INT = NULL
DECLARE mycursor CURSOR FOR
SELECT MyDate, MyQty FROM TEMP1 ORDER BY MyDate
OPEN mycursor
FETCH NEXT FROM mycursor INTO @MyDate, @MyQty
WHILE @@FETCH_STATUS = 0
BEGIN
SELECT @MyDate, @MyQty - @LastQty
SET @LastDate = @MyDate
SET @LastQty = @MyQty
FETCH NEXT FROM mycursor INTO @MyDate, @MyQty
END
CLOSE mycursor
DEALLOCATE mycursor
With MySQL 8 you can use a CTE and the ROW_NUMBER window function to write a more readable query:
WITH cte_name AS (
SELECT
ROW_NUMBER() OVER (ORDER BY update_time) as row_num,
id,
other_data,
update_time
FROM table_name WHERE condition = 'some_condition'
)
SELECT t2.id, t2.other_data, TIMEDIFF(t2.update_time, t1.update_time) AS time_taken
FROM
cte_name t1
JOIN cte_name t2 ON t1.row_num = t2.row_num-1
ORDER BY time_taken;
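In this example I am trying to get the difference between datetime values.
- The idea is to use the ROW_NUMBER window function to assign an incrementing number to each row after ordering by update_time
- The CTE lets us write the subquery once instead of repeating the same code
- We join the CTE with itself. The join condition is basically: the nth row of the second subquery is joined with the (n-1)th row of the first subquery (this also means the first row will vanish from the result set; if needed, it can be added back at the beginning using UNION)
Here are some good tutorials on: CTEs (Common Table Expressions), ROW_NUMBER, and window functions in general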
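As an alternative to UNION, the first row can probably be kept by turning the self-join into a LEFT JOIN that starts from the "current" row. This is a sketch under assumptions: the table definition and sample rows are hypothetical and only reuse the placeholder names from the query above, and the WHERE filter is left out.

```sql
-- Hypothetical table and data, for illustration only.
CREATE TABLE table_name (
    id          INT PRIMARY KEY,
    other_data  VARCHAR(20),
    update_time DATETIME
);

INSERT INTO table_name (id, other_data, update_time) VALUES
    (1, 'a', '2020-01-01 10:00:00'),
    (2, 'b', '2020-01-01 10:05:00'),
    (3, 'c', '2020-01-01 10:20:00');

WITH cte_name AS (
    SELECT
        ROW_NUMBER() OVER (ORDER BY update_time) AS row_num,
        id,
        other_data,
        update_time
    FROM table_name
)
SELECT
    t2.id,
    t2.other_data,
    -- t1 is the previous row; it is NULL for the first row, so time_taken is NULL there.
    TIMEDIFF(t2.update_time, t1.update_time) AS time_taken
FROM
    cte_name t2
    LEFT JOIN cte_name t1 ON t1.row_num = t2.row_num - 1
ORDER BY t2.row_num;
```

Every row of the CTE now appears once, and the earliest row simply has a NULL time_taken instead of disappearing.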