"Get the first row of each day" SQL 查询



我是SQL Server数据库的最终用户,该数据库具有按日期排序的多行。让我们以这个数据库为例:

金额 日期
23.5 20210512010220111
24 20210512020220111
30 2021051203021
1.2 20210513010111
1000 2021051301
24 20210514100220111
240 20210514100220111

尝试影响

SELECT t.yyyy_mm_dd_date, table.*
FROM table
JOIN (
SELECT SUBSTRING (Date, 1, 8) as yyyy_mm_dd_date, MIN(Date) as min_date
FROM table
) t
ON t.min_date = table.Date

一般来说,我发现SQL查询在使用联接和聚合时运行得很快(尤其是在它们的索引上(,所以如果可以将查询转换为使用这些查询,我认为通常应该运行相当快的

以下查询可能比原始查询性能更好,因为它们使用存在性检查(或top子句(,而不是读取所有数据,计算所有行的row_number,然后扫描结果。如果[Date]列上已经有索引,它将表现得最好。

由于有重复的[Date]值,除非查询中包含唯一的键列,否则每次执行查询时都可能返回不同的结果。

create table  #t(
[Id] int
,[Amount] decimal(10,2)
,[Date] bigint
);
create index idx_Date on #t([Date]);
insert #t values
(1, 23.5, 20210512010220111)
,(2, 24, 20210512020220111)
,(3, 30, 20210512030220111)
,(4, 1.2, 20210513011020111)
,(5, 1000, 20210513020220111)
,(6, 24, 20210514100220111)
,(7, 240, 20210514100220111);

-- Assuming that you have a unique key available
select
*
,t1.[Date]/1000000000
from #t t1
where not exists (
select *
from #t t2
where t1.[Date]/1000000000 = t2.[Date]/1000000000
and (
t2.[Date] < t1.[Date]
or (
t2.[Date] = t1.[Date]
and t2.Id < t1.Id
)
)
);
--This is a kludge if you don't have a unique key available and may perform worse than your original query. Don't use this without testing it in a non production system first.
select
*
,t1.[Date]/1000000000
from #t t1
where not exists (
select *
from #t t2
where t1.[Date]/1000000000 = t2.[Date]/1000000000
and (
t2.[Date] < t1.[Date]
or (
t2.[Date] = t1.[Date]
and t2.%%physloc%% < t1.%%physloc%% -- %%physloc%% is the File/Page/Slot for the row
)
)
);
-- Alternatively using top. Assumes a unique column is available
select
t1.*
,t1.[Date]/1000000000
from #t t1
cross apply (
select top 1 *
from #t t2
where t1.[Date]/1000000000 = t2.[Date]/1000000000
order by Date, Id
) t2
where t1.Id = t2.Id
drop table #t;

感觉你可以在这里使用cte,并将日期字符串转换为实际的date值(最后一个查询(这是我使用测试数据的示例(你没有发布列类型,所以我猜是这样(

并非100%清楚";日期";列,如果它是一个实际的日期时间,您可以直接转换为日期

DECLARE @mytable TABLE (
Amount NUMERIC(10,2) NOT NULL,  
[Date] VARCHAR(30) NOT NULL
);
INSERT INTO @mytable(Amount,[Date])
VALUES
(3.5, '20210512010220111'),
(24.0,  '20210512020220111'),
(30.0,'20210512030220111'),
(1.2,   '20210513011020111'),
(1000.0,    '20210513020220111'),
(24.0,  '20210514100220111'),
(240.0, '20210514100220111')
;
SELECT 
[Amount], 
MAX(CAST( LEFT([Date], 8)  AS DATE)) AS NewDate
FROM @mytable
GROUP BY AMOUNT 
ORDER BY MAX(CAST( LEFT([Date], 8)  AS DATE)) DESC;

/* this is what we want perhaps: */
;
WITH cte AS (
SELECT
Amount,
CAST(LEFT([Date], 8) AS DATE) AS MyDate,
ROW_NUMBER() OVER(PARTITION BY CAST(LEFT([Date], 8) AS DATE) ORDER BY CAST(LEFT([Date], 8) AS DATE) DESC) AS row_number
FROM @mytable
)
SELECT
*
FROM cte
WHERE row_number = 1;

最新更新