SQL:直接从联接中获取top元素



我很有兴趣了解我们是否可以避免内部查询,并在联接之前使用联接来获取另一个表的顶部元素。

例如,举一个例子;学生";另一个表表示";student_ marks"学生在个别科目上的得分。现在,我想列出每个学生在所有科目中的最高分数。

student
-----------------------------
| student_id | name   | Age |
-----------------------------
| S1         | Biden  | 15  |
| S2         | Jordan | 16  |
-----------------------------
student_marks
-------------------------------------
| student_id | subject      | marks |
-------------------------------------
| S1         | Geology      | 80    |
| S1         | Trigonometry | 90    |
| S2         | Geography    | 70    |
| S2         | Geology      | 75    |
-------------------------------------

期望的结果如下:

----------------------------------------------------
| student_id | name   | age | subject      | marks |
----------------------------------------------------
| S1         | Biden  | 15  | Trigonometry | 90    |
| S2         | Jordan | 16  | Geology      | 75    |
----------------------------------------------------

由于我使用MySql,我不能与子句一起使用

我的尝试是

select * from
student s
inner join (
select student_id, max(marks) as marks from student_marks group by student_id
) max_student_marks sm
on s.student_id = sm.student_id;

PS:上面的方法确实给出了所需的结果(没有主题列(,但随着数据量的增加,运行速度非常慢。因此,是否有更好的方法来利用联接,并从用于联接的辅助表中选择顶部元素。

您可以使用row_number():

select *
from student s inner join 
(select sm.*,
row_number() over (partition by student_id order by marks desc) as seqnum
from student_marks sm
)
on s.student_id = sm.student_id and sm.seqnum = 1;

你几乎做到了:

select * from
student s
inner join student_marks sm1
on s.student_id = sm1.student_id
inner join (
select student_id, max(marks) as marks from student_marks group by student_id
) sm2
on sm1.student_id = sm2.student_id and sm1.marks = sm2.marks;

然后,您可以使用sm1从具有最大标记的相应行中获取其他列。

请注意,如果student_marks中有多行与标记的最大值匹配,则可能会发现关联。

这就是窗口函数如此有用的原因。

我在MySQL 8.0之前使用的另一个解决方案涉及";"平局决胜局";柱在与最大标记匹配的行集中,任何保证不同的列都可以工作,但主键是典型的选择。假设该表的主键是id

select * from
student s
inner join student_marks sm1 on s.student_id = sm1.student_id
left outer join student_marks sm2 on s.student_id = sm2.student_id 
and (sm1.marks < sm2.marks or sm1.marks = sm2.marks and sm1.id < sm2.id)
where sm2.student_id IS NULL;

其工作方式是检查是否存在具有更大标记的行sm2,或者如果不存在更大标记,则id更大。如果不存在这样的行,则sm1必须是具有最大标记的行。当OUTER JOIN为sm2的列返回NULL时,就会发生这种情况。

如果您不关心关系,只需删除id:的术语,此解决方案也适用

select * from
student s
inner join student_marks sm1 on s.student_id = sm1.student_id
left outer join student_marks sm2 on s.student_id = sm2.student_id 
and sm1.marks < sm2.marks
where sm2.student_id IS NULL;

使用NOT EXISTS获得每个学生的最高marks

select sm.*
from student_marks sm
where not exists (select 1 from student_marks where student_id = sm.student_id and marks > sm.marks)

然后加入表student:

select s.*, t.subject, t.marks
from student s
inner join (
select sm.*
from student_marks sm
where not exists (select 1 from student_marks where student_id = sm.student_id and marks > sm.marks)
) t on t.student_id = s.student_id 

请参阅演示
结果:

| student_id | name   | Age | subject      | marks |
| ---------- | ------ | --- | ------------ | ----- |
| S1         | Biden  | 15  | Trigonometry | 90    |
| S2         | Jordan | 16  | Geology      | 75    |

最新更新