这两个表来自所谓的类似Twitter的数据库的一部分,用户可以在其中关注其他用户。User.name字段是唯一的。
mysql> select uID, name from User;
+-----+-------------------+
| uID | name |
+-----+-------------------+
| 1 | Alice |
| 2 | Bob |
| 5 | Iron Maiden |
| 4 | Judas Priest |
| 6 | Lesser Known Band |
| 3 | Metallica |
+-----+-------------------+
6 rows in set (0.00 sec)
mysql> select * from Follower;
+-----------+------------+
| subjectID | observerID |
+-----------+------------+
| 3 | 1 |
| 4 | 1 |
| 5 | 1 |
| 6 | 1 |
| 3 | 2 |
| 4 | 2 |
| 5 | 2 |
+-----------+------------+
7 rows in set (0.00 sec)
mysql> call newFollowSuggestionsForName('Bob');
+-------------------+
| name |
+-------------------+
| Lesser Known Band |
+-------------------+
1 row in set (0.00 sec)
我想做一个操作,为用户X建议一个他们可能感兴趣的用户列表。我认为一种启发式方法可以为用户y所关注的所有y显示X,其中X和y至少关注3个相同的用户。下面是我为此而想出的SQL。我的问题是,是否可以通过其他方式更有效或更好地完成这项工作。
DELIMITER //
CREATE PROCEDURE newFollowSuggestionsForName(IN in_name CHAR(60))
BEGIN
DECLARE xuid INT;
SET xuid = (select uID from User where name=in_name);
select name
from User, (select subjectID
from follower
where observerID in (
select observerID
from Follower
where observerID<>xuid and subjectID in (select subjectID from Follower where observerID=xuid)
group by observerID
having count(*)>=3
)
) as T
where uID = T.subjectID and not exists (select * from Follower where subjectID=T.subjectID and observerID=xuid);
END //
DELIMITER ;
考虑在存储过程中使用以下重构的SQL代码(未经测试,没有数据(。
select u.`name`
from `User` u
inner join
(select subf.observerID, subf.subjectID
from follower subf
where subf.observerID <> xuid
) f
on u.UID = f.subjectID
inner join
(select f1.observerID
from follower f1
inner join follower f2
on f1.subjectID = f2.subjectID
and f1.observerID <> xuid
and f2.observerID = xuid
group by f1.observerID
having count(*) >= 3
) o
on f.observerID = o.observerID
我认为基本查询从获取所有与给定观察者共享三个"主题"的"观察者"开始:
select f.observerid
from followers f join
followers f2
on f.subjectid = f2.subjectid and
f2.observerid = 2
group by f.observerid
having count(*) = 3;
查询的其余部分只是加入名称,以适应使用名称作为引用而不是id的范例。