I have the following MySQL SELECT statement, which works fine on a small dataset but dies when the volume increases:
SELECT DISTINCT Bookings.BookingId, Bookings.ResortId, Bookings.WeekBeginning, Bookings.DepartDate, Bookings.CancelledDate,Clients.FirstName, Clients.LastName, Clients.Email, Clients.Address1, Clients.City, Clients.State, Clients.CountryId, Clients.ClientType, Countries.Country, BookingAccommodation.AccomId, BookingAccommodation.ShareType, BookingProgram.ProgramId, Programs.ProgramDesc
FROM Bookings, Clients, BookingProgram, BookingAccommodation, Countries, ClientType, Programs
WHERE Bookings.BookingId = BookingProgram.BookingId
AND Bookings.BookingId = BookingAccommodation.BookingId
AND Bookings.WeekBeginning >= '2016-10-01'
AND BookingAccommodation.Nights > 0
AND Clients.ClientId = Bookings.ClientId
AND Clients.Email <> ''
AND Clients.CountryId = Countries.CountryId
AND Programs.ProgramId = BookingProgram.ProgramId
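There are about 10K records in Bookings and 25K each in BookingAccommodation and BookingProgram, so the volume is not large, yet the query ran for 950 seconds. I am running the query in the SQL window of phpMyAdmin on a local MAMP server.
If I split it into 3 queries, each one returns its results in a fraction of a second: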
SELECT DISTINCT Bookings.BookingId, Bookings.ResortId, Bookings.WeekBeginning, Bookings.DepartDate, Bookings.CancelledDate, Clients.FirstName, Clients.LastName, Clients.Email, Clients.Address1, Clients.City, Clients.State, Clients.CountryId, Clients.ClientType, Countries.Country
FROM Bookings, Clients, Countries, ClientType
WHERE Bookings.WeekBeginning >= '2016-10-01'
AND Clients.ClientId = Bookings.ClientId
AND Clients.Email <> ''
AND Clients.CountryId = Countries.CountryId
SELECT DISTINCT Bookings.BookingId, BookingAccommodation.AccomId, BookingAccommodation.ShareType
FROM Bookings, BookingAccommodation
WHERE Bookings.BookingId = BookingAccommodation.BookingId
AND Bookings.WeekBeginning >= '2016-10-01'
AND BookingAccommodation.Nights > 0
SELECT DISTINCT Bookings.BookingId, BookingProgram.ProgramId, Programs.ProgramDesc
FROM Bookings, BookingProgram, Programs
WHERE Bookings.BookingId = BookingProgram.BookingId
AND Bookings.WeekBeginning >= '2016-10-01'
AND Programs.ProgramId = BookingProgram.ProgramId
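For each record in Bookings there are multiple records in BookingAccommodation and BookingProgram, but I only need one record from each, hence the SELECT DISTINCT.
- The primary key of Bookings is BookingId.
- The primary key of BookingAccommodation is (BookingId, AccomDate, AccomId).
- The primary key of BookingProgram is (BookingId, ProgramId, AccomType).
I have tried rewriting the query with joins and subqueries, but I am clearly not getting it right. How can I combine these 3 queries back into one query that performs well?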
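These are the basics of using subqueries instead of joins (MySQL assumed, FWIW). Apologies for the pseudocode; I thought it was important to answer quickly, because this thread is one of the top hits for this problem, which I just ran into myself.
A client books a cruise. The client should also state their diet (e.g. vegan, vegetarian, no soy). So we have three tables:
Bookings: Booking_Id, Booking_Date, Booking_Time, Client_Id
Clients: Client_Id, Client_Name, Client_Phone, Client_DietId
Diets: Diet_Id, Diet_Name
We now want to show the concierge a complete view of a booking.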
Using joins:
SELECT Bookings.Booking_Id, Bookings.Booking_Date, Bookings.Booking_Time, Clients.Client_Name, Diets.Diet_Name
FROM Bookings
INNER JOIN Clients
ON Bookings.Client_Id = Clients.Client_Id
INNER JOIN Diets
ON Clients.Client_DietId = Diets.Diet_Id
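To try the join query above, a minimal schema along these lines should be enough. Only the table and column names come from the answer; the column types, constraints, and sample rows are assumptions made for illustration.

```sql
-- Hypothetical column types and sample data; only the names come from the answer.
CREATE TABLE Diets (
    Diet_Id   INT PRIMARY KEY,
    Diet_Name VARCHAR(50) NOT NULL
);

CREATE TABLE Clients (
    Client_Id     INT PRIMARY KEY,
    Client_Name   VARCHAR(100) NOT NULL,
    Client_Phone  VARCHAR(30),
    Client_DietId INT NOT NULL,
    FOREIGN KEY (Client_DietId) REFERENCES Diets (Diet_Id)
);

CREATE TABLE Bookings (
    Booking_Id   INT PRIMARY KEY,
    Booking_Date DATE NOT NULL,
    Booking_Time TIME NOT NULL,
    Client_Id    INT NOT NULL,
    FOREIGN KEY (Client_Id) REFERENCES Clients (Client_Id)
);

-- One diet per client, one client per booking.
INSERT INTO Diets    VALUES (1, 'Vegan'), (2, 'Vegetarian');
INSERT INTO Clients  VALUES (10, 'Alice Example', '555-0100', 1);
INSERT INTO Bookings VALUES (100, '2016-10-01', '18:30:00', 10);
```

With these rows the join query returns a single row: booking 100 for Alice Example with the diet Vegan.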
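I think of those separate joins as creating "temp tables". That may or may not be an accurate description of the underlying implementation, but the interesting point (discussed in other threads) is that subqueries can sometimes be faster than one large join.
In the example above I want to do the joins separately:
First, I join the clients to their diets, then I join that "table" to the bookings.
So I end up with this (note the table (re)naming when referencing the subquery):
SELECT [RELEVANT FIELDS HERE ETC]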
FROM
(SELECT Clients.Client_Id, Clients.Client_Name, Diets.Diet_Name
FROM Clients
INNER JOIN Diets
ON Clients.Client_DietId = Diets.Diet_Id)
AS ClientDetailsWithDiets
INNER JOIN Bookings
ON Bookings.Client_Id = ClientDetailsWithDiets.Client_Id
Now, if another table is to be joined, say the Staff assigned to a particular booking, the whole thing above gets nested again, for example:
SELECT [RELEVANT FIELDS HERE ETC]
FROM
(SELECT [RELEVANT FIELDS HERE ETC]
FROM
(SELECT Clients.Client_Id, Clients.Client_Name, Diets.Diet_Name
FROM Clients
INNER JOIN Diets
ON Clients.Client_DietId = Diets.Diet_Id)
AS ClientDetailsWithDiets
INNER JOIN Bookings
ON Bookings.Client_Id = ClientDetailsWithDiets.Client_Id)
AS BookingDetailsFull
INNER JOIN Staff
ON BookingDetailsFull.Booking_Id = Staff.Booking_Id_Assigned
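A filled-in version of the nested form might look like the sketch below. The column lists replace the placeholders with columns from the three tables described earlier; `Staff.Staff_Name` is a hypothetical column, and the sketch joins Bookings to the client subquery on `Client_Id`, on the assumption that this is the column the two are meant to share.

```sql
SELECT BookingDetailsFull.Booking_Id,
       BookingDetailsFull.Booking_Date,
       BookingDetailsFull.Client_Name,
       BookingDetailsFull.Diet_Name,
       Staff.Staff_Name                      -- hypothetical column
FROM
    -- Level 2: bookings joined to the client/diet subquery.
    (SELECT Bookings.Booking_Id,
            Bookings.Booking_Date,
            ClientDetailsWithDiets.Client_Name,
            ClientDetailsWithDiets.Diet_Name
     FROM
        -- Level 1: each client with the name of their diet.
        (SELECT Clients.Client_Id, Clients.Client_Name, Diets.Diet_Name
         FROM Clients
         INNER JOIN Diets
            ON Clients.Client_DietId = Diets.Diet_Id)
        AS ClientDetailsWithDiets
     INNER JOIN Bookings
        ON Bookings.Client_Id = ClientDetailsWithDiets.Client_Id)
    AS BookingDetailsFull
INNER JOIN Staff
    ON BookingDetailsFull.Booking_Id = Staff.Booking_Id_Assigned;
```

Each subquery has to select every column that the level above it refers to, which is why `Booking_Id` is carried out of the second level. Whether this form runs faster than flat joins depends on the MySQL version and its optimizer, so it is worth comparing the plans rather than assuming.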
Try changing it to:
SELECT DISTINCT Bookings.BookingId, Bookings.ResortId,
Bookings.WeekBeginning, Bookings.DepartDate, Bookings.CancelledDate,
Clients.FirstName, Clients.LastName, Clients.Email, Clients.Address1,
Clients.City, Clients.State, Clients.CountryId, Clients.ClientType, Countries.Country,
BookingAccommodation.AccomId, BookingAccommodation.ShareType, BookingProgram.ProgramId,
Programs.ProgramDesc
FROM Bookings
JOIN Clients ON Clients.ClientId = Bookings.ClientId
JOIN BookingProgram ON Bookings.BookingId = BookingProgram.BookingId
JOIN BookingAccommodation ON Bookings.BookingId = BookingAccommodation.BookingId
JOIN Countries ON Clients.CountryId = Countries.CountryId
JOIN Programs ON Programs.ProgramId = BookingProgram.ProgramId
WHERE Bookings.WeekBeginning >= '2016-10-01'
AND Clients.Email <> ''
AND BookingAccommodation.Nights > 0;
If that does not give you the result you want, run it with EXPLAIN and look at the query plan.
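As a rough illustration of that step, here is EXPLAIN applied to one of the smaller queries from the question. The index at the end is a hypothetical suggestion (its name is made up) and is only worth trying if the plan shows a full scan of Bookings.

```sql
-- Prefixing a statement with EXPLAIN shows the plan instead of running the query.
EXPLAIN
SELECT DISTINCT Bookings.BookingId, BookingAccommodation.AccomId, BookingAccommodation.ShareType
FROM Bookings
JOIN BookingAccommodation
    ON BookingAccommodation.BookingId = Bookings.BookingId
WHERE Bookings.WeekBeginning >= '2016-10-01'
  AND BookingAccommodation.Nights > 0;

-- Hypothetical index to support the date filter on Bookings.
CREATE INDEX idx_bookings_weekbeginning ON Bookings (WeekBeginning);
```

In the output, the `type`, `key`, `rows`, and `Extra` columns are the ones to read first: `type = ALL` means a full table scan, an empty `key` means no index was chosen, and `Using temporary` in `Extra` often accompanies DISTINCT over a large intermediate result.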
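Please note: I do not see the ClientType table used anywhere, so I left it out of the joins. Listing it in FROM without a join condition makes it a cross join, which multiplies the rows that DISTINCT then has to collapse.
Rather than spend more time trying to improve the SELECT statement, given how many tables it touches, I chose to split it into the separate queries I outlined in the original question.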
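For anyone who still wants a single statement, one possible approach is to reduce each child table to one row per booking in a derived table before joining, so the two one-to-many joins no longer multiply each other. This is only a sketch under assumptions: it picks the smallest AccomId and ProgramId per booking as "the one record", which the question does not specify, and it leaves out ShareType because picking it independently could pair values from different rows. Note that it returns one row per booking, which is not the same result as the original SELECT DISTINCT.

```sql
SELECT b.BookingId, b.ResortId, b.WeekBeginning, b.DepartDate, b.CancelledDate,
       c.FirstName, c.LastName, c.Email, c.Address1, c.City, c.State,
       c.CountryId, c.ClientType, co.Country,
       a.AccomId, p.ProgramId, pr.ProgramDesc
FROM Bookings AS b
JOIN Clients   AS c  ON c.ClientId   = b.ClientId
JOIN Countries AS co ON co.CountryId = c.CountryId
-- One accommodation row per booking (assumption: the smallest AccomId will do).
JOIN (
    SELECT BookingId, MIN(AccomId) AS AccomId
    FROM BookingAccommodation
    WHERE Nights > 0
    GROUP BY BookingId
) AS a ON a.BookingId = b.BookingId
-- One program row per booking (assumption: the smallest ProgramId will do).
JOIN (
    SELECT BookingId, MIN(ProgramId) AS ProgramId
    FROM BookingProgram
    GROUP BY BookingId
) AS p ON p.BookingId = b.BookingId
JOIN Programs AS pr ON pr.ProgramId = p.ProgramId
WHERE b.WeekBeginning >= '2016-10-01'
  AND c.Email <> '';
```

Because each derived table has at most one row per BookingId, the outer query no longer needs DISTINCT.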