How to improve the performance of a SELECT that joins multiple tables



I have the following MySQL SELECT statement. It works fine on a small data set but grinds to a halt as the volume grows:

SELECT DISTINCT Bookings.BookingId, Bookings.ResortId, Bookings.WeekBeginning, Bookings.DepartDate, Bookings.CancelledDate,Clients.FirstName, Clients.LastName, Clients.Email, Clients.Address1, Clients.City, Clients.State, Clients.CountryId, Clients.ClientType, Countries.Country, BookingAccommodation.AccomId, BookingAccommodation.ShareType, BookingProgram.ProgramId, Programs.ProgramDesc
    FROM Bookings, Clients, BookingProgram, BookingAccommodation, Countries, ClientType, Programs
    WHERE Bookings.BookingId = BookingProgram.BookingId
       AND Bookings.BookingId = BookingAccommodation.BookingId
       AND Bookings.WeekBeginning >= '2016-10-01' 
       AND BookingAccommodation.Nights > 0
       AND Clients.ClientId = Bookings.ClientId
       AND Clients.Email <> ''
       AND Clients.CountryId = Countries.CountryId
       AND Programs.ProgramId = BookingProgram.ProgramId

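There are about 10K records in Bookings and about 25K each in BookingAccommodation and BookingProgram, so the volume is not large, yet the query ran for 950 seconds. I am running the query in the SQL window of phpMyAdmin on a local MAMP server.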

If I split it into 3 queries, each one returns its results in a fraction of a second:

SELECT DISTINCT Bookings.BookingId, Bookings.ResortId, Bookings.WeekBeginning, Bookings.DepartDate, Bookings.CancelledDate, Clients.FirstName, Clients.LastName, Clients.Email, Clients.Address1, Clients.City, Clients.State, Clients.CountryId, Clients.ClientType, Countries.Country
     FROM Bookings, Clients, Countries, ClientType
    WHERE Bookings.WeekBeginning >= '2016-10-01' 
       AND Clients.ClientId = Bookings.ClientId
       AND Clients.Email <> ''
       AND Clients.CountryId = Countries.CountryId;

SELECT DISTINCT Bookings.BookingId, BookingAccommodation.AccomId, BookingAccommodation.ShareType
    FROM Bookings, BookingAccommodation
    WHERE Bookings.BookingId = BookingAccommodation.BookingId
       AND Bookings.WeekBeginning >= '2016-10-01'
       AND BookingAccommodation.Nights > 0;

SELECT DISTINCT Bookings.BookingId, BookingProgram.ProgramId, Programs.ProgramDesc
    FROM Bookings, BookingProgram, Programs
    WHERE Bookings.BookingId = BookingProgram.BookingId
       AND Bookings.WeekBeginning >= '2016-10-01'
       AND Programs.ProgramId = BookingProgram.ProgramId;

For each record in Bookings there are multiple records in BookingAccommodation and BookingProgram, but I only need one record from each, hence the SELECT DISTINCT.

  • The primary key of Bookings is BookingId.
  • The primary key of BookingAccommodation is BookingId, AccomDate, AccomId.
  • The primary key of BookingProgram is BookingId, ProgramId, AccomType.

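I have tried rewriting the query with joins and subqueries, but I am obviously not getting it right. How can I combine these 3 queries back into a single query that performs well?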

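These are the basics of using subqueries instead of joins (assuming MySQL, FWIW). Apologies for the pseudocode; I thought it was important to answer quickly, because this thread is one of the top hits for this problem, which I just ran into myself.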

A client books a cruise. The client should also state their diet (e.g. vegetarian, vegan, no soy). So we have three tables:


Bookings: Booking_Id, Booking_Date, Booking_Time, Client_Id

Clients: Client_Id, Client_Name, Client_Phone, Client_DietId

Diets: Diet_Id, Diet_Name
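
For anyone who wants to run the queries below, here is a minimal sketch of these three tables. Only the table and column names come from the description above; the column types, lengths and foreign keys are assumptions.

```sql
-- Assumed types and constraints; only the names are taken from the answer.
CREATE TABLE Diets (
    Diet_Id   INT PRIMARY KEY,
    Diet_Name VARCHAR(50) NOT NULL
);

CREATE TABLE Clients (
    Client_Id     INT PRIMARY KEY,
    Client_Name   VARCHAR(100) NOT NULL,
    Client_Phone  VARCHAR(30),
    Client_DietId INT,
    FOREIGN KEY (Client_DietId) REFERENCES Diets (Diet_Id)
);

CREATE TABLE Bookings (
    Booking_Id   INT PRIMARY KEY,
    Booking_Date DATE NOT NULL,
    Booking_Time TIME NOT NULL,
    Client_Id    INT NOT NULL,
    FOREIGN KEY (Client_Id) REFERENCES Clients (Client_Id)
);
```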

We now want to show the concierge a complete view of the bookings.

Using "joins":

SELECT Bookings.Booking_Id, Bookings.Booking_Date, Bookings.Booking_Time, Clients.Client_Name, Diets.Diet_Name FROM Bookings INNER JOIN Clients ON Bookings.Client_Id = Clients.Client_Id INNER JOIN Diets ON Clients.Client_DietId = Diets.Diet_Id

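Using subqueries:

I think of it as creating "temporary tables" out of those individual joins. "Temporary table" may or may not be an accurate description of the underlying implementation, but interestingly, subqueries can sometimes be faster than one large join (see other threads).

In the example above I want to do the joins separately:

First I need to join the clients to their diets, and then I join that "table" to the bookings.

So I end up with this (note the table (re)naming when referencing the subquery):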

SELECT Bookings.Booking_Id, Bookings.Booking_Date, Bookings.Booking_Time,
       ClientDetailsWithDiets.Client_Name, ClientDetailsWithDiets.Diet_Name
FROM (
    SELECT Clients.Client_Id, Clients.Client_Name, Diets.Diet_Name
    FROM Clients
    INNER JOIN Diets ON Clients.Client_DietId = Diets.Diet_Id
) AS ClientDetailsWithDiets
INNER JOIN Bookings ON Bookings.Client_Id = ClientDetailsWithDiets.Client_Id

Now, if another table needs to be joined, say Staff assigned to a particular Booking, the whole of the above gets nested, and so on. For example:

SELECT BookingDetailsFull.*, Staff.*
FROM (
    SELECT Bookings.Booking_Id, Bookings.Booking_Date, Bookings.Booking_Time,
           ClientDetailsWithDiets.Client_Name, ClientDetailsWithDiets.Diet_Name
    FROM (
        SELECT Clients.Client_Id, Clients.Client_Name, Diets.Diet_Name
        FROM Clients
        INNER JOIN Diets ON Clients.Client_DietId = Diets.Diet_Id
    ) AS ClientDetailsWithDiets
    INNER JOIN Bookings ON Bookings.Client_Id = ClientDetailsWithDiets.Client_Id
) AS BookingDetailsFull
INNER JOIN Staff ON BookingDetailsFull.Booking_Id = Staff.Booking_Id_Assigned
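
Applied to the tables in the question, the same derived-table idea might look like the sketch below. It reduces BookingAccommodation and BookingProgram to one row per booking before joining, so DISTINCT is no longer needed. Using MIN() to pick "one" row is an assumption on my part, as are the short aliases; substitute whatever selection rule actually fits.

```sql
-- Sketch only: one accommodation and one program per booking.
-- MIN() is an arbitrary way of choosing a single row per BookingId.
SELECT b.BookingId, b.ResortId, b.WeekBeginning,
       c.FirstName, c.LastName, c.Email,
       a.AccomId, p.ProgramId, pr.ProgramDesc
FROM Bookings AS b
INNER JOIN Clients AS c ON c.ClientId = b.ClientId
INNER JOIN (
    -- Collapse accommodation rows to one per booking.
    SELECT BookingId, MIN(AccomId) AS AccomId
    FROM BookingAccommodation
    WHERE Nights > 0
    GROUP BY BookingId
) AS a ON a.BookingId = b.BookingId
INNER JOIN (
    -- Collapse program rows to one per booking.
    SELECT BookingId, MIN(ProgramId) AS ProgramId
    FROM BookingProgram
    GROUP BY BookingId
) AS p ON p.BookingId = b.BookingId
INNER JOIN Programs AS pr ON pr.ProgramId = p.ProgramId
WHERE b.WeekBeginning >= '2016-10-01'
  AND c.Email <> '';
```

Because each derived table returns at most one row per BookingId, the outer join cannot multiply the booking rows the way the original query does.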

Try changing it to:

SELECT DISTINCT Bookings.BookingId, Bookings.ResortId, 
Bookings.WeekBeginning, Bookings.DepartDate, Bookings.CancelledDate,
Clients.FirstName, Clients.LastName, Clients.Email, Clients.Address1, 
Clients.City, Clients.State, Clients.CountryId, Clients.ClientType, Countries.Country,
BookingAccommodation.AccomId, BookingAccommodation.ShareType, BookingProgram.ProgramId,
Programs.ProgramDesc
    FROM Bookings
    JOIN Clients ON Clients.ClientId = Bookings.ClientId AND Clients.Email <> ''
    JOIN BookingProgram ON Bookings.BookingId = BookingProgram.BookingId
    JOIN BookingAccommodation ON Bookings.BookingId = BookingAccommodation.BookingId AND BookingAccommodation.Nights > 0
    JOIN Countries ON Clients.CountryId = Countries.CountryId
    JOIN Programs ON Programs.ProgramId = BookingProgram.ProgramId
    WHERE Bookings.WeekBeginning >= '2016-10-01';

If that does not give you the result you want, run it with EXPLAIN and look at the query plan.
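
As a rough sketch of that step, EXPLAIN goes in front of the SELECT. The shortened query below just keeps the example small, and the index names are made up; whether these indexes help depends on what the plan actually shows.

```sql
-- Prefix the query with EXPLAIN to see how MySQL plans to execute it.
EXPLAIN
SELECT Bookings.BookingId, Clients.Email
FROM Bookings
JOIN Clients ON Clients.ClientId = Bookings.ClientId
WHERE Bookings.WeekBeginning >= '2016-10-01';

-- If the plan shows type = ALL (a full table scan) for Bookings,
-- indexes along these lines might help. The index names are hypothetical.
CREATE INDEX idx_bookings_weekbeginning ON Bookings (WeekBeginning);
CREATE INDEX idx_bookings_clientid ON Bookings (ClientId);
```

In the output, the `type`, `key`, `rows` and `Extra` columns are usually the most informative: a large `rows` estimate with no `key` chosen points to a table that is being scanned in full.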

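Please note: I don't see the ClientType table being used anywhere, so I did not include it in the joins. In the original query it is listed in FROM with no join condition, which makes it a cross join: every result row is repeated once per ClientType row before DISTINCT removes the duplicates. That is likely a large part of the slowdown.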
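
A quick way to see the effect of listing a table in FROM without a join condition is to compare row counts. This is only an illustration using the table names from the question:

```sql
-- Rows in Bookings alone.
SELECT COUNT(*) FROM Bookings;

-- Rows when ClientType is listed without a join condition:
-- every booking is paired with every ClientType row.
SELECT COUNT(*) FROM Bookings, ClientType;
```

The second count equals the first multiplied by the number of rows in ClientType, and the same multiplication applies to every intermediate row in the full query.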

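Rather than spend more time trying to improve the SELECT statement, given how many tables it touches, I chose to split it into the separate queries I outlined in the original question.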
