我正在尝试改进一个执行以下操作的查询:
对于每项工作,将所有成本相加,将发票金额相加,然后计算利润/损失。成本来自几个不同的表格,例如采购订单、用户事件(工程师分配的时间/他在现场花费的时间)、使用的库存等。
查询还需要输出一些其他列,如工作的站点名称,以便该列可以按排序(所有这些之后都会附加ORDER by)。
SELECT
jobs.job_id,
jobs.start_date,
jobs.end_date,
events.time,
sites.name site,
IFNULL(stock_cost,0) stock_cost,
labour,
materials,
labour+materials+plant+expenses revenue,
(labour+materials+plant)-(time*3557/360000+IFNULL(orders_cost,0)+IFNULL(stock_cost,0)) profit,
((labour+materials+plant)-(time*3557/360000+IFNULL(orders_cost,0)+IFNULL(stock_cost,0)))/(time*3557/360000+IFNULL(orders_cost,0)+IFNULL(stock_cost,0)) ratio
FROM
jobs
LEFT JOIN (
SELECT
job_id,
SUM(labour_charge) labour,
SUM(materials_charge) materials,
SUM(plant_hire_charge) plant,
SUM(expenses) expenses
FROM invoices
GROUP BY job_id
ORDER BY NULL
) invoices USING(job_id)
LEFT JOIN (
SELECT
job_id,
SUM(IF(start_onsite && end_onsite,end_onsite-start_onsite,end-start)) time,
SUM(travel+parking+materials) user_expenses
FROM users_events
WHERE type='job'
GROUP BY job_id
ORDER BY NULL
) events USING(job_id)
LEFT JOIN (
SELECT
job_id,
SUM(IFNULL(total,0))*0.01 orders_cost
FROM purchaseorders
GROUP BY job_id
ORDER BY NULL
) purchaseorders USING(job_id)
LEFT JOIN (
SELECT
location job_id,
SUM(amount*cost))*0.01 stock_cost
FROM stock_location
LEFT JOIN stock_items ON stock_items.id=stock_location.stock_id
WHERE location>=3000 AND amount>0 AND cost>0
GROUP BY location
ORDER BY NULL
) stock USING(job_id)
LEFT JOIN contacts_sites sites ON sites.id=jobs.site_id;
我读到:http://dev.mysql.com/doc/refman/5.0/en/group-by-optimization.html但我不知道该如何/是否可以在其中应用任何东西。出于测试目的,我尝试在左、右和中间字段添加各种索引,但对EXPLAIN输出没有任何改进:
+----+-------------+----------------+--------+------------------------+---------+---------+------------------------------------+-------+-------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+----------------+--------+------------------------+---------+---------+------------------------------------+-------+-------------------------------+
| 1 | PRIMARY | jobs | ALL | NULL | NULL | NULL | NULL | 7088 | |
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 5038 | |
| 1 | PRIMARY | <derived3> | ALL | NULL | NULL | NULL | NULL | 6476 | |
| 1 | PRIMARY | <derived4> | ALL | NULL | NULL | NULL | NULL | 904 | |
| 1 | PRIMARY | <derived5> | ALL | NULL | NULL | NULL | NULL | 531 | |
| 1 | PRIMARY | sites | eq_ref | PRIMARY | PRIMARY | 4 | bestbee_db.jobs.site_id | 1 | |
| 5 | DERIVED | stock_location | ALL | stock,location,amount,…| NULL | NULL | NULL | 5426 | Using where; Using temporary; |
| 5 | DERIVED | stock_items | eq_ref | PRIMARY | PRIMARY | 4 | bestbee_db.stock_location.stock_id | 1 | Using where |
| 4 | DERIVED | purchaseorders | ALL | NULL | NULL | NULL | NULL | 1445 | Using temporary; |
| 3 | DERIVED | users_events | ALL | type,type_job | NULL | NULL | NULL | 11295 | Using where; Using temporary; |
| 2 | DERIVED | invoices | ALL | NULL | NULL | NULL | NULL | 5320 | Using temporary; |
+----+-------------+----------------+--------+------------------------+---------+---------+------------------------------------+-------+-------------------------------+
生成的行数为5 x 10^21(低于我开始优化此查询之前的3 x 10^42!)
目前执行需要7秒(低于26秒),但我希望它在1秒以下。
顺便说一句:GROUP By x ORDER By NULL是从子查询中消除不必要的文件端口的好方法!(来自http://www.mysqlperformanceblog.com/2006/09/04/group_concat-useful-group-by-extension/)
根据您对我的问题的评论,我会做以下。。。
在最顶端。。。
选择STRIGHT_JOIN(只需添加"STRIGH_JOIN"关键字)
然后,对于发票、事件、p/o等的每个子查询,将ORDER BY显式更改为JOB_ID,这样可能有助于针对主JOBS表联接进行优化。
最后,确保每个子查询表在Job_ID(Invoices、User_events、PurchaseOrders、Stock_Location)上都有索引
此外,对于Stock_Location表,您可能希望通过在
上设置复合索引来帮助子查询的WHERE子句(job_id,location,amount)即使您有key加3 where condition元素,三个字段的深度也应该足够了。