如何优化mysql等连接查询的速度



我已经将示例表上传到dropbox中
以表名余额加载到mysql中
用表名收入加载到mysql
显示数据结构:

describe balance;
+----------------------+--------+------+-----+---------+-------+
| Field                | Type   | Null | Key | Default | Extra |
+----------------------+--------+------+-----+---------+-------+
| code                 | text   | YES  |     | NULL    |       |
| report_date          | date   | YES  |     | NULL    |       |
| total_assets         | double | YES  |     | NULL    |       |
| total_owner_equities | double | YES  |     | NULL    |       |
+----------------------+--------+------+-----+---------+-------+
4 rows in set (0.00 sec)
describe income;
+--------------------------+--------+------+-----+---------+-------+
| Field                    | Type   | Null | Key | Default | Extra |
+--------------------------+--------+------+-----+---------+-------+
| code                     | text   | YES  |     | NULL    |       |
| report_date              | date   | YES  |     | NULL    |       |
| total_operating_revenue  | double | YES  |     | NULL    |       |
| net_profit               | double | YES  |     | NULL    |       |
| np_parent_company_owners | double | YES  |     | NULL    |       |
+--------------------------+--------+------+-----+---------+-------+
5 rows in set (0.01 sec)

两张表中都有一条小记录。

select count(*) from income;
+----------+
| count(*) |
+----------+
|    51250 |
+----------+
1 row in set (0.06 sec)
select count(*) from balance;
+----------+
| count(*) |
+----------+
|    50832 |
+----------+
1 row in set (0.05 sec)

执行相等连接查询还需要4分钟。

select a.code ,a.report_date ,a.total_assets ,a.total_owner_equities,  
b.total_operating_revenue,b.net_profit ,b.np_parent_company_owners 
from balance as a ,income  as b          
where   a.code=b.code and a.report_date=b.report_date;

50793 rows in set (4 min 5.15 sec)

在这种情况下,如何优化mysql等连接查询的速度?

说明

查询花费这么长时间的原因是它试图将第一个表中的每个记录与第二个表中每个记录进行匹配。51250*50832=2605140000张支票。

通过缩小范围,可以显著提高查询速度。


代码

select a.code ,a.report_date ,a.total_assets ,a.total_owner_equities,  
b.total_operating_revenue,b.net_profit ,b.np_parent_company_owners 
from balance as a ,income  as b          
where   a.code=b.code and a.report_date=b.report_date
and a.report_date between date1 and date2
and b.report_date between samedate1 and samedate2;

为了提高查询性能,您需要对表进行几项更改:

  1. 如果没有必要,请不要使用文本字段。所以我的意思是code字段可以改为VARCHAR而不是TEXT
  2. 为联接中使用的字段添加索引

所以你需要:

ALTER TABLE `TableA` 
CHANGE `code` VARCHAR(256),
ADD INDEX `code_idx` (`code`),
ADD INDEX `report_date_idx` (`report_date`);
ALTER TABLE `TableB` 
CHANGE `code` VARCHAR(256),
ADD INDEX `code_idx` (`code`),
ADD INDEX `report_date_idx` (`report_date`);

最新更新