我有一个相当复杂的MySQL查询正在另一个开发人员的数据库上运行。我正试图将他的数据复制到我们新的数据库结构中,所以我运行这个查询来获取要复制的数据。主表大约有45000行。
从下面的查询中可以看到,有很多字段来自几个不同的表。我的问题是字段Ref.refno
(作为ref_id
(在某些情况下被拉过两到三次。这是因为在表LandlordOnlineRef
(LLRef
(中,有时有多行具有相同的参考号——在这种情况下,因为行本应被编辑,但却被复制了。。。
以下是我尝试过的操作:-
SELECT DISTINCT(Ref.refno) [...]
-这对输出没有任何影响,尽管我认为它会停止选择重复的refno
ID-
这是MySQL的错误,还是我的错误-我还尝试将
GROUP BY ref_id
添加到查询的末尾。查询通常需要几毫秒才能运行,但当我将GROUP BY
添加到末尾时,它似乎会无限运行——我等了几分钟,但什么都没发生。我觉得它可能很难,因为我使用的是LIMIT 1000
,所以我也尝试了LIMIT 10
,但仍然得到了同样的效果
这是问题查询-谢谢!
SELECT
-- progress
Ref.refno AS ref_id,
Ref.tenantid AS tenant_id,
Ref.productid AS product_id,
Ref.guarantorid AS guarantor_id,
Ref.agentid AS agent_id,
Ref.companyid AS company_id,
Ref.status AS status,
Ref.startdate AS ref_start_date,
Ref.enddate AS ref_end_date,
-- ReferenceDetails
RefDetails.creditscore AS credit_score,
-- LandlordOnlineRef
LLRef.propaddress AS prev_ll_address,
LLRef.rent AS prev_ll_rent,
LLRef.startdate AS prev_ll_start_date,
LLRef.enddate AS prev_ll_end_date,
LLRef.arrears AS prev_ll_arrears,
LLRef.arrearsreason AS prev_ll_arrears_reason,
LLRef.propertycondition AS prev_ll_property_condition,
LLRef.conditionreason AS prev_ll_condition_reason,
LLRef.consideragain AS prev_ll_consider_again,
LLRef.completedby AS prev_ll_completed_by,
LLRef.contactno AS prev_ll_contact_no,
LLRef.landlordagent AS prev_ll_or_agent,
-- EmpDetails
EmpRef.cempname AS emp_name,
EmpRef.cempadd1 AS emp_address_1,
EmpRef.cempadd2 AS emp_address_2,
EmpRef.cemptown AS emp_address_town,
EmpRef.cempcounty AS emp_address_county,
EmpRef.cemppostcode AS emp_address_postcode,
EmpRef.ctelephone AS emp_telephone,
EmpRef.cemail AS emp_email,
EmpRef.ccontact AS emp_contact,
EmpRef.cgross AS emp_income,
EmpRef.cyears AS emp_years,
EmpRef.cmonths AS emp_months,
EmpRef.cposition AS emp_position,
-- EmpLlodReference
ELRef.lod_ref_status AS prev_ll_status,
ELRef.lod_ref_email AS prev_ll_email,
ELRef.lod_ref_tele AS prev_ll_telephone,
ELRef.emp_ref_status AS emp_status,
ELRef.emp_ref_tele AS emp_telephone,
ELRef.emp_ref_email AS emp_email
FROM ReferenceDetails AS RefDetails
LEFT JOIN progress AS Ref ON Ref.refno
LEFT JOIN LandlordOnlineRef AS LLRef ON LLRef.refno = Ref.refno
LEFT JOIN EmpLlodReference AS ELRef ON ELRef.refno = Ref.refno
LEFT JOIN EmpDetails AS EmpRef ON EmpRef.tenantid = Ref.tenantid
-- For testing purposes to speed things up, limit it to 1000 rows
LIMIT 1000
LEFT JOIN progress AS Ref ON Ref.refno
基本上会把它变成笛卡尔连接。您不是在进行显式比较,而是在说"连接所有有非null值的记录"。
不应该是吗
LEFT JOIN progress AS Ref ON Ref.refno = RefDetails.something
将所有选定的列放入
DISTINCT
中,用,
分隔。如果要保留重命名,请将另一个SELECT DISTINCT(*) FROM (YOUR_SELECT)
包裹起来。GROUP BY
子句中的列上是否有索引?在CCD_ 16之后施加CCD_。因此,限制不会影响查询运行时。
一般规则是永远不要按必须的列进行分组。在返回重复行的表上使用带有groupby的子查询来消除它们。
更改:
LEFT JOIN LandlordOnlineRef AS LLRef ON LLRef.refno = Ref.refno
至:
Left Join (select refno
, othercolumns you need
from LandlordOnlineRef
group by refno,othercolumn) as LLRef
不确定要在此处包含哪些列,但在任何表级别,都可以将该表更改为子查询,以在联接之前消除重复行。正如MarkBannister所说,您需要一些逻辑来识别LLRef中的唯一refno。您还可以使用"最近"的日期列或您能想到的任何其他逻辑来获取唯一的LLRef和与该记录相关的信息。
呃,自动拼写检查正在将refno更改为精化。ha