为什么在具有多个联接(包括外部联接)的查询上使用用户定义函数的查询速度较慢?使用该函数的原因是修改字符串,使其按数字排序。对字符串进行排序意味着100 < 99
。函数将重新格式化99 as 099
。因此,099 < 100
。它允许其他非数字值保持不变。
问题查询在具有联接的查询上使用函数。返回100行需要27秒。同一个查询,有一个函数,但超过一个表需要子秒。使用联接对查询进行SQL替换是次要的。在具有联接的查询上没有函数的查询是次条件查询。主表是tblTests,共517行。函数操作的列是一个文本列fldPurity。
tblTests
fldTestsID autonumber
fldPurity Text. field size 50. Indexed (Duplicates OK). Zero Length No.
这是功能代码。请注意不同的输入。
Public Function SortablePercent(ByVal pVar As Variant) As String
'------------------------------------------------------------------
' Purpose: Formats a string that may contain numbers or text values.
' The string percent may contain % or + characters. Ignore
' those characters during comparison. A string may start with numeric
' characters, but end with alpha characters. Compute the length of the resulting
' numeric characters. Length 3 is 100, no change. Prepend leading zeros to length 2
' or 1 numerics. Do not add prepend to values starting with text.
' Coded by: 2013-08-05 Henry Helgen
' Arguments: pVar: The string to be formatted.
' To Test: From the debug (immediate) window:
' X = "97+%"
' ? SortablePercent(X)
' 097
' X = "98"
' ? SortablePercent(X)
' 098
' X = "99.9"
' ? SortablePercent(X)
' 099.9
' X = "100"
' Print SortablePercent(X)
' 100
' X = "Reagent Grade"
' ? SortablePercent(X)
' Reagent Grade
' X = "85% & 15% H2O"
' ? SortablePercent(X)
' 085 & 15 H2O
'------------------------------------------------------------------
Dim strHold As String 'working string
Dim lenNum As Integer 'length of leading integer portion of number
' remove whitespace, %, + characters
strHold = Replace(Replace(Nz(Trim(pVar), ""), "%", ""), "+", "")
If IsNumeric(strHold) Then 'the entire string is numeric
lenNum = Len(CStr(Int(strHold)))
'Fill with leading zeros
strHold = Switch(lenNum = 3, strHold, lenNum = 2, "0" & strHold, lenNum = 1, "00" & strHold)
ElseIf IsNumeric(Left(strHold, 2)) Then '
strHold = "0" & strHold
ElseIf IsNumeric(Left(strHold, 1)) Then
strHold = "00" & strHold
End If 'numeric
SortablePercent = strHold
End Function
以下是使用函数over查询的慢速查询(27秒)
SELECT parm_TestConcatReferenceDatasetExposure.fldPurity,
SortablePercent([fldPurity]) AS temp2, Count(*) AS RcdCount
FROM parm_TestConcatReferenceDatasetExposure
GROUP BY parm_TestConcatReferenceDatasetExposure.fldPurity,
SortablePercent([fldPurity])
ORDER BY SortablePercent([fldPurity]);
这里是使用一个表上的功能(<1秒)的快速查询
SELECT tblTests.fldPurity,
SortablePercent([fldPurity]) AS temp2,
Count(*) AS RcdCount
FROM tblTests
GROUP BY tblTests.fldPurity, SortablePercent([fldPurity])
ORDER BY SortablePercent([fldPurity]);
这是一个没有函数的快速查询,而带有联接(<1秒)的查询
SELECT parm_TestConcatReferenceDatasetExposure.fldPurity,
Count(*) AS RcdCount
FROM parm_TestConcatReferenceDatasetExposure
GROUP BY parm_TestConcatReferenceDatasetExposure.fldPurity;
以下是使用SQL对一个表(<1秒)上的函数进行部分近似的快速查询
SELECT tblTests.fldPurity,
IIf(IsNumeric(Replace(Replace(Nz(Trim([fldPurity]),""),"%",""),"+","")),CDbl(Replace(Replace(Trim([fldPurity]),"%",""),"+","")),Trim([fldPurity])) AS tempPurity,
Count(*) AS RcdCount
FROM tblTests
GROUP BY tblTests.fldPurity
ORDER BY IIf(IsNumeric(Replace(Replace(Nz(Trim([fldPurity]),""),"%",""),"+","")),CDbl(Replace(Replace(Trim([fldPurity]),"%",""),"+","")),Trim([fldPurity]));
这是带有联接的查询
SELECT q_Test.fldTestsID, q_DatasetTreatment.fldDatasetsID,
q_DatasetTreatment.fldExposureEffectsID, q_Test.fldValidated,
q_Test.fldPollutantID, q_Test.fldPollutantName, q_Test.fldPollutantCAS,
q_Test.fldModeOfActionID, q_Test.fldModeOfAction, q_Test.fldPollutantTypeID,
q_Test.fldPollutantType, q_Test.fldSpeciesID, q_Test.fldClass, q_Test.fldGenus,
q_Test.fldSpecies, q_Test.fldCommonName, q_Test.fldTestTypeID,
q_Test.fldTestType, q_Test.fldTechniqueID, q_Test.fldTechnique,
q_Test.fldConcUnits, q_Test.fldDescription AS fldConcUnitDescription,
q_Test.fldMRID, q_Test.fldCETISID, q_Test.fldHardness, q_Test.fldSalinity,
q_Test.fldpH, q_Test.fldTemperature, q_Test.fldPurity, q_Test.fldDO,
q_Test.fldAcute, q_Test.fldUser, q_Test.fldComments,
IIf([q_sumTestReference].[fldTestsID] Is Not Null,[ConcatRef],"") AS CombinedRef,
q_DatasetTreatment.fldBiolVarNameID, q_DatasetTreatment.fldBiolVarName,
q_DatasetTreatment.fldLifeStageID, q_DatasetTreatment.fldLifeStage,
q_DatasetTreatment.fldDataTypeID, q_DatasetTreatment.fldDataType,
q_DatasetTreatment.fldGenerationID, q_DatasetTreatment.fldGeneration,
q_DatasetTreatment.fldEffectTypeID, q_DatasetTreatment.fldEffectType,
q_DatasetTreatment.fldDurationDays, q_DatasetTreatment.fldBVUnits,
q_DatasetTreatment.fldDescription AS fldBVUnitDescription,
q_DatasetTreatment.fldReportedNOEC, q_DatasetTreatment.fldReportedLOEC,
q_DatasetTreatment.fldTreatmentNum, q_DatasetTreatment.fldControlTypeID,
q_DatasetTreatment.fldControlType, q_DatasetTreatment.fldReplicateNum,
q_DatasetTreatment.fldPseudoReplicateNum, q_DatasetTreatment.fldNumberExposed,
q_DatasetTreatment.fldMeasuredConcentration,
q_DatasetTreatment.fldNominalConcentration, q_DatasetTreatment.fldBiolVarValue
FROM q_sumTestReference
RIGHT JOIN (q_Test
LEFT JOIN q_DatasetTreatment
ON q_Test.fldTestsID = q_DatasetTreatment.fldTestsID)
ON q_sumTestReference.fldTestsID = q_Test.fldTestsID
ORDER BY q_Test.fldTestsID, q_DatasetTreatment.fldDatasetsID,
q_DatasetTreatment.fldTreatmentNum, q_DatasetTreatment.fldReplicateNum;
我想使用这个函数,因为它是更简单、更干净的代码。有什么建议吗?我在这篇关于SQL Server用户定义函数的文章中看到,它逐行计算。这是否意味着SQL查询中的复杂解析语句(如我的第4个示例)是正确的?
HansUp的评论起了作用:
如果您有一个包含数字的字符串,请使用
Val(your_string)
使其可以按数字顺序排序。我不确定您是否需要UDF,并且内置的Val()
函数可以使查询更快。
以下是工作查询:
SELECT parm_TestConcatReferenceDatasetExposure.fldPurity,
Val(Replace(Replace(Nz(Trim([fldPurity]),""),"%",""),"+","")) AS temp5,
Count(*) AS RcdCount
FROM parm_TestConcatReferenceDatasetExposure
GROUP BY parm_TestConcatReferenceDatasetExposure.fldPurity,
Val(Replace(Replace(Nz(Trim([fldPurity]),""),"%",""),"+",""))
ORDER BY Val(Replace(Replace(Nz(Trim([fldPurity]),""),"%",""),"+",""));