What is the highest-performance way to check that every character in a string exists in a predefined character set?



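What is the best way to check that every character in a string exists in a predefined set of characters? The predefined character set is defined as "0123456789-AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz".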

The following works fine, but is there a better way to solve this?

For X As Int32 = 0 To stringVal.Length - 1
    If PredefinedCharacters.IndexOf(stringVal.Chars(X)) = -1 Then
        'Invalid value
        Return False
    End If
Next X

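Storing the list of valid characters in a String means that a linear search of that String is performed for every Char of the input String being tested. That is inefficient.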

A HashSet provides an efficient way to look up whether a value exists in a "set" of characters:

Shared ReadOnly PredefinedCharacters As HashSet(Of Char) = New HashSet(Of Char)(
    "0123456789-AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz"
)

Shared Function IsValid(stringVal As String) As Boolean
    REM Optional: Reject null/empty input
    REM If String.IsNullOrEmpty(stringVal) Then
    REM     Return False
    REM End If
    For X As Int32 = 0 To stringVal.Length - 1
        If Not PredefinedCharacters.Contains(stringVal.Chars(X)) Then
            'Invalid value
            Return False
        End If
    Next X
    Return True
End Function

Here is some code I wrote using BenchmarkDotNet that compares the use of a HashSet, a Regex (compiled and not compiled), and a few other approaches:

Option Explicit On
Option Strict On

Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions
Imports BenchmarkDotNet.Attributes

REM None should really be the default value of this enumeration,
REM but this is the only way to have it benchmarked last
Public Enum TextAdjustment
    FirstCharInvalid
    MiddleCharInvalid
    None
End Enum
Public Class Benchmarks
    Private Const ValidCharsString As String = "0123456789-AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz"
    Private Shared ReadOnly ValidCharsHashSet As New HashSet(Of Char)(ValidCharsString)
    Private Const ValidCharsRegexString As String = "^[0-9A-Za-z-]+$"
    Private Shared ReadOnly ValidCharsRegexInterpreted As New Regex(ValidCharsRegexString, RegexOptions.None)
    Private Shared ReadOnly ValidCharsRegexCompiled As New Regex(ValidCharsRegexString, RegexOptions.Compiled)
    Const AdjustmentInvalidChar As Char = "~"c

    Shared Sub New()
        REM Short text  (16 characters): Repeat <upper> <lower> <digit> <upper> <lower> <digit> <dash>
        Const shortText As String = "Ab3De6-Hi0Kl3-Op"
        REM Medium text (64 characters): <all upper> <dash> <all lower> <dash> <all digit>
        Const mediumText As String = "ABCDEFGHIJKLMNOPQRSTUVWXYZ-abcdefghijklmnopqrstuvwxyz-1234567890"
        Dim longText As String = String.Concat(Enumerable.Repeat(mediumText, 16))
        Dim hugeText As String = String.Concat(Enumerable.Repeat(longText, 1024))
        Dim allText As String() = {shortText, mediumText, longText, hugeText}
        TextByLengthThenAdjustment = allText.ToDictionary(
            Function(text) text.Length,
            Function(text) [Enum].GetValues(GetType(TextAdjustment)) _
                .Cast(Of TextAdjustment) _
                .ToDictionary(
                    Function(adjustment) adjustment,
                    Function(adjustment) GetAdjustedText(text, adjustment)
                )
        )
        TextLengths = Array.AsReadOnly(
            allText.Select(Function(text) text.Length).ToArray()
        )
    End Sub

    ''' <summary>All benchmark text inputs accessed by length then by adjustment.</summary>
    Private Shared ReadOnly Property TextByLengthThenAdjustment As Dictionary(Of Integer, Dictionary(Of TextAdjustment, String))

    ''' <summary>The lengths of all benchmark text inputs.</summary>
    Public Shared ReadOnly Property TextLengths As IReadOnlyCollection(Of Integer)

    ''' <summary>The length of the current benchmark text input.</summary>
    <ParamsSource(NameOf(TextLengths))>
    Public Property TextLength As Integer

    ''' <summary>The adjustment applied to the current benchmark text input.</summary>
    <ParamsAllValues()>
    Public Property TextAdjustment As TextAdjustment

    ''' <summary>The current benchmark text input, based on the <c>TextLength</c> and <c>TextAdjustment</c> parameters.</summary>
    Private Property Text As String

    ''' <summary>Gets the specified text with the specified adjustment applied.</summary>
    ''' <param name="text">The text to be adjusted.</param>
    ''' <param name="adjustment">The adjustment to apply.</param>
    ''' <returns>The adjusted text.</returns>
    Private Shared Function GetAdjustedText(text As String, adjustment As TextAdjustment) As String
        Select Case adjustment
            Case TextAdjustment.None
                Return text
            Case TextAdjustment.FirstCharInvalid, TextAdjustment.MiddleCharInvalid
                Dim textBuilder As New System.Text.StringBuilder(text, text.Length)
                REM Integer division (\) picks the middle index
                Dim invalidCharIndex As Integer = If(adjustment = TextAdjustment.FirstCharInvalid, 0, text.Length \ 2)
                textBuilder(invalidCharIndex) = AdjustmentInvalidChar
                Return textBuilder.ToString()
            Case Else
                REM Assumes the project's root namespace is StackOverflow56011014
                Throw New ArgumentOutOfRangeException(
                    NameOf(adjustment), adjustment, $"Unsupported {NameOf(StackOverflow56011014.TextAdjustment)} value."
                )
        End Select
    End Function

    ''' <summary>Loads the text for the upcoming benchmark based on the text parameters.</summary>
    <GlobalSetup()>
    Public Sub LoadBenchmarkText()
        Text = TextByLengthThenAdjustment(TextLength)(TextAdjustment)
    End Sub

    <Benchmark(Baseline:=True)>
    Public Function StringIndexOf() As Boolean
        For X As Int32 = 0 To Text.Length - 1
            If ValidCharsString.IndexOf(Text.Chars(X)) = -1 Then
                'Invalid value
                Return False
            End If
        Next X
        Return True
    End Function

    <Benchmark()>
    Public Function CharOperators() As Boolean
        For X As Int32 = 0 To Text.Length - 1
            Dim c = Text(X)
            If Not (
                c = "-"c _
                OrElse (c >= "0"c AndAlso c <= "9"c) _
                OrElse (c >= "A"c AndAlso c <= "Z"c) _
                OrElse (c >= "a"c AndAlso c <= "z"c)
            ) Then
                Return False
            End If
        Next X
        Return True
    End Function

    <Benchmark()>
    Public Function SelectCaseChars() As Boolean
        For X As Int32 = 0 To Text.Length - 1
            REM Select Case has no fall-through: a matching Case with an empty
            REM body just leaves the Select, so the loop moves on to the next character
            Select Case Text(X)
                Case "-"c
                Case "0"c To "9"c
                Case "A"c To "Z"c
                Case "a"c To "z"c
                    Continue For
                Case Else
                    Return False
            End Select
        Next X
        Return True
    End Function

    <Benchmark()>
    Public Function SelectCaseIntegers() As Boolean
        For X As Int32 = 0 To Text.Length - 1
            Select Case Strings.AscW(Text(X))
                Case 45        REM '-'
                Case 48 To 57  REM '0' - '9'
                Case 65 To 90  REM 'A' - 'Z'
                Case 97 To 122 REM 'a' - 'z'
                    Continue For
                Case Else
                    Return False
            End Select
        Next X
        Return True
    End Function

    <Benchmark()>
    Public Function CharIsLetterOrDigit() As Boolean
        For X As Int32 = 0 To Text.Length - 1
            Dim c = Text(X)
            REM Reject non-ASCII letters for which Char.IsLetterOrDigit returns True
            If Not (c = "-"c OrElse (Char.IsLetterOrDigit(c) AndAlso c <= "z"c)) Then
                Return False
            End If
        Next X
        Return True
    End Function

    <Benchmark()>
    Public Function HashSetContains() As Boolean
        For X As Int32 = 0 To Text.Length - 1
            If Not ValidCharsHashSet.Contains(Text.Chars(X)) Then
                Return False
            End If
        Next X
        Return True
    End Function

    <Benchmark()>
    Public Function RegexMatchInterpreted() As Boolean
        Return ValidCharsRegexInterpreted.IsMatch(Text)
    End Function

    <Benchmark()>
    Public Function RegexMatchCompiled() As Boolean
        Return ValidCharsRegexCompiled.IsMatch(Text)
    End Function
End Class

Public Class Program
    Public Shared Sub Main()
        BenchmarkDotNet.Running.BenchmarkRunner.Run(Of Benchmarks)()
    End Sub
End Class

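All of the uppercase, lowercase, and digit characters plus two dashes happen to come to a nice 64 characters, so I benchmarked that text, 16 repetitions of it (1,024 characters), and 16,384 repetitions of it (1,048,576 characters). I also included an even shorter text, 16 characters long, that repeats the pattern <upper> <lower> <digit> <upper> <lower> <digit> <dash>. Each text input is benchmarked with its first character invalid, with its middle character invalid, and unmodified. Here are the results I got: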

// * Summary *
BenchmarkDotNet=v0.11.5, OS=Windows 10.0.17763.437 (1809/October2018Update/Redstone5)
Intel Core i7 CPU 860 2.80GHz (Nehalem), 1 CPU, 8 logical and 4 physical cores
.NET Core SDK=2.1.602
[Host]     : .NET Core 2.1.9 (CoreCLR 4.6.27414.06, CoreFX 4.6.27415.01), 64bit RyuJIT
DefaultJob : .NET Core 2.1.9 (CoreCLR 4.6.27414.06, CoreFX 4.6.27415.01), 64bit RyuJIT

|                Method | TextLength |    TextAdjustment | Ratio |              Mean |           Error |          StdDev |            Median | RatioSD |
|---------------------- |----------- |------------------ |------:|------------------:|----------------:|----------------:|------------------:|--------:|
|         StringIndexOf |         16 |  FirstCharInvalid |  1.00 |         20.347 ns |       0.0294 ns |       0.0261 ns |         20.344 ns |    0.00 |
|         CharOperators |         16 |  FirstCharInvalid |  0.13 |          2.668 ns |       0.0125 ns |       0.0117 ns |          2.669 ns |    0.00 |
|       SelectCaseChars |         16 |  FirstCharInvalid |  0.13 |          2.663 ns |       0.0104 ns |       0.0092 ns |          2.662 ns |    0.00 |
|    SelectCaseIntegers |         16 |  FirstCharInvalid |  0.13 |          2.663 ns |       0.0192 ns |       0.0180 ns |          2.653 ns |    0.00 |
|   CharIsLetterOrDigit |         16 |  FirstCharInvalid |  0.25 |          5.157 ns |       0.0083 ns |       0.0074 ns |          5.155 ns |    0.00 |
|       HashSetContains |         16 |  FirstCharInvalid |  0.59 |         11.987 ns |       0.0766 ns |       0.0679 ns |         11.959 ns |    0.00 |
| RegexMatchInterpreted |         16 |  FirstCharInvalid |  9.86 |        200.588 ns |       1.0530 ns |       0.9849 ns |        200.415 ns |    0.05 |
|    RegexMatchCompiled |         16 |  FirstCharInvalid |  5.90 |        120.033 ns |       0.7760 ns |       0.7258 ns |        119.965 ns |    0.04 |
|                       |            |                   |       |                   |                 |                 |                   |         |
|         StringIndexOf |         16 | MiddleCharInvalid |  1.00 |        113.451 ns |       0.2279 ns |       0.2020 ns |        113.418 ns |    0.00 |
|         CharOperators |         16 | MiddleCharInvalid |  0.20 |         21.836 ns |       0.4759 ns |       1.3265 ns |         20.977 ns |    0.01 |
|       SelectCaseChars |         16 | MiddleCharInvalid |  0.18 |         20.772 ns |       0.0432 ns |       0.0404 ns |         20.766 ns |    0.00 |
|    SelectCaseIntegers |         16 | MiddleCharInvalid |  0.19 |         21.136 ns |       0.7391 ns |       0.7259 ns |         20.788 ns |    0.01 |
|   CharIsLetterOrDigit |         16 | MiddleCharInvalid |  0.39 |         44.811 ns |       0.0992 ns |       0.0928 ns |         44.816 ns |    0.00 |
|       HashSetContains |         16 | MiddleCharInvalid |  1.14 |        129.857 ns |       0.3097 ns |       0.2745 ns |        129.810 ns |    0.00 |
| RegexMatchInterpreted |         16 | MiddleCharInvalid |  5.55 |        629.442 ns |       3.4850 ns |       3.2599 ns |        629.455 ns |    0.03 |
|    RegexMatchCompiled |         16 | MiddleCharInvalid |  3.29 |        372.947 ns |       1.9290 ns |       1.5060 ns |        372.723 ns |    0.01 |
|                       |            |                   |       |                   |                 |                 |                   |         |
|         StringIndexOf |         16 |              None |  1.00 |        246.171 ns |       0.6630 ns |       0.5877 ns |        246.267 ns |    0.00 |
|         CharOperators |         16 |              None |  0.15 |         37.253 ns |       0.0264 ns |       0.0220 ns |         37.250 ns |    0.00 |
|       SelectCaseChars |         16 |              None |  0.15 |         37.317 ns |       0.1025 ns |       0.0959 ns |         37.289 ns |    0.00 |
|    SelectCaseIntegers |         16 |              None |  0.15 |         37.297 ns |       0.1205 ns |       0.1068 ns |         37.265 ns |    0.00 |
|   CharIsLetterOrDigit |         16 |              None |  0.34 |         84.260 ns |       0.5082 ns |       0.4505 ns |         84.219 ns |    0.00 |
|       HashSetContains |         16 |              None |  0.97 |        238.560 ns |       0.5839 ns |       0.5176 ns |        238.724 ns |    0.00 |
| RegexMatchInterpreted |         16 |              None |  2.44 |        599.841 ns |       2.7838 ns |       2.6039 ns |        600.461 ns |    0.01 |
|    RegexMatchCompiled |         16 |              None |  1.97 |        486.225 ns |       3.1438 ns |       2.9407 ns |        486.056 ns |    0.01 |
|                       |            |                   |       |                   |                 |                 |                   |         |
|         StringIndexOf |         64 |  FirstCharInvalid |  1.00 |         20.372 ns |       0.1079 ns |       0.0901 ns |         20.339 ns |    0.00 |
|         CharOperators |         64 |  FirstCharInvalid |  0.13 |          2.662 ns |       0.0190 ns |       0.0169 ns |          2.655 ns |    0.00 |
|       SelectCaseChars |         64 |  FirstCharInvalid |  0.13 |          2.674 ns |       0.0105 ns |       0.0082 ns |          2.674 ns |    0.00 |
|    SelectCaseIntegers |         64 |  FirstCharInvalid |  0.13 |          2.660 ns |       0.0128 ns |       0.0119 ns |          2.657 ns |    0.00 |
|   CharIsLetterOrDigit |         64 |  FirstCharInvalid |  0.25 |          5.145 ns |       0.0058 ns |       0.0055 ns |          5.143 ns |    0.00 |
|       HashSetContains |         64 |  FirstCharInvalid |  0.59 |         11.979 ns |       0.0267 ns |       0.0237 ns |         11.978 ns |    0.00 |
| RegexMatchInterpreted |         64 |  FirstCharInvalid |  9.85 |        200.676 ns |       0.8696 ns |       0.6790 ns |        200.660 ns |    0.05 |
|    RegexMatchCompiled |         64 |  FirstCharInvalid |  5.74 |        117.095 ns |       2.0289 ns |       1.8978 ns |        117.581 ns |    0.09 |
|                       |            |                   |       |                   |                 |                 |                   |         |
|         StringIndexOf |         64 | MiddleCharInvalid |  1.00 |        688.376 ns |       2.1290 ns |       1.7779 ns |        688.436 ns |    0.00 |
|         CharOperators |         64 | MiddleCharInvalid |  0.12 |         83.431 ns |       0.2190 ns |       0.2049 ns |         83.513 ns |    0.00 |
|       SelectCaseChars |         64 | MiddleCharInvalid |  0.12 |         83.191 ns |       0.4731 ns |       0.4426 ns |         83.040 ns |    0.00 |
|    SelectCaseIntegers |         64 | MiddleCharInvalid |  0.12 |         83.101 ns |       0.2071 ns |       0.1836 ns |         83.142 ns |    0.00 |
|   CharIsLetterOrDigit |         64 | MiddleCharInvalid |  0.21 |        141.536 ns |       0.3475 ns |       0.2902 ns |        141.519 ns |    0.00 |
|       HashSetContains |         64 | MiddleCharInvalid |  0.73 |        502.879 ns |       1.6034 ns |       1.4998 ns |        502.753 ns |    0.00 |
| RegexMatchInterpreted |         64 | MiddleCharInvalid |  2.45 |      1,689.088 ns |      10.5955 ns |       9.9110 ns |      1,685.374 ns |    0.02 |
|    RegexMatchCompiled |         64 | MiddleCharInvalid |  1.36 |        940.144 ns |       3.2392 ns |       2.8715 ns |        939.693 ns |    0.01 |
|                       |            |                   |       |                   |                 |                 |                   |         |
|         StringIndexOf |         64 |              None |  1.00 |      1,295.988 ns |      13.4363 ns |      11.9109 ns |      1,296.413 ns |    0.00 |
|         CharOperators |         64 |              None |  0.14 |        180.108 ns |       0.3761 ns |       0.3334 ns |        180.056 ns |    0.00 |
|       SelectCaseChars |         64 |              None |  0.14 |        180.061 ns |       0.4610 ns |       0.4312 ns |        180.022 ns |    0.00 |
|    SelectCaseIntegers |         64 |              None |  0.14 |        180.556 ns |       0.6100 ns |       0.5706 ns |        180.600 ns |    0.00 |
|   CharIsLetterOrDigit |         64 |              None |  0.24 |        316.839 ns |       1.3384 ns |       1.2519 ns |        316.447 ns |    0.00 |
|       HashSetContains |         64 |              None |  0.73 |        948.852 ns |       0.5484 ns |       0.4861 ns |        948.910 ns |    0.01 |
| RegexMatchInterpreted |         64 |              None |  1.08 |      1,399.224 ns |       5.7519 ns |       5.3803 ns |      1,397.276 ns |    0.01 |
|    RegexMatchCompiled |         64 |              None |  0.94 |      1,220.376 ns |       3.9620 ns |       3.7060 ns |      1,219.580 ns |    0.01 |
|                       |            |                   |       |                   |                 |                 |                   |         |
|         StringIndexOf |       1024 |  FirstCharInvalid |  1.00 |         20.303 ns |       0.0379 ns |       0.0316 ns |         20.300 ns |    0.00 |
|         CharOperators |       1024 |  FirstCharInvalid |  0.13 |          2.660 ns |       0.0115 ns |       0.0102 ns |          2.658 ns |    0.00 |
|       SelectCaseChars |       1024 |  FirstCharInvalid |  0.13 |          2.666 ns |       0.0226 ns |       0.0211 ns |          2.657 ns |    0.00 |
|    SelectCaseIntegers |       1024 |  FirstCharInvalid |  0.13 |          2.657 ns |       0.0068 ns |       0.0064 ns |          2.656 ns |    0.00 |
|   CharIsLetterOrDigit |       1024 |  FirstCharInvalid |  0.25 |          5.158 ns |       0.0233 ns |       0.0207 ns |          5.149 ns |    0.00 |
|       HashSetContains |       1024 |  FirstCharInvalid |  0.59 |         11.930 ns |       0.0097 ns |       0.0086 ns |         11.928 ns |    0.00 |
| RegexMatchInterpreted |       1024 |  FirstCharInvalid |  9.92 |        201.375 ns |       1.2479 ns |       1.1672 ns |        201.762 ns |    0.06 |
|    RegexMatchCompiled |       1024 |  FirstCharInvalid |  5.73 |        116.445 ns |       0.9321 ns |       0.8263 ns |        116.386 ns |    0.04 |
|                       |            |                   |       |                   |                 |                 |                   |         |
|         StringIndexOf |       1024 | MiddleCharInvalid |  1.00 |     10,606.137 ns |      99.0400 ns |      92.6421 ns |     10,629.296 ns |    0.00 |
|         CharOperators |       1024 | MiddleCharInvalid |  0.14 |      1,492.585 ns |       3.0048 ns |       2.6637 ns |      1,491.951 ns |    0.00 |
|       SelectCaseChars |       1024 | MiddleCharInvalid |  0.14 |      1,496.829 ns |       5.0131 ns |       4.4440 ns |      1,495.310 ns |    0.00 |
|    SelectCaseIntegers |       1024 | MiddleCharInvalid |  0.14 |      1,496.302 ns |       3.7031 ns |       3.4639 ns |      1,495.270 ns |    0.00 |
|   CharIsLetterOrDigit |       1024 | MiddleCharInvalid |  0.24 |      2,542.033 ns |       7.7978 ns |       6.9125 ns |      2,542.110 ns |    0.00 |
|       HashSetContains |       1024 | MiddleCharInvalid |  0.72 |      7,639.019 ns |      27.6810 ns |      25.8928 ns |      7,635.121 ns |    0.01 |
| RegexMatchInterpreted |       1024 | MiddleCharInvalid |  2.26 |     23,942.868 ns |      83.0869 ns |      73.6543 ns |     23,924.171 ns |    0.02 |
|    RegexMatchCompiled |       1024 | MiddleCharInvalid |  1.25 |     13,233.471 ns |      25.2865 ns |      21.1154 ns |     13,226.030 ns |    0.01 |
|                       |            |                   |       |                   |                 |                 |                   |         |
|         StringIndexOf |       1024 |              None |  1.00 |     21,473.132 ns |     244.6120 ns |     228.8103 ns |     21,413.010 ns |    0.00 |
|         CharOperators |       1024 |              None |  0.14 |      2,967.014 ns |      10.5175 ns |       9.3235 ns |      2,964.323 ns |    0.00 |
|       SelectCaseChars |       1024 |              None |  0.14 |      2,964.892 ns |       6.7969 ns |       6.3578 ns |      2,962.519 ns |    0.00 |
|    SelectCaseIntegers |       1024 |              None |  0.14 |      2,964.403 ns |       4.3433 ns |       3.8502 ns |      2,962.912 ns |    0.00 |
|   CharIsLetterOrDigit |       1024 |              None |  0.24 |      5,066.337 ns |      28.4111 ns |      26.5758 ns |      5,064.938 ns |    0.00 |
|       HashSetContains |       1024 |              None |  0.71 |     15,224.153 ns |      55.8292 ns |      52.2227 ns |     15,199.419 ns |    0.01 |
| RegexMatchInterpreted |       1024 |              None |  0.91 |     19,469.031 ns |      34.3800 ns |      30.4770 ns |     19,471.376 ns |    0.01 |
|    RegexMatchCompiled |       1024 |              None |  0.84 |     17,932.826 ns |      56.9254 ns |      53.2480 ns |     17,905.849 ns |    0.01 |
|                       |            |                   |       |                   |                 |                 |                   |         |
|         StringIndexOf |    1048576 |  FirstCharInvalid |  1.00 |         20.299 ns |       0.0325 ns |       0.0272 ns |         20.303 ns |    0.00 |
|         CharOperators |    1048576 |  FirstCharInvalid |  0.13 |          2.660 ns |       0.0149 ns |       0.0140 ns |          2.654 ns |    0.00 |
|       SelectCaseChars |    1048576 |  FirstCharInvalid |  0.13 |          2.651 ns |       0.0046 ns |       0.0038 ns |          2.650 ns |    0.00 |
|    SelectCaseIntegers |    1048576 |  FirstCharInvalid |  0.13 |          2.658 ns |       0.0054 ns |       0.0048 ns |          2.657 ns |    0.00 |
|   CharIsLetterOrDigit |    1048576 |  FirstCharInvalid |  0.25 |          5.147 ns |       0.0080 ns |       0.0075 ns |          5.148 ns |    0.00 |
|       HashSetContains |    1048576 |  FirstCharInvalid |  0.59 |         11.938 ns |       0.0224 ns |       0.0198 ns |         11.937 ns |    0.00 |
| RegexMatchInterpreted |    1048576 |  FirstCharInvalid |  9.91 |        201.147 ns |       1.1551 ns |       1.0805 ns |        201.236 ns |    0.06 |
|    RegexMatchCompiled |    1048576 |  FirstCharInvalid |  5.69 |        115.428 ns |       0.7162 ns |       0.6349 ns |        115.431 ns |    0.03 |
|                       |            |                   |       |                   |                 |                 |                   |         |
|         StringIndexOf |    1048576 | MiddleCharInvalid |  1.00 | 10,930,247.969 ns | 121,325.8754 ns | 113,488.2991 ns | 10,914,677.344 ns |    0.00 |
|         CharOperators |    1048576 | MiddleCharInvalid |  0.14 |  1,525,854.297 ns |   4,854.7069 ns |   4,303.5712 ns |  1,526,192.285 ns |    0.00 |
|       SelectCaseChars |    1048576 | MiddleCharInvalid |  0.14 |  1,528,810.547 ns |   3,809.7073 ns |   3,563.6026 ns |  1,528,730.664 ns |    0.00 |
|    SelectCaseIntegers |    1048576 | MiddleCharInvalid |  0.14 |  1,527,016.741 ns |   5,582.1223 ns |   4,948.4060 ns |  1,527,480.566 ns |    0.00 |
|   CharIsLetterOrDigit |    1048576 | MiddleCharInvalid |  0.24 |  2,598,319.427 ns |  10,031.3678 ns |   9,383.3477 ns |  2,599,921.875 ns |    0.00 |
|       HashSetContains |    1048576 | MiddleCharInvalid |  0.71 |  7,800,726.302 ns |  46,636.7427 ns |  43,624.0381 ns |  7,778,250.000 ns |    0.01 |
| RegexMatchInterpreted |    1048576 | MiddleCharInvalid |  2.18 | 23,801,565.402 ns |  48,219.3237 ns |  42,745.1745 ns | 23,794,650.000 ns |    0.03 |
|    RegexMatchCompiled |    1048576 | MiddleCharInvalid |  1.23 | 13,425,490.521 ns |  20,217.6784 ns |  18,911.6290 ns | 13,432,740.625 ns |    0.01 |
|                       |            |                   |       |                   |                 |                 |                   |         |
|         StringIndexOf |    1048576 |              None |  1.00 | 21,784,053.125 ns | 188,861.2630 ns | 176,660.9426 ns | 21,774,959.375 ns |    0.00 |
|         CharOperators |    1048576 |              None |  0.14 |  3,038,770.673 ns |   2,091.4132 ns |   1,746.4247 ns |  3,039,308.984 ns |    0.00 |
|       SelectCaseChars |    1048576 |              None |  0.14 |  3,038,094.349 ns |   2,619.8388 ns |   2,450.5989 ns |  3,037,726.172 ns |    0.00 |
|    SelectCaseIntegers |    1048576 |              None |  0.14 |  3,040,074.661 ns |   4,887.7384 ns |   4,571.9936 ns |  3,039,230.859 ns |    0.00 |
|   CharIsLetterOrDigit |    1048576 |              None |  0.24 |  5,217,822.154 ns |  15,537.3365 ns |  13,773.4441 ns |  5,214,110.938 ns |    0.00 |
|       HashSetContains |    1048576 |              None |  0.72 | 15,697,332.478 ns |  19,172.8185 ns |  16,996.2042 ns | 15,692,989.844 ns |    0.01 |
| RegexMatchInterpreted |    1048576 |              None |  0.91 | 19,760,729.241 ns |  71,134.9694 ns |  63,059.2975 ns | 19,732,193.750 ns |    0.01 |
|    RegexMatchCompiled |    1048576 |              None |  0.84 | 18,231,037.500 ns |  48,200.8253 ns |  45,087.0819 ns | 18,222,987.500 ns |    0.01 |
// * Legends *
TextLength     : Value of the 'TextLength' parameter
TextAdjustment : Value of the 'TextAdjustment' parameter
Ratio          : Mean of the ratio distribution ([Current]/[Baseline])
Mean           : Arithmetic mean of all measurements
Error          : Half of 99.9% confidence interval
StdDev         : Standard deviation of all measurements
Median         : Value separating the higher half of all measurements (50th percentile)
RatioSD        : Standard deviation of the ratio distribution ([Current]/[Baseline])
1 ns           : 1 Nanosecond (0.000000001 sec)

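My results show that the low-level approaches of CharOperators(), SelectCaseChars(), and SelectCaseIntegers() are even with each other and much faster than the other approaches. For all but the shortest text, HashSetContains() is roughly 30-40% faster than StringIndexOf() (ratios of 0.59 to 0.73). The RegexMatch*() methods are at best only slightly faster than StringIndexOf(), by about 10-15%, and only in cases where the entire input needs to be scanned; in all other cases they are much slower, though using RegexOptions.Compiled does help with that.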
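For use outside the benchmark, the fastest approach measured above can be packaged as a standalone function. This is only a minimal sketch: the module name CharacterValidation and the decisions to reject Nothing and to accept an empty string are my assumptions, not something the question specifies.

```vbnet
Option Strict On

Public Module CharacterValidation
    ''' <summary>
    ''' Returns True when every character of stringVal is an ASCII digit,
    ''' an ASCII letter, or a dash. Mirrors the CharOperators() benchmark.
    ''' </summary>
    Public Function IsValid(stringVal As String) As Boolean
        REM Assumption: Nothing is rejected, an empty string is accepted
        If stringVal Is Nothing Then
            Return False
        End If

        For i As Integer = 0 To stringVal.Length - 1
            Dim c As Char = stringVal.Chars(i)
            Dim allowed As Boolean = (c = "-"c) _
                OrElse (c >= "0"c AndAlso c <= "9"c) _
                OrElse (c >= "A"c AndAlso c <= "Z"c) _
                OrElse (c >= "a"c AndAlso c <= "z"c)

            If Not allowed Then
                Return False
            End If
        Next i

        Return True
    End Function
End Module
```

Calling CharacterValidation.IsValid("Ab3-Z9") returns True, while any input containing a character such as "~" returns False at the first offending position.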

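I think this shows that the "fancy" classes proposed by the answers here are not the fastest approach. Of course, the low-level approaches only work because your valid characters sit in contiguous ranges. If you were instead checking against an arbitrary, well-distributed list of characters, the answer could well be different, but why not optimize for your actual requirements?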
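For the arbitrary, non-contiguous case, one option that was not part of the benchmark above is a precomputed lookup table indexed by character code. The sketch below assumes the allowed set is ASCII-only; the module name, the AllowedChars constant, and the BuildTable helper are hypothetical names for illustration, and I have not measured how this compares with the other approaches.

```vbnet
Option Strict On

Imports System
Imports Microsoft.VisualBasic

Public Module LookupTableValidation
    REM Example set only; any ASCII characters would work, contiguous or not
    Private Const AllowedChars As String = "0123456789-AaBbCcDdEeFfGgHhIiJjKkLlMmNnOoPpQqRrSsTtUuVvWwXxYyZz"

    REM One flag per ASCII code point (0 to 127); True marks an allowed character
    Private ReadOnly IsAllowed As Boolean() = BuildTable(AllowedChars)

    Private Function BuildTable(allowed As String) As Boolean()
        Dim table(127) As Boolean REM 128 elements, all False initially
        For Each c As Char In allowed
            Dim code As Integer = AscW(c)
            If code > 127 Then
                Throw New ArgumentException("Only ASCII characters are supported.", NameOf(allowed))
            End If
            table(code) = True
        Next
        Return table
    End Function

    Public Function IsValid(stringVal As String) As Boolean
        REM Assumption: Nothing is rejected, an empty string is accepted
        If stringVal Is Nothing Then
            Return False
        End If

        For i As Integer = 0 To stringVal.Length - 1
            Dim code As Integer = AscW(stringVal.Chars(i))
            REM Anything outside the table cannot be in the allowed set
            If code > 127 OrElse Not IsAllowed(code) Then
                Return False
            End If
        Next i

        Return True
    End Function
End Module
```

Each character costs one bounds check and one array read regardless of how the allowed characters are distributed, which is why this shape is worth benchmarking if your set does not form neat ranges.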

Regex is well suited to this kind of task. The code is simple and should be efficient.

' Requires Imports System.Text.RegularExpressions
Dim validPattern = New Regex("^[0-9A-Za-z-]+$")
Dim isValid = validPattern.IsMatch(stringVal)
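Two details of the pattern above are worth knowing. In .NET, $ also matches just before a single trailing line feed, so a value such as "abc" followed by vbLf passes the check, and the + quantifier means an empty string is rejected, unlike the loop-based versions. A hedged variant that anchors with \A and \z avoids the first issue; the module and field names here are hypothetical, and rejecting Nothing is my assumption.

```vbnet
Option Strict On

Imports System.Text.RegularExpressions

Public Module RegexValidation
    REM \A and \z anchor to the absolute start and end of the input,
    REM so a trailing line feed is not silently accepted
    Private ReadOnly ValidPattern As New Regex("\A[0-9A-Za-z-]+\z", RegexOptions.Compiled)

    Public Function IsValid(stringVal As String) As Boolean
        REM Assumption: Nothing is rejected; the + quantifier also rejects ""
        Return stringVal IsNot Nothing AndAlso ValidPattern.IsMatch(stringVal)
    End Function
End Module
```

Keeping the Regex in a shared field means it is constructed once rather than on every call, and RegexOptions.Compiled is the option that helped in the benchmark results above.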
