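>I have some legacy code which includes:
- A Dictionary (dictParts) populated with approximately 350,000 items during startup (it does not change at run time). dictParts is a System.Collections.Generic.Dictionary(Of String, System.Data.DataRow). Each item in dictParts is a System.Collections.Generic.KeyValuePair(Of String, System.Data.DataRow).
- An Array (arrOut) that frequently has items added and removed (usually between 2 and 6 items in the array). arrOut is a System.Array containing only strings.
Each time the array changes, I need to see whether:
- All of the items in the array exist in the index
- Some of the items in the array exist in the index
I imagine that looping through the index of 350,000 items every time the array changes would be a huge performance hit, so I looked to LINQ for help.
I tried the following: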
Private Sub btnTest_Click(sender As System.Object, e As System.EventArgs) Handles btnTest.Click
    Dim dictParts = New Dictionary(Of Integer, String) _
        From {{1, "AA-10-100"}, _
              {2, "BB-20-100"}, _
              {3, "CC-30-100"}, _
              {4, "DD-40-100"}, _
              {5, "EE-50-100"}}
    Dim arrOut() As String = {"AA-10-100", "BB-20-100", "CC-30-100"}
    'Tried
    Dim allPartsExist As IEnumerable(Of String) = arrOut.ToString.All(dictParts)
    'And this
    Dim allOfArrayInIndex As Object = arrOut.ToString.Intersect(dictParts).Count() = arrOut.ToString.Count
End Sub
I keep getting the error: Unable to cast object of type 'System.Collections.Generic.Dictionary`2[System.Int32,System.String]' to type 'System.Collections.Generic.IEnumerable`1[System.Char]'.
Could someone please tell me where I'm going wrong?
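The cast error appears to come from `arrOut.ToString`: calling ToString on an array returns the type name ("System.String[]"), and a String is an IEnumerable(Of Char), so Intersect expects a sequence of characters and, with Option Strict Off, tries to cast the dictionary to one at run time. Below is a minimal sketch of the same checks written against the sample data. It assumes the parts to match are the dictionary values, as in the sample, and the module name is made up for illustration.

```vb
Option Strict On

Imports System
Imports System.Collections.Generic
Imports System.Linq

Module CastErrorSketch ' hypothetical name, for illustration only
    Sub Main()
        Dim dictParts As New Dictionary(Of Integer, String) From {
            {1, "AA-10-100"},
            {2, "BB-20-100"},
            {3, "CC-30-100"},
            {4, "DD-40-100"},
            {5, "EE-50-100"}}
        Dim arrOut() As String = {"AA-10-100", "BB-20-100", "CC-30-100"}

        ' ToString on an array yields the type name, not the elements.
        Console.WriteLine(arrOut.ToString())

        ' Query the array itself and compare against the dictionary's values.
        Dim allPartsExist As Boolean = arrOut.All(Function(p) dictParts.ContainsValue(p))
        Dim somePartsExist As Boolean = arrOut.Any(Function(p) dictParts.ContainsValue(p))

        ' The Intersect variant also works once both sides are sequences of String.
        ' Intersect returns distinct items, so compare with the distinct count.
        Dim allViaIntersect As Boolean =
            arrOut.Intersect(dictParts.Values).Count() = arrOut.Distinct().Count()

        Console.WriteLine($"{allPartsExist} {somePartsExist} {allViaIntersect}")
    End Sub
End Module
```

Turning Option Strict On should surface this kind of mistake at compile time instead of at run time. Note that ContainsValue still scans the whole dictionary on every call, so this fixes the error rather than the performance concern.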
To learn something myself, I tried the HashSet suggested by @emsimpson92. Maybe it will work for you.
Imports System.Text
Public Class HashSets
    Private shortList As New HashSet(Of String)
    Private longList As New HashSet(Of String)

    Private Sub HashSets_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        shortList.Add("AA-10-100")
        shortList.Add("BB-20-100")
        shortList.Add("DD-40-101")
        Dim dictParts As New Dictionary(Of Integer, String) _
            From {{1, "AA-10-100"},
                  {2, "BB-20-100"},
                  {3, "CC-30-100"},
                  {4, "DD-40-100"},
                  {5, "EE-50-100"}}
        For Each kv As KeyValuePair(Of Integer, String) In dictParts
            longList.Add(kv.Value)
        Next
        'Two alternative ways to fill the hashset
        '1. remove the New from the declaration
        'longList = New HashSet(Of String)(dictParts.Values)
        '2. Added in Framework 4.7.2
        'Enumerable.ToHashSet(Of TSource) Method (IEnumerable(Of TSource))
        'longList = dictParts.Values.ToHashSet()
    End Sub

    Private Sub CompareHashSets()
        Debug.Print($"The short list has {shortList.Count} elements")
        DisplaySet(shortList)
        Debug.Print($"The long list has {longList.Count}")
        'ExceptWith modifies shortList in place: only items missing from longList remain
        shortList.ExceptWith(longList)
        Debug.Print($"The items missing from the longList {shortList.Count}")
        DisplaySet(shortList)
        'Immediate Window Results
        'The short list has 3 elements
        '{ AA-10-100
        ' BB-20-100
        ' DD-40-101
        '}
        'The long list has 5
        'The items missing from the longList 1
        '{ DD-40-101
        '}
    End Sub

    Private Shared Sub DisplaySet(ByVal coll As HashSet(Of String))
        Dim sb As New StringBuilder()
        sb.Append("{")
        For Each s As String In coll
            sb.AppendLine($" {s}")
        Next
        sb.Append("}")
        Debug.Print(sb.ToString)
    End Sub

    Private Sub btnCompare_Click(sender As Object, e As EventArgs) Handles btnCompare.Click
        CompareHashSets()
    End Sub
End Class
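Since ExceptWith changes shortList in place, a non-destructive alternative may be handy when the short list has to survive the comparison. The sketch below uses HashSet's IsSubsetOf and Overlaps methods, which map directly onto the "all items exist" and "some items exist" questions. The module name and the hard-coded sets are illustrative only.

```vb
Imports System
Imports System.Collections.Generic

Module SubsetSketch ' hypothetical name, for illustration only
    Sub Main()
        Dim longList As New HashSet(Of String) From {
            "AA-10-100", "BB-20-100", "CC-30-100", "DD-40-100", "EE-50-100"}
        Dim shortList As New HashSet(Of String) From {
            "AA-10-100", "BB-20-100", "DD-40-101"}

        ' True only when every element of shortList is also in longList.
        Dim allExist As Boolean = shortList.IsSubsetOf(longList)
        ' True when the two sets share at least one element.
        Dim someExist As Boolean = shortList.Overlaps(longList)

        Console.WriteLine($"All exist: {allExist}, some exist: {someExist}")
        ' Neither call modifies shortList.
        Console.WriteLine($"shortList still has {shortList.Count} elements")
    End Sub
End Module
```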
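Note: If the dictionary contains duplicate values (not duplicate keys, duplicate values), the hash set filled from it will hold fewer elements than the dictionary, because elements in a hash set must be unique. Add simply returns False for a value that is already present, so the fill code still runs and membership tests are unaffected.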
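A quick way to see what duplicate values do while the set is being filled is the sketch below. The data is made up so that two keys share one value, and the module name is hypothetical.

```vb
Imports System
Imports System.Collections.Generic

Module DuplicateValuesSketch ' hypothetical name, for illustration only
    Sub Main()
        ' Two different keys share the value "AA-10-100".
        Dim dictParts As New Dictionary(Of Integer, String) From {
            {1, "AA-10-100"},
            {2, "AA-10-100"},
            {3, "CC-30-100"}}

        Dim longList As New HashSet(Of String)
        For Each kv As KeyValuePair(Of Integer, String) In dictParts
            ' Add returns False, and leaves the set unchanged, for a value it already holds.
            Dim added As Boolean = longList.Add(kv.Value)
            Console.WriteLine($"{kv.Value}: added = {added}")
        Next

        ' The constructor overload drops duplicates in the same way.
        Dim fromValues As New HashSet(Of String)(dictParts.Values)
        Console.WriteLine($"{dictParts.Count} dictionary entries, {fromValues.Count} set elements")
    End Sub
End Module
```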
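I ran a test with a HashSet against the original Dictionary containing 350,000 values, with the matching items added to the Dictionary last. The HashSet was over 15,000 times faster.
Testing against the original Dictionary: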
Dim AllInDict = arrOut.All(Function(a) dictParts.ContainsValue(a))
Dim SomeInDict = arrOut.Any(Function(a) dictParts.ContainsValue(a))
Creating the HashSet does take about as long as four Dictionary searches, so it is not worthwhile if you change the Dictionary more often than once every four searches.
Dim hs = New HashSet(Of String)(dictParts.Values)
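Then you can use the HashSet to test membership, which is at least 14,000 times faster than searching the entire Dictionary (of course, an average search scans only about half of the Dictionary, so the typical gain is about 50% of that).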
Dim AllInDict2 = arrOut.All(Function(a) hs.Contains(a))
Dim SomeInDict2 = arrOut.Any(Function(a) hs.Contains(a))
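One more option, under an assumption the sample code does not confirm: if the strings in arrOut are actually the keys of dictParts, as the Dictionary(Of String, System.Data.DataRow) declaration in the question suggests, then no extra HashSet is needed, because ContainsKey is a hashed lookup while ContainsValue is a linear scan. A minimal sketch follows. The table, its "PartNo" column, and the module name are invented for illustration.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Module KeyLookupSketch ' hypothetical name, for illustration only
    Sub Main()
        ' Stand-in for the real data source; the column name is an assumption.
        Dim table As New DataTable()
        table.Columns.Add("PartNo", GetType(String))

        Dim dictParts As New Dictionary(Of String, DataRow)
        For Each partNo As String In {"AA-10-100", "BB-20-100", "CC-30-100"}
            Dim row As DataRow = table.NewRow()
            row("PartNo") = partNo
            table.Rows.Add(row)
            dictParts.Add(partNo, row)
        Next

        Dim arrOut() As String = {"AA-10-100", "DD-40-101"}

        ' Each ContainsKey call is a hash lookup, so the cost depends on the
        ' size of arrOut rather than on the 350,000 dictionary entries.
        Dim allInDict As Boolean = arrOut.All(Function(a) dictParts.ContainsKey(a))
        Dim someInDict As Boolean = arrOut.Any(Function(a) dictParts.ContainsKey(a))

        Console.WriteLine($"All: {allInDict}, Some: {someInDict}")
    End Sub
End Module
```

If the strings are values rather than keys, the HashSet built from dictParts.Values remains the way to go.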