Check whether all items in an array exist in a Dictionary



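I have some legacy code that includes:

A Dictionary (dictParts) that is populated with roughly 350,000 items during startup and does not change at runtime. dictParts is a System.Collections.Generic.Dictionary(Of String, System.Data.DataRow).

Each item in dictParts is a System.Collections.Generic.KeyValuePair(Of String, System.Data.DataRow).

An Array (arrOut) that has items added and removed frequently (usually between 2 and 6 items are in the array). arrOut is a System.Array containing only Strings.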

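Every time the array changes, I need to see whether:

  • All of the items in the array exist in the index
  • Some of the items in the array exist in the index

I figured that looping through the 350,000-item index every time the array changes would be a huge performance hit, so I looked to LINQ for help.

I have tried the following: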

Private Sub btnTest_Click(sender As System.Object, e As System.EventArgs) Handles btnTest.Click
    Dim dictParts = New Dictionary(Of Integer, String) _
                    From {{1, "AA-10-100"}, _
                          {2, "BB-20-100"}, _
                          {3, "CC-30-100"}, _
                          {4, "DD-40-100"}, _
                          {5, "EE-50-100"}}

    Dim arrOut() As String = {"AA-10-100", "BB-20-100", "CC-30-100"}
    'Tried
    Dim allPartsExist As IEnumerable(Of String) = arrOut.ToString.All(dictParts)
    'And this
    Dim allOfArrayInIndex As Object = arrOut.ToString.Intersect(dictParts).Count() = arrOut.ToString.Count
End Sub

I keep getting the error: Unable to cast object of type 'System.Collections.Generic.Dictionary`2[System.Int32,System.String]' to type 'System.Collections.Generic.IEnumerable`1[System.Char]'.

Could someone please tell me where I am going wrong?
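On the error itself, one plausible reading is that arrOut.ToString returns the array's type name as a String rather than its elements. A String is an IEnumerable(Of Char), so LINQ then tries to treat dictParts as a sequence of characters. The sketch below illustrates this and shows one way the two checks might be written against dictParts.Values. The module name is hypothetical, and the sketch assumes the part numbers are the dictionary's values, as in the test code above.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module CastErrorSketch
    Sub Main()
        Dim dictParts As New Dictionary(Of Integer, String) From {
            {1, "AA-10-100"}, {2, "BB-20-100"}, {3, "CC-30-100"},
            {4, "DD-40-100"}, {5, "EE-50-100"}}
        Dim arrOut() As String = {"AA-10-100", "BB-20-100", "CC-30-100"}

        ' ToString on an array yields its type name, not its contents.
        Console.WriteLine(arrOut.ToString()) ' prints System.String[]

        ' Query the array itself and compare against the dictionary's values.
        Dim allPartsExist As Boolean =
            arrOut.All(Function(p) dictParts.Values.Contains(p))

        ' Intersect returns distinct items, so compare with the distinct count.
        Dim allOfArrayInIndex As Boolean =
            arrOut.Intersect(dictParts.Values).Count() = arrOut.Distinct().Count()

        Console.WriteLine(allPartsExist)     ' prints True
        Console.WriteLine(allOfArrayInIndex) ' prints True
    End Sub
End Module
```

Both checks still scan the values linearly, so this only addresses the cast error, not the performance concern.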

In order to learn something, I tried the HashSet suggested by @emsimpson92. Maybe it can work for you.

Imports System.Text

Public Class HashSets
    Private shortList As New HashSet(Of String)
    Private longList As New HashSet(Of String)

    Private Sub HashSets_Load(sender As Object, e As EventArgs) Handles MyBase.Load
        shortList.Add("AA-10-100")
        shortList.Add("BB-20-100")
        shortList.Add("DD-40-101")
        Dim dictParts As New Dictionary(Of Integer, String) _
            From {{1, "AA-10-100"},
                  {2, "BB-20-100"},
                  {3, "CC-30-100"},
                  {4, "DD-40-100"},
                  {5, "EE-50-100"}}
        For Each kv As KeyValuePair(Of Integer, String) In dictParts
            longList.Add(kv.Value)
        Next
        'Two alternative ways to fill the hashset
        '1. remove the New from the declaration
        'longList = New HashSet(Of String)(dictParts.Values)
        '2. Added in Framework 4.7.2
        'Enumerable.ToHashSet(Of TSource) Method (IEnumerable(Of TSource))
        'longList = dictParts.Values.ToHashSet()
    End Sub

    Private Sub CompareHashSets()
        Debug.Print($"The short list has {shortList.Count} elements")
        DisplaySet(shortList)
        Debug.Print($"The long list has {longList.Count}")
        'ExceptWith removes from shortList, in place, every item that is also in longList
        shortList.ExceptWith(longList)
        Debug.Print($"The items missing from the longList {shortList.Count}")
        DisplaySet(shortList)
        'Immediate Window Results
        'The short list has 3 elements
        '{ AA-10-100
        ' BB-20-100
        ' DD-40-101
        '}
        'The long list has 5
        'The items missing from the longList 1
        '{ DD-40-101
        '}
    End Sub

    Private Shared Sub DisplaySet(ByVal coll As HashSet(Of String))
        Dim sb As New StringBuilder()
        sb.Append("{")
        For Each s As String In coll
            sb.AppendLine($" {s}")
        Next
        sb.Append("}")
        Debug.Print(sb.ToString)
    End Sub

    Private Sub btnCompare_Click(sender As Object, e As EventArgs) Handles btnCompare.Click
        CompareHashSets()
    End Sub
End Class

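Note: if the dictionary contains duplicate values (not duplicate keys, duplicate values), the hash set filled from it will hold each value only once, because elements in a hash set must be unique. HashSet.Add ignores a duplicate and returns False rather than throwing, so the set's Count can be smaller than the dictionary's. Membership tests are not affected.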
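A minimal sketch of how HashSet treats duplicate values when it is filled from a dictionary. The data and the module name are hypothetical; two keys deliberately share one value.

```vb
Imports System
Imports System.Collections.Generic

Module DuplicateValuesSketch
    Sub Main()
        ' Hypothetical data: keys 1 and 2 share the value "AA-10-100".
        Dim dictParts As New Dictionary(Of Integer, String) From {
            {1, "AA-10-100"},
            {2, "AA-10-100"},
            {3, "BB-20-100"}}

        Dim longList As New HashSet(Of String)
        For Each kv As KeyValuePair(Of Integer, String) In dictParts
            ' Add returns False (and does not throw) if the value is already present.
            Dim added As Boolean = longList.Add(kv.Value)
            Console.WriteLine($"{kv.Key} -> {kv.Value}: added = {added}")
        Next

        Console.WriteLine(longList.Count) ' prints 2
    End Sub
End Module
```

For a pure membership check this is harmless: a part number is either in the set or not, however many rows carried it.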

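Running a test with a HashSet against the original Dictionary containing 350,000 values, with the matching items added last to the Dictionary, the HashSet was over 15,000 times faster.

Testing against the original Dictionary: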

Dim AllInDict = arrOut.All(Function(a) dictParts.ContainsValue(a))
Dim SomeInDict = arrOut.Any(Function(a) dictParts.ContainsValue(a))

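Creating the HashSet does take about as long as four Dictionary searches, so if you change the Dictionary more often than once every four searches, it is not worth it.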
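A rough way to check these figures on your own machine is sketched below. The data is synthetic (the "PART-n" values and the module name are made up) and a single Stopwatch run is noisy, so treat the output as indicative only.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Diagnostics
Imports System.Linq

Module TimingSketch
    Sub Main()
        ' Synthetic stand-in for the 350,000-entry dictionary.
        Dim dictParts As New Dictionary(Of Integer, String)
        For i As Integer = 1 To 350000
            dictParts.Add(i, $"PART-{i}")
        Next

        ' Worst case for ContainsValue: the matching values were added last.
        Dim arrOut() As String = {"PART-349999", "PART-350000"}

        Dim sw As Stopwatch = Stopwatch.StartNew()
        Dim allInDict As Boolean = arrOut.All(Function(a) dictParts.ContainsValue(a))
        sw.Stop()
        Console.WriteLine($"ContainsValue search: {sw.Elapsed.TotalMilliseconds} ms ({allInDict})")

        sw.Restart()
        Dim hs As New HashSet(Of String)(dictParts.Values)
        sw.Stop()
        Console.WriteLine($"HashSet creation:     {sw.Elapsed.TotalMilliseconds} ms")

        sw.Restart()
        Dim allInSet As Boolean = arrOut.All(Function(a) hs.Contains(a))
        sw.Stop()
        Console.WriteLine($"HashSet search:       {sw.Elapsed.TotalMilliseconds} ms ({allInSet})")
    End Sub
End Module
```

Repeating each measurement in a loop and averaging would give steadier numbers than one pass.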

Dim hs = New HashSet(Of String)(dictParts.Values)

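You can then use the HashSet to test membership, which is at least 14,000 times faster than searching the whole Dictionary (on average, of course, the gain will be about 50% of that).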

Dim AllInDict2 = arrOut.All(Function(a) hs.Contains(a))
Dim SomeInDict2 = arrOut.Any(Function(a) hs.Contains(a))
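One more option, offered as an assumption rather than something the question confirms: the real dictParts is declared as Dictionary(Of String, DataRow), so if the strings in arrOut are that dictionary's keys, ContainsKey is already a hash lookup and no extra HashSet is needed. The sketch below builds a small stand-in table. The "PartNo" column name and the module name are hypothetical.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq

Module ContainsKeySketch
    Sub Main()
        ' Hypothetical stand-in for the real data source.
        Dim table As New DataTable()
        table.Columns.Add("PartNo", GetType(String))

        ' Same shape as the question's dictionary: part number -> DataRow.
        Dim dictParts As New Dictionary(Of String, DataRow)
        For Each partNo As String In {"AA-10-100", "BB-20-100", "CC-30-100"}
            dictParts.Add(partNo, table.Rows.Add(New Object() {partNo}))
        Next

        Dim arrOut() As String = {"AA-10-100", "DD-40-101"}

        ' ContainsKey hashes the key instead of scanning every entry.
        Dim allInDict As Boolean = arrOut.All(Function(a) dictParts.ContainsKey(a))
        Dim someInDict As Boolean = arrOut.Any(Function(a) dictParts.ContainsKey(a))

        Console.WriteLine(allInDict)  ' prints False
        Console.WriteLine(someInDict) ' prints True
    End Sub
End Module
```

If the array holds values from a DataRow column rather than the keys, the HashSet approach above still applies.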
