Removing duplicate words from a text file



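I have a text file containing nearly 45,000 words, one word per line. Thousands of these words appear more than 10 times. I want to create a new file with no duplicate words. I used a StreamReader, but it only reads the file once. How can I get rid of the duplicate words? Please help. Thanks. My code looks like this: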

Try
    File.OpenText(TextBox1.Text)
Catch ex As Exception
    MsgBox(ex.Message)
    Exit Sub
End Try
Dim line As String = String.Empty
Dim OldLine As String = String.Empty
Dim sr = File.OpenText(TextBox1.Text)
line = sr.ReadLine
OldLine = line
Do While sr.Peek <> -1
    Application.DoEvents()
    line = sr.ReadLine
    If OldLine <> line Then
        My.Computer.FileSystem.WriteAllText(My.Computer.FileSystem.SpecialDirectories.Desktop & "Splitted File without Repeats.txt", line & vbCrLf, True)
    End If
    OldLine = line
Loop

sr.Close()
System.Diagnostics.Process.Start(My.Computer.FileSystem.SpecialDirectories.Desktop & "Splitted File without Repeats.txt")
MsgBox("Loop terminated. Stream Reader Closed." & vbCrLf)

You can use LINQ's Distinct() method.

This will work for smaller files:

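' Requires Imports System.IO and Imports System.Linq at the top of the file.
' Note: this overwrites yourfile.txt in place with the de-duplicated lines.
Dim lines As String() = File.ReadAllLines("yourfile.txt")
File.WriteAllLines("yourfile.txt", lines.Distinct().ToArray())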
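
The loop in the question compares each line only with the line immediately before it, so it can drop a repeat only when the duplicates are adjacent. For larger files, or when the result should go to a new file as the question asks, one option is to stream the input and remember every word already seen in a HashSet(Of String). The following is a minimal sketch under that assumption, not a definitive implementation: the module name, the function name WriteDistinctLines, and the file names are hypothetical, and the comparison is case-sensitive.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO

Module DedupeSketch
    ' Copies inputPath to outputPath, keeping only the first occurrence of
    ' each line. Returns the number of lines written.
    ' (Hypothetical helper, named for illustration only.)
    Function WriteDistinctLines(inputPath As String, outputPath As String) As Integer
        ' HashSet.Add returns False when the value is already in the set.
        Dim seen As New HashSet(Of String)(StringComparer.Ordinal)
        Dim written As Integer = 0
        Using writer As New StreamWriter(outputPath)
            ' File.ReadLines reads lazily, so the whole file is never held
            ' in memory at once; only the set of distinct words is.
            For Each line As String In File.ReadLines(inputPath)
                If seen.Add(line) Then
                    writer.WriteLine(line)
                    written += 1
                End If
            Next
        End Using
        Return written
    End Function

    Sub Main()
        ' Hypothetical sample files in the temp folder.
        Dim source As String = Path.Combine(Path.GetTempPath(), "words.txt")
        Dim target As String = Path.Combine(Path.GetTempPath(), "words-unique.txt")
        File.WriteAllLines(source, New String() {"apple", "pear", "apple", "fig", "pear"})
        Dim count As Integer = WriteDistinctLines(source, target)
        Console.WriteLine("{0} unique lines written", count)
    End Sub
End Module
```

The output keeps the words in their original order. If "Apple" and "apple" should count as the same word, passing StringComparer.OrdinalIgnoreCase to the HashSet constructor is one way to do that.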
