VB.NET - Download a zip in memory and extract the files from memory to disk



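I'm having some trouble with this, despite finding examples. I think it may be an encoding problem, but I'm not sure. I'm trying to download a file from an HTTPS server that uses cookies (hence I'm using HttpWebRequest). I'm debug-printing the capacity of the streams to check, but the output [raw] file looks different. I've tried other encodings to no avail.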

Code:

Sub downloadzip(strURL As String, strDestDir As String)
    Dim request As HttpWebRequest
    Dim response As HttpWebResponse
    request = Net.HttpWebRequest.Create(strURL)
    request.UserAgent = strUserAgent
    request.Method = "GET"
    request.CookieContainer = cookieJar
    response = request.GetResponse()
    If response.ContentType = "application/zip" Then
        Debug.WriteLine("Is Zip")
    Else
        Debug.WriteLine("Is NOT Zip: is " + response.ContentType.ToString)
        Exit Sub
    End If
    Dim intLen As Int64 = response.ContentLength
    Debug.WriteLine("response length: " + intLen.ToString)
    Using srStreamRemote As StreamReader = New StreamReader(response.GetResponseStream(), Encoding.Default)
        'Using ms As New MemoryStream(intLen)
        Dim fullfile As String = srStreamRemote.ReadToEnd
        Dim memstream As MemoryStream = New MemoryStream(New UnicodeEncoding().GetBytes(fullfile))
        'test write out to file
        Dim data As Byte() = memstream.ToArray()
        Using filestrm As FileStream = New FileStream("c:\temp\debug.zip", FileMode.Create)
            filestrm.Write(data, 0, data.Length)
        End Using
        Debug.WriteLine("Memstream capacity " + memstream.Capacity.ToString)
        'Dim strData As String = srStreamRemote.ReadToEnd
        memstream.Seek(0, 0)
        Dim buffer As Byte() = New Byte(2048) {}
        Using zip As New ZipInputStream(memstream)
            Debug.WriteLine("zip stream cap " + zip.Length.ToString)
            zip.Seek(0, 0)
            Dim e As ZipEntry
            Dim flag As Boolean = True
            Do While flag ' daft, but won't assign e=zip... tries to evaluate
                e = zip.GetNextEntry
                If IsNothing(e) Then
                    flag = False
                    Exit Do
                Else
                    e.UseUnicodeAsNecessary = True
                End If
                If Not e.IsDirectory Then
                    Debug.WriteLine("Writing out " + e.FileName)
                    '    e.Extract(strDestDir)
                    Using output As FileStream = File.Open(Path.Combine(strDestDir, e.FileName), _
                                                          FileMode.Create, FileAccess.ReadWrite)
                        Dim n As Integer
                        Do While (n = zip.Read(buffer, 0, buffer.Length) > 0)
                            output.Write(buffer, 0, n)
                        Loop
                    End Using
                End If
            Loop
        End Using
        'End Using
    End Using 'srStreamRemote.Close()
    response.Close()
End Sub

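So I download a file of the correct size, but DotNetZip does not recognise it, and the file that gets copied out is an incomplete/invalid zip. I've spent most of today on this and I'm ready to give up.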

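I think the answer is to break the problem down, and perhaps change a few aspects of your code.

For example, let's get rid of converting the response stream to a string. A zip file is binary data, and decoding it as text and then re-encoding it can change the bytes: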

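Dim memStream As MemoryStream
Using rdr As System.IO.Stream = response.GetResponseStream()
    ' Assumes the server sent a Content-Length header (ContentLength is -1 otherwise).
    Dim count As Integer = Convert.ToInt32(response.ContentLength)
    ' VB array declarations take the upper bound, so count - 1 gives exactly count bytes.
    Dim buffer = New Byte(count - 1) {}
    Dim bytesRead As Integer = 0
    Do While bytesRead < count
        Dim n As Integer = rdr.Read(buffer, bytesRead, count - bytesRead)
        If n = 0 Then Exit Do ' the stream ended early; avoid looping forever
        bytesRead += n
    Loop
    ' Wrap only the bytes that were actually read.
    memStream = New MemoryStream(buffer, 0, bytesRead)
End Using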
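
To see why the string round trip is worth removing, here is a minimal sketch of what decoding and re-encoding does to binary data. The module and function names are made up for illustration, and `Encoding.UTF8` stands in for `Encoding.Default`, whose actual code page depends on the machine and runtime.

```vbnet
Imports System
Imports System.Text

Module EncodingRoundTripDemo
    ' Hypothetical helper: decodes bytes as text, then re-encodes the text
    ' as UTF-16, mirroring the StreamReader + UnicodeEncoding path in the question.
    Function RoundTrip(original As Byte()) As Byte()
        Dim asText As String = Encoding.UTF8.GetString(original)
        Return New UnicodeEncoding().GetBytes(asText)
    End Function

    Sub Main()
        ' "PK" 03 04 is the start of a zip local file header;
        ' the last two bytes are arbitrary non-text values.
        Dim original As Byte() = {&H50, &H4B, &H3, &H4, &HFF, &H80}
        Dim roundTripped As Byte() = RoundTrip(original)

        Console.WriteLine("original length:      " & original.Length)
        Console.WriteLine("round-tripped length: " & roundTripped.Length)
        Console.WriteLine("original:      " & BitConverter.ToString(original))
        Console.WriteLine("round-tripped: " & BitConverter.ToString(roundTripped))
    End Sub
End Module
```

The second array comes out longer than the first, because UTF-16 emits two bytes per character, and the non-text bytes are not preserved. A zip reader cannot make sense of data that has been through this.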

Next, there is a simpler way to write the contents of a memory stream to a file. Consider that your code

Dim data As Byte() = memstream.ToArray()
Using filestrm As FileStream = New FileStream("c:\temp\debug.zip", FileMode.Create)
    filestrm.Write(data, 0, data.Length)
End Using

can be replaced with

Using filestrm As FileStream = New FileStream("c:\temp\debug.zip", FileMode.Create)
    memstream.WriteTo(filestrm)
End Using

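This removes the need to copy the memory stream into another byte array and then push that byte array down the stream, when the memory stream can write its data straight to the file (via the file stream), saving the intermediary buffer.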
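
As a self-contained illustration of `MemoryStream.WriteTo`, the sketch below writes a few bytes to a temporary file and reports the file size. The module name, the helper and the sample bytes are invented for the demo; a temp file is used instead of the fixed path from the question.

```vbnet
Imports System
Imports System.IO

Module WriteToDemo
    ' Hypothetical helper: dumps a MemoryStream to a new temp file
    ' and returns the path of that file.
    Function DumpToTempFile(ms As MemoryStream) As String
        Dim tempPath As String = Path.GetTempFileName()
        Using fs As New FileStream(tempPath, FileMode.Create)
            ' WriteTo copies the whole stream, regardless of ms.Position.
            ms.WriteTo(fs)
        End Using
        Return tempPath
    End Function

    Sub Main()
        Dim payload As Byte() = {&H50, &H4B, &H5, &H6}
        Using ms As New MemoryStream(payload)
            Dim tempPath As String = DumpToTempFile(ms)
            Console.WriteLine("wrote {0} bytes to {1}", New FileInfo(tempPath).Length, tempPath)
            File.Delete(tempPath)
        End Using
    End Sub
End Module
```

Because `WriteTo` ignores the current position, there is no need to seek back to the start of the stream before dumping it.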

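I admit I haven't used the zip/compression library you are using, but with the changes above you have removed unnecessary transfers between streams, byte arrays, strings and so on, and have hopefully eliminated the encoding problem you were having.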

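Give it a try and let us know how you get on. Consider trying to open the file you saved ("C:\temp\debug.zip") to see whether it is reported as corrupt. If it is not, then you know that at least up to that point your code is working.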

I thought I should post the full working solution to my own question. It combines the two excellent answers I received. Thank you both.

Sub downloadzip(strURL As String, strDestDir As String)
    Try
        Dim request As HttpWebRequest
        Dim response As HttpWebResponse
        request = Net.HttpWebRequest.Create(strURL)
        request.UserAgent = strUserAgent
        request.Method = "GET"
        request.CookieContainer = cookieJar
        response = request.GetResponse()
        If response.ContentType = "application/zip" Then
            Debug.WriteLine("Is Zip")
        Else
            Debug.WriteLine("Is NOT Zip: is " + response.ContentType.ToString)
            Exit Sub
        End If
        ' Assumes a Content-Length header is present and the size fits in an Int32.
        Dim intLen As Int32 = Convert.ToInt32(response.ContentLength)
        Debug.WriteLine("response length: " + intLen.ToString)
        Dim memStream As MemoryStream
        Using stmResponse As IO.Stream = response.GetResponseStream()
            ' New Byte(n) allocates n + 1 elements, so use intLen - 1.
            Dim buffer = New Byte(intLen - 1) {}
            Dim bytesRead As Integer
            Do While bytesRead < intLen
                Dim n As Integer = stmResponse.Read(buffer, bytesRead, intLen - bytesRead)
                If n = 0 Then Exit Do ' the stream ended early
                bytesRead += n
            Loop
            memStream = New MemoryStream(buffer, 0, bytesRead)
            Dim res As Boolean = ZipExtracttoFile(memStream, strDestDir)
        End Using
        response.Close()

    Catch ex As Exception
        'to do :)
    End Try
End Sub

Function ZipExtracttoFile(strm As MemoryStream, strDestDir As String) As Boolean
    Try
        Using zip As ZipFile = ZipFile.Read(strm)
            For Each e As ZipEntry In zip
                e.Extract(strDestDir)
            Next
        End Using
    Catch ex As Exception
        Return False
    End Try
    Return True
End Function

You can download into a MemoryStream and then inspect it:

Public Sub Download(url as String)
    Dim req As HttpWebRequest = System.Net.WebRequest.Create(url)
    req.Method = "GET"
    Dim resp As HttpWebResponse = req.GetResponse()
    If resp.ContentType = "application/zip" Then
        Console.Error.Write("The result is a zip file.")
        Dim length As Int64 = resp.ContentLength
        If length = -1 Then
            Console.Error.WriteLine("... length unspecified")
            length = 16 * 1024
        Else
            Console.Error.WriteLine("... has length {0}", length)
        End If
        Dim ms As New MemoryStream
        CopyStream(resp.GetResponseStream(), ms)  '' **see note below!!!!
        '' list contents of the zip file
        ms.Seek(0,SeekOrigin.Begin)
        Using zip As ZipFile = ZipFile.Read (ms)
            Dim e As ZipEntry
            Console.Error.WriteLine("Entries:")
            Console.Error.WriteLine("  {0,22}  {1,10}  {2,12}", _
                                    "Name", "compressed", "uncompressed")
            Console.Error.WriteLine("----------------------------------------------------")
            For Each e In zip
                Console.Error.WriteLine("  {0,22}  {1,10}  {2,12}", _
                                        e.FileName, _
                                        e.CompressedSize, _
                                        e.UncompressedSize)
            Next
        End Using
    Else
        Console.Error.WriteLine("The result is Not a zip file.")
        CopyStream(resp.GetResponseStream(), Console.OpenStandardOutput)
    End If
End Sub

Private Shared Sub CopyStream(input As Stream, output As Stream)
    Dim buffer(32768 - 1) As Byte
    Dim n As Int32
    Do
        n = input.Read(buffer, 0, buffer.Length)
        If n = 0 Then Exit Do
        output.Write(buffer, 0, n)
    Loop
End Sub
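
The loop shape in CopyStream matters in VB. Inside a condition, `=` is a comparison rather than an assignment, so a loop written as `Do While (n = zip.Read(buffer, 0, buffer.Length) > 0)`, as in the question, never assigns `n`. The sketch below shows one way to write the read loop with the assignment as its own statement. The module and function names are hypothetical, and MemoryStreams stand in for the real source and destination.

```vbnet
Imports System
Imports System.IO

Module ReadLoopDemo
    ' Hypothetical helper: copies source to destination and
    ' returns the number of bytes copied.
    Function CopyAll(source As Stream, destination As Stream) As Long
        Dim buffer(4096 - 1) As Byte
        Dim total As Long = 0
        ' Assign first, then test: the assignment is a separate statement.
        Dim n As Integer = source.Read(buffer, 0, buffer.Length)
        Do While n > 0
            destination.Write(buffer, 0, n)
            total += n
            n = source.Read(buffer, 0, buffer.Length)
        Loop
        Return total
    End Function

    Sub Main()
        Dim payload(10000 - 1) As Byte
        Using source As New MemoryStream(payload), target As New MemoryStream()
            Dim copied As Long = CopyAll(source, target)
            Console.WriteLine("copied {0} bytes, target holds {1}", copied, target.Length)
        End Using
    End Sub
End Module
```

The same pattern applies when the source is a zip entry stream and the destination is a FileStream.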

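Edit

Just one caveat: if the zip file is very large, I would not recommend using this code (this approach). How large is "very large"? That depends, of course. The code I suggested above downloads the file into a MemoryStream, which means the entire contents of the zip file are held in memory. If it is a 28 KB zip file, there is no problem. But if it is a 2 GB zip file, you may have a big problem.

In that case you will want to stream it to a temporary file on disk rather than into a MemoryStream. I'll leave that as an exercise for the reader.

The above will work for "reasonably sized" zip files, where "reasonable" depends on your machine configuration and application scenario.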
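
One possible shape for that exercise is sketched below, not a definitive implementation. `SaveToTempFile` is a hypothetical helper, a MemoryStream stands in for `resp.GetResponseStream()`, and `Stream.CopyTo` assumes .NET Framework 4.0 or later (on older versions, the CopyStream routine above could be used instead).

```vbnet
Imports System
Imports System.IO

Module TempFileDemo
    ' Hypothetical helper: streams the source to a temp file in chunks,
    ' so the whole download never has to sit in memory at once.
    Function SaveToTempFile(source As Stream) As String
        Dim tempPath As String = Path.GetTempFileName()
        Using target As New FileStream(tempPath, FileMode.Create, FileAccess.Write)
            source.CopyTo(target)
        End Using
        Return tempPath
    End Function

    Sub Main()
        ' Stand-in for the HTTP response stream.
        Dim fakeDownload(100000 - 1) As Byte
        Dim tempPath As String
        Using source As New MemoryStream(fakeDownload)
            tempPath = SaveToTempFile(source)
        End Using

        Console.WriteLine("saved {0} bytes to {1}", New FileInfo(tempPath).Length, tempPath)
        ' With DotNetZip, the archive could then be opened from disk
        ' via ZipFile.Read(tempPath) and extracted as shown earlier.
        File.Delete(tempPath)
    End Sub
End Module
```

Deleting the temp file after extraction keeps the temp directory from filling up over repeated downloads.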
