如何通过Powershell从在线ZIP档案中只下载一个文件



我只想通过Powershell从在线ZIP存档中下载一个文件。为此,我创建了一个已经可以使用的演示代码,但我仍在努力在ZIP目录中获得正确的解析逻辑。这是我迄今为止的代码:

# demo code downloading a single DLL file from an online ZIP archive
# and extracting the DLL into memory for mounting it to the main process.
cls
Remove-Variable * -ea 0
# definition for the ZIP archive, the file to be extracted and the checksum:
$url = 'https://github.com/sshnet/SSH.NET/releases/download/2020.0.1/SSH.NET-2020.0.1-bin.zip'
$sub = 'net40/Renci.SshNet.dll'
$md5 = '5B1AF51340F333CD8A49376B13AFCF9C'
# prepare HTTP client:
Add-Type -AssemblyName System.Net.Http
$handler = [System.Net.Http.HttpClientHandler]::new()
$client  = [System.Net.Http.HttpClient]::new($handler)
# get the length of the ZIP archive:
$req = [System.Net.HttpWebRequest]::Create($url)
$req.Method = 'HEAD'
$length = $req.GetResponse().ContentLength
$zip = [byte[]]::new($length)
# get the last 10k:
# how to get the correct length of the central ZIP directory here?
$start = $length-10kb
$end   = $length-1
$client.DefaultRequestHeaders.Add('Range', "bytes=$start-$end")
$result = $client.GetAsync($url).Result
$last10kb = $result.content.ReadAsByteArrayAsync().Result
$last10kb.CopyTo($zip, $start)
# get the block containing the DLL file:
# how to get the exact file-offset from the ZIP directory?
$start = $length-3537kb
$end   = $length-3201kb
$client.DefaultRequestHeaders.Clear()
$client.DefaultRequestHeaders.Add('Range', "bytes=$start-$end")
$result = $client.GetAsync($url).Result
$block = $result.content.ReadAsByteArrayAsync().Result
$block.CopyTo($zip, $start)
# extract the DLL file from archive:
Add-Type -AssemblyName System.IO.Compression
$stream = [System.IO.Memorystream]::new()
$stream.Write($zip,0,$zip.Length)
$archive = [System.IO.Compression.ZipArchive]::new($stream)
$entry = $archive.GetEntry($sub)
$bytes = [byte[]]::new($entry.Length)
[void]$entry.Open().Read($bytes, 0, $bytes.Length)
# check MD5:
$prov = [Security.Cryptography.MD5CryptoServiceProvider]::new().ComputeHash($bytes)
$hash = [string]::Concat($prov.foreach{$_.ToString("x2")})
if ($hash -ne $md5) {write-host 'dll has wrong checksum.' -f y ;break}
# load the DLL:
[void][System.Reflection.Assembly]::Load($bytes)
# use the single demo-call from the DLL:
$test = [Renci.SshNet.NoneAuthenticationMethod]::new('test')
'done.'

这段代码中唯一的开放点是识别ZIP存档末尾的中心目录长度的正确方法,以及如何为要提取的单个文件获得正确的文件偏移量(在我的代码中,我只是通过纯粹的尝试和错误找到了范围(。

我已经查看了此wikihttps://en.wikipedia.org/wiki/ZIP_(file_format(#结构以及PKWARE定义https://gist.github.com/steakknife/820b73ebf25146180198febdb6f0e183但除了块定义之外,我找不到一种编程方法来获得EOCD和单个文件的偏移量。有人能帮忙吗?

经过几次额外的测试,我得到了这个解决方案:

# demo code downloading a single DLL file from an online ZIP archive
# and extracting the DLL into memory to mount it finally to the main process.
cls
Remove-Variable * -ea 0
# definition for the ZIP archive, the file to be extracted and the checksum:
$url = 'https://github.com/sshnet/SSH.NET/releases/download/2020.0.1/SSH.NET-2020.0.1-bin.zip'
$sub = 'net40/Renci.SshNet.dll'
$md5 = '5B1AF51340F333CD8A49376B13AFCF9C'
'prepare HTTP client:'
Add-Type -AssemblyName System.Net.Http
$handler = [System.Net.Http.HttpClientHandler]::new()
$client  = [System.Net.Http.HttpClient]::new($handler)
'get the length of the ZIP archive:'
# dont use System.Web.HttpRequest, it is frequently hanging:
$req = [System.Net.Http.HttpRequestMessage]::new('HEAD', $url)
$result = $client.SendAsync($req).Result
$zipLength = $result.Content.Headers.ContentLength
$zip = [byte[]]::new($zipLength)
$req.Dispose()
'get the last 10k:'
$start = $zipLength-10kb
$end   = $zipLength-1
$client.DefaultRequestHeaders.Add('Range', "bytes=$start-$end")
$result = $client.GetAsync($url).Result
$last10kb = $result.content.ReadAsByteArrayAsync().Result
$last10kb.CopyTo($zip, $start)
"get the 'End of CD' block:"
$enc = [System.Text.Encoding]::GetEncoding(28591)
$end = $enc.GetString($last10kb, $last10kb.Length-256, 256)
$eocd = [regex]::Match($end, 'PKx05x06.*').value
$eocd = $enc.GetBytes($eocd)
'get the central directory:'
$cdLength = [bitconverter]::ToUInt32($eocd, 12)
$cdStart  = [bitconverter]::ToUInt32($eocd, 16)
$cd = [byte[]]::new($cdLength)
[array]::Copy($zip, $cdStart, $cd, 0, $cdLength)
'search all file headers for correct file name:'
$fileHeaders = [regex]::Split($enc.GetString($cd),'PKx01x02')
foreach ($header in $fileHeaders) {
$len = $header.Length
if ($len -ge 42) {
$bytes = $enc.GetBytes($header)
$nameLength = [bitconverter]::ToUInt16($bytes, 24)
if ($nameLength -eq $sub.length -and ($nameLength + 42) -le $len) { 
$name = $header.Substring(42, $nameLength)
if ($name -eq $sub) {
$size   = [bitconverter]::ToUInt32($bytes, 16) + 256
$start  = [bitconverter]::ToUInt32($bytes, 38)
break
}
}
}
}
if (!$start) {write-host 'we could not find file in the ZIP archive' -f y ;break}
'get the block containing the file:'
$end   = $start+$size
$client.DefaultRequestHeaders.Clear()
$client.DefaultRequestHeaders.Add('Range', "bytes=$start-$end")
$result = $client.GetAsync($url).Result
$block = $result.content.ReadAsByteArrayAsync().Result
$block.CopyTo($zip, $start)
$client.dispose()
'extract the DLL file from archive:'
Add-Type -AssemblyName System.IO.Compression
$stream = [System.IO.Memorystream]::new()
$stream.Write($zip,0,$zip.Length)
$archive = [System.IO.Compression.ZipArchive]::new($stream)
$entry = $archive.GetEntry($sub)
$bytes = [byte[]]::new($entry.Length)
[void]$entry.Open().Read($bytes, 0, $bytes.Length)
'check MD5:'
$prov = [Security.Cryptography.MD5CryptoServiceProvider]::new().ComputeHash($bytes)
$hash = [string]::Concat($prov.foreach{$_.ToString("x2")})
if ($hash -ne $md5) {write-host 'dll has wrong checksum.' -f y ;break}
'load the DLL:'
[void][System.Reflection.Assembly]::Load($bytes)
'use the single demo-call from the DLL:'
$test = [Renci.SshNet.NoneAuthenticationMethod]::new('test')
'done.'

最新更新