PowerShell 7.0: how to compute the hash of a large file that is read in chunks



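The script should copy files and compute their hashes. My goal is a function that reads the file once instead of three times (read_for_copy + read_for_hash + read_for_another_copy), to minimize network load. So I tried reading the file one chunk at a time, computing the MD5 hash, and writing the chunk to several destinations. File sizes may range from 100 MB to 2 TB, or even more. There is no need to verify that the copies are identical at this point; only the hash of the original file is needed.

I am stuck on computing the hash: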

$ifile = "C:\Users\User\Desktop\inputfile"
$ofile = "C:\Users\User\Desktop\outputfile_1"
$ofile2 = "C:\Users\User\Desktop\outputfile_2"

$md5 = New-Object -TypeName System.Security.Cryptography.MD5CryptoServiceProvider
$bufferSize = 10mb
$stream = [System.IO.File]::OpenRead($ifile)
$makenew = [System.IO.File]::OpenWrite($ofile)
$makenew2 = [System.IO.File]::OpenWrite($ofile2)
$buffer = New-Object Byte[] $bufferSize

while ( $stream.Position -lt $stream.Length ) {

    $bytesRead = $stream.Read($buffer, 0, $bufferSize)
    $makenew.Write($buffer, 0, $bytesRead)
    $makenew2.Write($buffer, 0, $bytesRead)

    # I am stuck here
    $hash = [System.BitConverter]::ToString($md5.ComputeHash($buffer)) -replace "-",""

}

$stream.Close()
$makenew.Close()
$makenew2.Close()

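How do I feed the chunks into a hash computation that covers the whole file?

A bonus question: is it possible to compute the hash and write out the data in parallel? Especially given that PowerShell 6 does not support workflow { parallel { } }.

Thanks a lot.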

If you want to handle the input buffering manually, you need to use the TransformBlock/TransformFinalBlock methods exposed by $md5:

while ($bytesRead = $stream.Read($buffer, 0, $bufferSize))
{
    # Write to file copies
    $makenew.Write($buffer, 0, $bytesRead)
    $makenew2.Write($buffer, 0, $bytesRead)
    # Feed next chunk to MD5 CSP
    $null = $md5.TransformBlock($buffer, 0, $bytesRead, $null, 0)
}
# Complete the hashing routine (discard the returned empty array)
$null = $md5.TransformFinalBlock([byte[]]::new(0), 0, 0)
# Grab hash value from CSP
$hash = [BitConverter]::ToString($md5.Hash).Replace('-','')
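
Since the question asks for a function, the loop above could be wrapped roughly as follows. This is a minimal sketch rather than a definitive implementation: the function name Copy-FileWithHash and its parameters are hypothetical, it assumes full paths are passed (the .NET file APIs resolve relative paths against the process working directory, not the PowerShell location), and it swaps in MD5.Create() and File.Create() for the MD5CryptoServiceProvider and OpenWrite calls used in the thread.

```powershell
function Copy-FileWithHash {
    param(
        [Parameter(Mandatory)] [string]   $Path,         # source file (full path assumed)
        [Parameter(Mandatory)] [string[]] $Destination,  # one or more target files
        [int] $BufferSize = 1MB
    )

    $md5     = [System.Security.Cryptography.MD5]::Create()
    $source  = $null
    $targets = [System.Collections.Generic.List[System.IO.FileStream]]::new()
    try {
        $source = [System.IO.File]::OpenRead($Path)
        foreach ($d in $Destination) {
            # File.Create truncates an existing file; OpenWrite would leave old trailing bytes
            $targets.Add([System.IO.File]::Create($d))
        }
        $buffer = [byte[]]::new($BufferSize)

        # Read returns 0 at end of file, which ends the loop
        while (($bytesRead = $source.Read($buffer, 0, $buffer.Length)) -gt 0) {
            foreach ($t in $targets) { $t.Write($buffer, 0, $bytesRead) }
            # Feed the same chunk to the running MD5 state
            $null = $md5.TransformBlock($buffer, 0, $bytesRead, $null, 0)
        }
        $null = $md5.TransformFinalBlock([byte[]]::new(0), 0, 0)

        # Emit the hex digest of the whole source file
        [System.BitConverter]::ToString($md5.Hash).Replace('-', '')
    }
    finally {
        # Close everything even if a read or write throws
        foreach ($t in $targets) { $t.Dispose() }
        if ($source) { $source.Dispose() }
        $md5.Dispose()
    }
}
```

It would be called as $hash = Copy-FileWithHash -Path $ifile -Destination $ofile, $ofile2. The try/finally block is there so that a failure halfway through a multi-terabyte copy does not leave file handles open.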

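"My goal is a function that reads the file once instead of three times (read_for_copy + read_for_hash + read_for_another_copy), to minimize network load"

I'm not entirely sure what you mean by network load. If the source file is on a remote file share but the new copies go to the local file system, you can minimize network load by copying the source file once and then using that local copy as the source for both the second copy and the hash computation: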

$ifile = "\\remoteMachine\c$\Users\User\Desktop\inputfile"
$ofile = "C:\Users\User\Desktop\outputfile_1"
$ofile2 = "C:\Users\User\Desktop\outputfile_2"

# Copy remote -> local
Copy-Item -Path $ifile -Destination $ofile
# Copy local -> local
Copy-Item -Path $ofile -Destination $ofile2
# Hash local file stream
$md5 = New-Object -TypeName System.Security.Cryptography.MD5CryptoServiceProvider
$stream = [System.IO.File]::OpenRead($ofile)
try {
    $hash = [BitConverter]::ToString($md5.ComputeHash($stream)).Replace('-','')
}
finally {
    # Release the file handle once hashing is done
    $stream.Close()
}

FWIW, passing the file stream object directly to $md5.ComputeHash($stream) may well be faster than buffering the input manually.
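
Whichever variant is used, the digest should come out the same. The following sketch checks that on a scratch file by hashing it three ways: whole-stream ComputeHash, chunked TransformBlock, and the built-in Get-FileHash cmdlet. The cmdlet is not mentioned in the thread and is brought in here only as an independent reference; the scratch file, its size and the 64 KB chunk size are arbitrary assumptions.

```powershell
# Hypothetical scratch file, used only for this comparison
$path  = [System.IO.Path]::GetTempFileName()
$bytes = [byte[]]::new(300000)
[System.Random]::new(42).NextBytes($bytes)
[System.IO.File]::WriteAllBytes($path, $bytes)

# 1) Hash the whole stream in one call
$md5    = [System.Security.Cryptography.MD5]::Create()
$stream = [System.IO.File]::OpenRead($path)
try {
    $streamHash = [System.BitConverter]::ToString($md5.ComputeHash($stream)).Replace('-', '')
}
finally { $stream.Dispose(); $md5.Dispose() }

# 2) Hash chunk by chunk with TransformBlock/TransformFinalBlock
$md5    = [System.Security.Cryptography.MD5]::Create()
$stream = [System.IO.File]::OpenRead($path)
$buffer = [byte[]]::new(64KB)
try {
    while (($bytesRead = $stream.Read($buffer, 0, $buffer.Length)) -gt 0) {
        $null = $md5.TransformBlock($buffer, 0, $bytesRead, $null, 0)
    }
    $null = $md5.TransformFinalBlock([byte[]]::new(0), 0, 0)
    $chunkHash = [System.BitConverter]::ToString($md5.Hash).Replace('-', '')
}
finally { $stream.Dispose(); $md5.Dispose() }

# 3) Reference value from the built-in cmdlet
$cmdletHash = (Get-FileHash -Path $path -Algorithm MD5).Hash

# All three values should be equal
($streamHash -eq $chunkHash) -and ($chunkHash -eq $cmdletHash)

Remove-Item $path
```

Note that this only confirms the hashes agree; it says nothing about which variant is faster, which would have to be measured on the actual storage and network.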

Final listing

$ifile = "C:\Users\User\Desktop\inputfile"
$ofile = "C:\Users\User\Desktop\outputfile_1"
$ofile2 = "C:\Users\User\Desktop\outputfile_2"
$md5 = New-Object -TypeName System.Security.Cryptography.MD5CryptoServiceProvider
$bufferSize = 1mb
$stream = [System.IO.File]::OpenRead($ifile)
$makenew = [System.IO.File]::OpenWrite($ofile)
$makenew2 = [System.IO.File]::OpenWrite($ofile2)
$buffer = New-Object Byte[] $bufferSize
while ( $stream.Position -lt $stream.Length )
{
    $bytesRead = $stream.Read($buffer, 0, $bufferSize)
    $makenew.Write($buffer, 0, $bytesRead)
    $makenew2.Write($buffer, 0, $bytesRead)

    # Feed the chunk to the running hash; the return value is not needed
    $null = $md5.TransformBlock($buffer, 0, $bytesRead, $null, 0)
}
# Finish the hash; discard the returned empty array so it is not printed
$null = $md5.TransformFinalBlock([byte[]]::new(0), 0, 0)
$hash = [BitConverter]::ToString($md5.Hash).Replace('-','')
$hash
$stream.Close()
$makenew.Flush()
$makenew.Close()
$makenew2.Flush()
$makenew2.Close()
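
On the bonus question about doing the work in parallel, the thread does not give an answer. One possible direction, offered only as a rough sketch, is to start both writes with the streams' WriteAsync method and hash the chunk while they are in flight. The function name Copy-FileWithHashParallel and its parameters are hypothetical, full paths are assumed, and whether this is actually faster than the sequential loop has not been measured here.

```powershell
function Copy-FileWithHashParallel {
    param(
        [Parameter(Mandatory)] [string] $Path,          # source file (full path assumed)
        [Parameter(Mandatory)] [string] $Destination1,
        [Parameter(Mandatory)] [string] $Destination2,
        [int] $BufferSize = 1MB
    )

    $md5    = [System.Security.Cryptography.MD5]::Create()
    $source = $null
    $out1   = $null
    $out2   = $null
    try {
        $source = [System.IO.File]::OpenRead($Path)
        $out1   = [System.IO.File]::Create($Destination1)
        $out2   = [System.IO.File]::Create($Destination2)
        $buffer = [byte[]]::new($BufferSize)

        while (($bytesRead = $source.Read($buffer, 0, $buffer.Length)) -gt 0) {
            # Start both writes without waiting for either to finish
            $write1 = $out1.WriteAsync($buffer, 0, $bytesRead)
            $write2 = $out2.WriteAsync($buffer, 0, $bytesRead)

            # Hash the same chunk meanwhile; this only reads from the buffer
            $null = $md5.TransformBlock($buffer, 0, $bytesRead, $null, 0)

            # The buffer must not be overwritten by the next Read
            # until both writes have completed
            $write1.Wait()
            $write2.Wait()
        }
        $null = $md5.TransformFinalBlock([byte[]]::new(0), 0, 0)
        [System.BitConverter]::ToString($md5.Hash).Replace('-', '')
    }
    finally {
        if ($out1)   { $out1.Dispose() }
        if ($out2)   { $out2.Dispose() }
        if ($source) { $source.Dispose() }
        $md5.Dispose()
    }
}
```

The design keeps a single buffer and waits at the end of every iteration, so reads and writes still alternate; overlapping the next read with the current writes would need a second buffer and is left out to keep the sketch short.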
