Hashing exhausts memory and gets slower and slower over time



I have a GUI desktop application that generates different kinds of hashes (e.g. MD5) for files and directories. Recently, while testing with a 1 GB test file, I noticed that it gets slower and slower over time. The first hash of the 1 GB file takes about 2 seconds, but later, for the exact same file, it takes about 76 seconds.

To demonstrate the problem, I created sample code that anyone can try (for reproducibility). It has two key steps: (1) generate a byte array for the file, and (2) generate a hash for that byte array. (In the real program there are several switches and if-else statements, e.g. deciding whether the input is a file or a directory, etc., plus many JavaFX GUI elements...)

I will show that even this simplified code gets 8x slower when repeated just 5 times! From reading several forums, the cause may be a memory leak or excessive memory consumption... or something like that. What I want is to empty the memory between cycles, so every hash takes only as long as the first one (2 seconds).

The sample code mentioned above:

package main;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

import org.springframework.util.StopWatch;

public class Main {

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // 1GB Test file downloaded from here:
        // https://testfiledownload.com/
        StopWatch sw = new StopWatch();
        // Hash the file 5 times and measure the time for each hashing
        for (int i = 0; i < 5; i++) {
            sw.start();
            String hash = encrypt(inputparser("C:\\Users\\Thend\\Downloads\\1GB.bin"));
            sw.stop();
            // Print execution time at each cycle
            System.out.println("Execution time: " + String.format("%.5f", sw.getTotalTimeMillis() / 1000.0f) + "sec" + " Hash: " + hash);
        }
    }

    public static byte[] inputparser(String path) {
        File f = new File(path);
        byte[] bytes = new byte[(int) f.length()];
        FileInputStream fis = null;
        try {
            fis = new FileInputStream(f);
            // read file into bytes[]
            fis.read(bytes);
            if (fis != null) {
                fis.close();
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        return bytes;
    }

    public static String encrypt(byte[] bytes) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        StringBuilder sb = new StringBuilder();
        md.reset();
        md.update(bytes);
        byte[] hashed_bytes = md.digest();
        // Convert bytes[] (in decimal format) to hexadecimal
        for (int i = 0; i < hashed_bytes.length; i++) {
            sb.append(Integer.toString((hashed_bytes[i] & 0xff) + 0x100, 16).substring(1));
        }
        // Return hashed String in hex format
        return sb.toString();
    }
}

Console output, where you can see the increasing times:

"C:\Program Files\Java\jdk-18.0.2\bin\java.exe" "-javaagent:C:\Program Files\JetBrains\IntelliJ IDEA Community Edition 2022.2\lib\idea_rt.jar=54323:C:\Program Files\JetBrains\IntelliJ IDEA Community Edition 2022.2\bin" -Dfile.encoding=UTF-8 -Dsun.stdout.encoding=UTF-8 -Dsun.stderr.encoding=UTF-8 -classpath C:\Users\Thend\intellij-workspace\HashTimeTest\out\production\HashTimeTest;C:\Users\Thend\Downloads\springframework-5.1.0.jar main.Main
Execution time: 2,04100sec Hash: e5c834fbdaa6bfd8eac5eb9404eefdd4
Execution time: 3,70900sec Hash: e5c834fbdaa6bfd8eac5eb9404eefdd4
Execution time: 5,42100sec Hash: e5c834fbdaa6bfd8eac5eb9404eefdd4
Execution time: 7,09600sec Hash: e5c834fbdaa6bfd8eac5eb9404eefdd4
Execution time: 8,75500sec Hash: e5c834fbdaa6bfd8eac5eb9404eefdd4
Process finished with exit code 0
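One thing worth ruling out first: Spring's `StopWatch.getTotalTimeMillis()` returns the accumulated time of *all* started/stopped tasks, so each printed value includes every previous cycle. A minimal sketch that times each cycle independently with `System.nanoTime()` instead (the class name and the small in-memory buffer standing in for the 1 GB file are illustrative assumptions, not from the original post):

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class PerCycleTiming {

    // Hex-encode the MD5 digest of the given bytes.
    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder();
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new RuntimeException(e); // MD5 is available on every JRE
        }
    }

    public static void main(String[] args) {
        // Small in-memory buffer as a stand-in for the 1 GB file.
        byte[] payload = new byte[16 * 1024 * 1024];
        for (int i = 0; i < 5; i++) {
            long t0 = System.nanoTime();    // timer is reset every cycle
            String hash = md5Hex(payload);
            long elapsedMs = (System.nanoTime() - t0) / 1_000_000;
            System.out.println("Cycle " + i + ": " + elapsedMs + " ms  Hash: " + hash);
        }
    }
}
```

If the per-cycle numbers printed this way stay roughly constant, the growth in the output above is an artifact of cumulative timing rather than the hashing itself getting slower.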

Your problem is probably that a reference to the loaded file is being kept somewhere.

A better approach for an operation like hashing a large file is not to load everything into memory, but to read it chunk by chunk:

public static String encrypt(String path) throws NoSuchAlgorithmException {
    File f = new File(path);
    byte[] bytes = new byte[4096];
    MessageDigest md = MessageDigest.getInstance("MD5");
    try (FileInputStream fis = new FileInputStream(f)) {
        while (true) {
            int len = fis.read(bytes);
            if (len == -1) {
                break;
            }
            md.update(bytes, 0, len);
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
    byte[] hashed_bytes = md.digest();
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < hashed_bytes.length; i++) {
        sb.append(String.format("%02x", hashed_bytes[i]));
    }
    return sb.toString();
}
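The same chunked idea can also be expressed with `java.security.DigestInputStream`, which updates the digest as a side effect of reading, so the small buffer is the only per-file allocation. A minimal self-contained sketch (the class name and the in-memory demo stream are illustrative assumptions; a `FileInputStream` over the 1GB.bin works the same way):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class StreamHash {

    // Hash any InputStream chunk by chunk; only the 8 KB buffer is held in memory.
    static String md5Hex(InputStream in) {
        try (DigestInputStream dis = new DigestInputStream(in, MessageDigest.getInstance("MD5"))) {
            byte[] buf = new byte[8192];
            while (dis.read(buf) != -1) {
                // reading drives the digest; nothing else to do here
            }
            StringBuilder sb = new StringBuilder();
            for (byte b : dis.getMessageDigest().digest()) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (IOException | NoSuchAlgorithmException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Demo with an in-memory stream instead of a file.
        System.out.println(md5Hex(new ByteArrayInputStream("hello".getBytes())));
        // prints 5d41402abc4b2a76b9719d911017c592
    }
}
```

Either way, memory use stays flat no matter how large the file is, because the file contents are never materialized as one giant byte array.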
