Java:高效计算大文件的 SHA-256 哈希

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must do so under the same license and attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/1741545/

Date: 2020-08-12 22:08:15 | Source: igfitidea

Java: Calculate SHA-256 hash of large file efficiently

Tags: java, optimization, hash, performance, sha256

Asked by stefita

I need to calculate a SHA-256 hash of a large file (or a portion of it). My implementation works fine, but it's much slower than C++'s CryptoPP calculation (25 min vs. 10 min for a ~30GB file). What I need is a similar execution time in C++ and Java, so the hashes are ready at almost the same time. I also tried the Bouncy Castle implementation, but it gave me the same result. Here is how I calculate the hash:


int buff = 16384;
try {
    RandomAccessFile file = new RandomAccessFile("T:\\someLargeFile.m2v", "r");

    long startTime = System.nanoTime();
    MessageDigest hashSum = MessageDigest.getInstance("SHA-256");

    byte[] buffer = new byte[buff];
    byte[] partialHash = null;

    long read = 0;

    // calculate the hash of the whole file for the test
    long offset = file.length();
    int unitsize;
    while (read < offset) {
        unitsize = (int) (((offset - read) >= buff) ? buff : (offset - read));
        file.read(buffer, 0, unitsize);

        hashSum.update(buffer, 0, unitsize);

        read += unitsize;
    }

    file.close();
    partialHash = hashSum.digest();

    long endTime = System.nanoTime();

    System.out.println(endTime - startTime);

} catch (FileNotFoundException e) {
    e.printStackTrace();
} catch (IOException e) {
    e.printStackTrace();
} catch (NoSuchAlgorithmException e) {
    e.printStackTrace();
}

Accepted answer by jarnbjo

My explanation may not solve your problem since it depends a lot on your actual runtime environment, but when I run your code on my system, the throughput is limited by disk I/O and not the hash calculation. The problem is not solved by switching to NIO, but is simply caused by the fact that you're reading the file in very small pieces (16kB). Increasing the buffer size (buff) on my system to 1MB instead of 16kB more than doubles the throughput, but with >50MB/s, I am still limited by disk speed and not able to fully load a single CPU core.

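The buffer-size effect the answer describes can be checked with a short sketch. This is an illustrative benchmark, not the answer's original code: it hashes the same file with a 16kB and a 1MB read buffer and prints the elapsed time for each. The temp file here is small just to keep the demo self-contained; substitute a real large file to reproduce the measurement.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class BufferSizeDemo {
    // Hash a file with the given read-buffer size and return the digest.
    static byte[] hashWithBuffer(Path file, int bufferSize)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        try (FileInputStream in = new FileInputStream(file.toFile())) {
            byte[] buffer = new byte[bufferSize];
            int n;
            while ((n = in.read(buffer)) != -1) {
                md.update(buffer, 0, n);
            }
        }
        return md.digest();
    }

    public static void main(String[] args) throws Exception {
        // Small throwaway file just for the demo; use a real large file
        // to see the disk-bound difference the answer measured.
        Path tmp = Files.createTempFile("demo", ".bin");
        Files.write(tmp, new byte[8 * 1024 * 1024]);

        for (int size : new int[]{16 * 1024, 1024 * 1024}) {
            long start = System.nanoTime();
            byte[] hash = hashWithBuffer(tmp, size);
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            System.out.printf("buffer=%dkB took %dms, digest length=%d%n",
                    size / 1024, elapsedMs, hash.length);
        }
        Files.delete(tmp);
    }
}
```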

BTW: You can simplify your implementation a lot by wrapping a DigestInputStream around a FileInputStream, read through the file and get the calculated hash from the DigestInputStream instead of manually shuffling the data from a RandomAccessFile to the MessageDigest as in your code.

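The DigestInputStream simplification the answer suggests could look something like the sketch below. The helper name and the hard-coded 1MB buffer are illustrative choices, not part of the original answer; the key point is that reading through the stream updates the digest as a side effect.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class DigestStreamDemo {
    // Reads the whole stream through a DigestInputStream; the digest
    // is updated transparently as a side effect of reading.
    static byte[] sha256(InputStream source)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        try (DigestInputStream in = new DigestInputStream(source, md)) {
            byte[] buffer = new byte[1024 * 1024]; // 1MB, per the answer above
            while (in.read(buffer) != -1) {
                // nothing to do: reading drives the digest
            }
        }
        return md.digest();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) return; // pass a file path to try it
        byte[] hash = sha256(new BufferedInputStream(new FileInputStream(args[0])));
        StringBuilder hex = new StringBuilder();
        for (byte b : hash) hex.append(String.format("%02x", b));
        System.out.println(hex);
    }
}
```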



I did a few performance tests with older Java versions and there seems to be a relevant difference between Java 5 and Java 6 here. I'm not sure though if the SHA implementation is optimized or if the VM is executing the code much faster. The throughputs I get with the different Java versions (1MB buffer) are:


  • Sun JDK 1.5.0_15 (client): 28MB/s, limited by CPU
  • Sun JDK 1.5.0_15 (server): 45MB/s, limited by CPU
  • Sun JDK 1.6.0_16 (client): 42MB/s, limited by CPU
  • Sun JDK 1.6.0_16 (server): 52MB/s, limited by disk I/O (85-90% CPU load)


I was a little bit curious about the impact of the assembler part in the CryptoPP SHA implementation, as the benchmark results indicate that the SHA-256 algorithm only requires 15.8 CPU cycles/byte on an Opteron. I was unfortunately not able to build CryptoPP with gcc on cygwin (the build succeeded, but the generated exe failed immediately). But building a performance benchmark with VS2005 (default release configuration), with and without assembler support in CryptoPP, and comparing it to the Java SHA implementation on an in-memory buffer, leaving out any disk I/O, I get the following results on a 2.5GHz Phenom:


  • Sun JDK1.6.0_13 (server): 26.2 cycles/byte
  • CryptoPP (C++ only): 21.8 cycles/byte
  • CryptoPP (assembler): 13.3 cycles/byte

Both benchmarks compute the SHA hash of a 4GB empty byte array, iterating over it in chunks of 1MB, which are passed to MessageDigest#update (Java) or CryptoPP's SHA256.Update function (C++).

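The Java side of such an in-memory benchmark might be sketched as follows. This is not the answer's original harness: the data size is scaled down from 4GB so the run stays short, and cycles/byte would have to be derived from the measured MB/s and a known CPU clock frequency.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class InMemoryBench {
    // Hash `totalBytes` of zero bytes in 1MB chunks and return MB/s.
    static double benchMBps(long totalBytes) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        byte[] chunk = new byte[1024 * 1024];

        // Warm up so the JIT compiles the hot loop before we start timing.
        for (int i = 0; i < 16; i++) md.update(chunk);
        md.reset();

        long start = System.nanoTime();
        for (long done = 0; done < totalBytes; done += chunk.length) {
            md.update(chunk);
        }
        md.digest();
        double seconds = (System.nanoTime() - start) / 1e9;
        return (totalBytes / 1e6) / seconds;
    }

    public static void main(String[] args) throws Exception {
        // Scaled down from the 4GB used in the answer to keep the run short.
        double mbps = benchMBps(256L * 1024 * 1024);
        System.out.printf("throughput: %.1f MB/s%n", mbps);
        // cycles/byte ~= clockHz / (MB/s * 1e6), given a known CPU frequency
    }
}
```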

I was able to build and benchmark CryptoPP with gcc 4.4.1 (-O3) in a virtual machine running Linux and got only approximately half the throughput compared to the results from the VS exe. I am not sure how much of the difference is attributable to the virtual machine and how much is caused by VS usually producing better code than gcc, but I have no way to get any more exact results from gcc right now.


Answer by brianegge

It used to be that Java ran about 10x slower than the same C++ code. Nowadays it is closer to 2x slower. I think what you're running into is just a fundamental part of Java. JVMs will get faster, especially as new JIT techniques are discovered, but you'll have a hard time outperforming C.


Have you tried alternative JVMs and/or compilers? I used to get better performance with JRockit, but less stability. Ditto for using jikes over javac.


Answer by Gareth Davis

Perhaps the first thing to do is work out where you are spending the most time. Can you run it through a profiler and see where the most time is being spent?


Possible improvements:


  1. Use NIO to read the file in the fastest possible way
  2. Update the hash in a separate thread. This is actually rather hard to do and isn't for the faint-hearted, as it involves safe publishing between threads. But if your profiling shows a significant amount of time being spent in the hash algorithm, it may make better use of the disk.
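The NIO suggestion (point 1) could be sketched roughly like this, reading through a FileChannel into a direct ByteBuffer and feeding it to the digest. The class and method names are illustrative, not from the answer:

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.security.MessageDigest;

public class NioHash {
    // Hash a file via a FileChannel; MessageDigest.update(ByteBuffer)
    // consumes whatever range the channel just read.
    static byte[] sha256(Path file) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        ByteBuffer buf = ByteBuffer.allocateDirect(1024 * 1024);
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            while (ch.read(buf) != -1) {
                buf.flip();       // switch from writing to reading
                md.update(buf);   // consumes buf from position to limit
                buf.clear();      // ready for the next read
            }
        }
        return md.digest();
    }
}
```

Note that, as the accepted answer points out, switching to NIO alone may not help if the workload is disk-bound; the buffer size matters more.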

Answer by Daniel Schneller

I suggest you use a profiler like JProfiler or the one integrated in NetBeans (free) to find out where the time is actually spent, and concentrate on that part.


Just a wild guess - not sure if it will help - but have you tried the Server VM? Try starting the app with java -server and see if that helps you. The server VM is more aggressive about compiling Java code to native code than the default client VM.


Answer by Esko

Since you apparently have a working C++ implementation which is fast, you could build a JNI bridge and use the actual C++ implementation. Or you could try not reinventing the wheel, especially since it's a big one, and use a premade library such as BouncyCastle, which has been made to solve all the cryptographic needs of your program.


Answer by bruno conde

I think this difference in performance might only be platform-related. Try changing the buffer size and see if there are any improvements. If not, I would go with JNI (Java Native Interface) and just call the C++ implementation from Java.


Answer by Claude Houle

The MAIN reason why your code is so slow is that you use a RandomAccessFile, which has always been quite slow performance-wise. I suggest using a BufferedInputStream so that you can benefit from the OS-level caching for disk I/O.


The code should look something like:


public static byte[] hash(MessageDigest digest, BufferedInputStream in, int bufferSize) throws IOException {
    byte[] buffer = new byte[bufferSize];
    int sizeRead = -1;
    while ((sizeRead = in.read(buffer)) != -1) {
        digest.update(buffer, 0, sizeRead);
    }
    in.close();

    byte[] hash = digest.digest();
    return hash;
}
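A self-contained way to call a method of this shape might look like the sketch below. The class name and the hex-printing are illustrative additions, not part of the answer; the hash method itself follows the answer's structure (it takes any InputStream for flexibility):

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;

public class HashDemo {
    // Same shape as the answer's method, generalized to any InputStream.
    public static byte[] hash(MessageDigest digest, InputStream in, int bufferSize)
            throws IOException {
        byte[] buffer = new byte[bufferSize];
        int sizeRead;
        while ((sizeRead = in.read(buffer)) != -1) {
            digest.update(buffer, 0, sizeRead);
        }
        in.close();
        return digest.digest();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) return; // pass a file path to try it
        try (BufferedInputStream in = new BufferedInputStream(
                new FileInputStream(args[0]))) {
            byte[] h = hash(MessageDigest.getInstance("SHA-256"), in, 1024 * 1024);
            StringBuilder hex = new StringBuilder();
            for (byte b : h) hex.append(String.format("%02x", b));
            System.out.println(hex);
        }
    }
}
```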