Compressing a Big File into a ZIP with Java

Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/1770776/

Date: 2020-10-29 17:54:26  Source: igfitidea

Tags: java, zip, large-files

Asked by robob

I need to compress one big file (~450 MB) with the Java class ZipOutputStream. The size of the file causes an OutOfMemoryError (Java heap space) in my JVM. This happens because the "zos.write(...)" method stores ALL the file content to compress in an internal byte array before compressing it.

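        // fi, data, BUFFER and zos are declared elsewhere (not shown)
        origin = new BufferedInputStream(fi, BUFFER);
        ZipEntry entry = new ZipEntry(filePath);
        zos.putNextEntry(entry);

        // copy the file into the ZIP entry, at most BUFFER bytes at a time
        int count;
        while ((count = origin.read(data, 0, BUFFER)) != -1)
        {
            zos.write(data, 0, count);
        }
        origin.close();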

The natural solution would be to enlarge the JVM heap, but I would like to know whether there is a way to write this data in a streaming manner. I do not need a high compression ratio, so I could change the algorithm too.

Does anyone have an idea about this?

Answered by jarnbjo

According to your comment on Sam's response, you have evidently created a ZipOutputStream that wraps a ByteArrayOutputStream. The ByteArrayOutputStream, of course, holds the compressed result in memory. If you want it written to disk, you have to wrap the ZipOutputStream around a FileOutputStream.
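As a rough sketch of that suggestion, the ZipOutputStream can be wrapped around a FileOutputStream so that compressed bytes go to disk as they are produced. The class name StreamingZip, the method zipFile, and the 8 KB buffer size are illustrative assumptions, not part of the original question. The BEST_SPEED level is optional and only reflects the asker's remark that a high compression ratio is not needed.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.Deflater;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class; the names are chosen for illustration only.
class StreamingZip {
    // Assumed buffer size: only this many bytes of file data are held at a time.
    private static final int BUFFER = 8192;

    /** Writes source as a single entry into the ZIP file target, streaming to disk. */
    static void zipFile(File source, File target) throws IOException {
        try (InputStream origin = new BufferedInputStream(new FileInputStream(source), BUFFER);
             // Wrap a FileOutputStream (not a ByteArrayOutputStream),
             // so the compressed output is not accumulated on the heap.
             ZipOutputStream zos = new ZipOutputStream(
                     new BufferedOutputStream(new FileOutputStream(target)))) {
            zos.setLevel(Deflater.BEST_SPEED); // optional: favor speed over ratio
            zos.putNextEntry(new ZipEntry(source.getName()));

            byte[] data = new byte[BUFFER];
            int count;
            while ((count = origin.read(data, 0, BUFFER)) != -1) {
                zos.write(data, 0, count);
            }
            zos.closeEntry();
        }
    }
}
```

Calling StreamingZip.zipFile(new File("big.bin"), new File("big.zip")) with file names of your own should keep memory use roughly independent of the input size, since only the fixed buffers stay in memory.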

Answered by Carl Smotricz

There's a library called TrueZip that I've used with good success in the past to do this kind of thing.

I cannot guarantee it does better on the buffering front. I do know that it does a lot of things with its own code rather than depending on the JDK's Zip API.

So it's worth a try, in my opinion.

Answered by Sam Barnum

ZipOutputStream is stream-based; it doesn't hold on to memory. Your BUFFER may be too large.

Answered by cjstehno

I wonder if it's because you are storing the content in a ZipEntry; perhaps it loads all of its content before writing out the ZipEntry. Do you have to use Zip? If it's just one data stream you need to compress, you might look into GZIPOutputStream instead. I believe it would not have the same problem.
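For the single-stream case, a minimal sketch with GZIPOutputStream might look like the following. The class name StreamingGzip, the method gzipFile, and the 8 KB buffer are assumptions made for illustration. Note that the result is a .gz file, not a ZIP archive, so this only fits if the consumer does not require the ZIP format.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class; the names are chosen for illustration only.
class StreamingGzip {
    /** Compresses source into the gzip file target, one buffer at a time. */
    static void gzipFile(File source, File target) throws IOException {
        try (InputStream in = new BufferedInputStream(new FileInputStream(source));
             OutputStream out = new GZIPOutputStream(
                     new BufferedOutputStream(new FileOutputStream(target)))) {
            byte[] data = new byte[8192]; // assumed buffer size
            int count;
            while ((count = in.read(data)) != -1) {
                out.write(data, 0, count);
            }
            // Closing the GZIPOutputStream finishes the stream and writes the trailer.
        }
    }
}
```

As with the ZIP variant, the output here goes to a FileOutputStream, so the compressed data is written to disk rather than collected in memory.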

Hope this helps.
