What is the fastest way in Java to extract one file from a zip file that contains a lot of files?
Source: http://stackoverflow.com/questions/5484158/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by lamwaiman1988
I tried the java.util.zip package, but it is too slow.
Then I found the LZMA SDK and 7z jbinding, but they are also lacking something. The LZMA SDK does not provide any documentation or tutorial on how to use it, which is very frustrating. There is no Javadoc.
7z jbinding, meanwhile, does not provide a simple way to extract only one file; it only provides a way to extract the entire content of the zip file. Moreover, it does not provide a way to specify where to place the unzipped file.
Any ideas, please?
Answered by WhiteFang34
What does your code with java.util.zip look like, and how big of a zip file are you dealing with?
I'm able to extract a 4MB entry out of a 200MB zip file with 1,800 entries in roughly a second with this:
    // Needs: import java.io.*; import java.util.zip.*;
    // try-with-resources closes the zip stream even if reading fails
    try (ZipInputStream zin = new ZipInputStream(
            new BufferedInputStream(new FileInputStream("your.zip")))) {
        ZipEntry ze;
        // Walk the entries in order until the wanted one is reached
        while ((ze = zin.getNextEntry()) != null) {
            if (ze.getName().equals("your.file")) {
                // Open the output only once the entry has been found
                try (OutputStream out = new FileOutputStream("your.file")) {
                    byte[] buffer = new byte[8192];
                    int len;
                    while ((len = zin.read(buffer)) != -1) {
                        out.write(buffer, 0, len);
                    }
                }
                break;
            }
        }
    }
Answered by flavio.donze
I have not benchmarked the speed, but with Java 7 or greater I extract a file as follows. I would imagine that it's faster than the ZipFile API:
A short example extracting META-INF/MANIFEST.MF from a zip file test.zip:
    // Needs: import java.io.File; import java.nio.file.*;
    // file to extract from zip file
    String file = "MANIFEST.MF";
    // location to extract the file to
    File outputLocation = new File("D:/temp/", file);
    // path to the zip file
    Path zipFile = Paths.get("D:/temp/test.zip");
    // load zip file as filesystem; the cast keeps the call unambiguous on
    // Java 13+, where a newFileSystem(Path, Map) overload also exists
    try (FileSystem fileSystem = FileSystems.newFileSystem(zipFile, (ClassLoader) null)) {
        // copy file from zip file to output location
        Path source = fileSystem.getPath("META-INF/" + file);
        Files.copy(source, outputLocation.toPath());
    }
Answered by kdgregory
Use a ZipFile rather than a ZipInputStream.
Although the documentation does not indicate this (it's in the docs for JarFile), it should use random-access file operations to read the file. Since a ZIP file contains a directory at a known location (the central directory at the end of the archive), this means a LOT less IO has to happen to find a particular file.
Some caveats: to the best of my knowledge, the Sun implementation uses a memory-mapped file. This means that your virtual address space has to be large enough to hold the file as well as everything else in your JVM, which may be a problem for a 32-bit server. On the other hand, it may be smart enough to avoid memory-mapping on 32-bit, or to memory-map just the directory; I haven't tried.
Also, if you're using multiple files, be sure to use a try/finally to ensure that the file is closed after use.
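For illustration, here is a minimal sketch of the ZipFile approach described in this answer. It is not code from the original poster: the class name SingleEntryExtractor and the method extract are made up for this example, and the paths are whatever you pass in. It uses try-with-resources (Java 7+) in place of an explicit try/finally, which closes the file in the same way.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper class, named for this example only
public class SingleEntryExtractor {

    /**
     * Copies one named entry out of a zip archive into target.
     * Returns false (and writes nothing) if the archive has no such entry.
     */
    public static boolean extract(Path zipPath, String entryName, Path target)
            throws IOException {
        // try-with-resources closes the ZipFile even if the copy fails
        try (ZipFile zip = new ZipFile(zipPath.toFile())) {
            // Lookup by name; no need to iterate over the preceding entries
            ZipEntry entry = zip.getEntry(entryName);
            if (entry == null) {
                return false;
            }
            try (InputStream in = zip.getInputStream(entry)) {
                // Overwrites the target if it already exists
                Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
            }
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        // Expected arguments: <zip file> <entry name> <output file>
        if (args.length != 3) {
            System.err.println("usage: SingleEntryExtractor <zip> <entry> <output>");
            return;
        }
        boolean found = extract(Paths.get(args[0]), args[1], Paths.get(args[2]));
        System.out.println(found ? "extracted " + args[1] : "entry not found: " + args[1]);
    }
}
```

Entry names inside a zip use forward slashes, so a call would look like extract(Paths.get("test.zip"), "META-INF/MANIFEST.MF", Paths.get("MANIFEST.MF")). Whether this beats the ZipInputStream loop for a given archive is something to measure rather than assume.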