Original question: http://stackoverflow.com/questions/4834721/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Java - Read BZ2 file and uncompress/parse on the fly
Asked by user587363
I have a fairly large BZ2 file with several text files in it. Is it possible for me to use Java to uncompress certain files inside the BZ2 file and parse the data on the fly? Let's say that a 300 MB BZ2 file contains 1 GB of text. Ideally, I'd like my Java program to read, say, 1 MB of the BZ2 file, uncompress it on the fly, act on it, and keep reading the BZ2 file for more data. Is that possible?
Thanks
Answered by Chilly
The commons-compress library from Apache is pretty good. Here's their examples page: http://commons.apache.org/proper/commons-compress/examples.html
Here's the latest Maven snippet:
    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-compress</artifactId>
        <version>1.10</version>
    </dependency>
And here's my util method:
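    import java.io.*;
    import org.apache.commons.compress.compressors.*;

    // Wraps a compressed file in a BufferedReader; the compression format is auto-detected.
    public static BufferedReader getBufferedReaderForCompressedFile(String fileIn) throws FileNotFoundException, CompressorException {
        FileInputStream fin = new FileInputStream(fileIn);
        // Buffering is required: format auto-detection needs mark/reset support.
        BufferedInputStream bis = new BufferedInputStream(fin);
        CompressorInputStream input = new CompressorStreamFactory().createCompressorInputStream(bis);
        // Decodes with the platform default charset.
        BufferedReader br2 = new BufferedReader(new InputStreamReader(input));
        return br2;
    }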
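A caller can then consume the reader line by line, so only a small window of decompressed text sits in memory at any time. The JDK has no bzip2 codec, so the runnable sketch below substitutes java.util.zip.GZIPInputStream as a stand-in for the stream that CompressorStreamFactory would return. The class name StreamingLineDemo and the methods countMatchingLines and gzip are hypothetical names used only for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class StreamingLineDemo {

    // Reads decompressed text one line at a time and counts the lines containing `needle`.
    // The decompressing stream is pulled on demand as the reader asks for more bytes.
    static long countMatchingLines(InputStream decompressed, String needle) throws IOException {
        long matches = 0;
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(decompressed, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.contains(needle)) {
                    matches++;
                }
            }
        }
        return matches;
    }

    // Demo helper only: gzips a string in memory (stand-in for a .bz2 file on disk).
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(bytes)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] compressed = gzip("alpha\nbeta\nalphabet\ngamma\n");
        // With commons-compress, this stream would instead come from
        // new CompressorStreamFactory().createCompressorInputStream(...).
        InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed));
        System.out.println(countMatchingLines(in, "alpha")); // prints 2
    }
}
```

The same loop should work unchanged on the BufferedReader returned by the util method above, since it only relies on readLine().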
Answered by martineno
The Ant project contains a bzip2 library, which has an org.apache.tools.bzip2.CBZip2InputStream class. You can use this class to decompress the bzip2 file on the fly; it just extends the standard Java InputStream class.
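One detail worth checking before using this class: as far as I recall from Ant's Javadoc, the CBZip2InputStream constructor expects the caller to have already consumed the two-byte "BZ" magic at the start of the file. The sketch below shows only that header-skipping step using the JDK, with the Ant call left in a comment because it needs the Ant jar on the classpath. The class name BzHeader and the method skipMagic are hypothetical, and the behaviour should be verified against the Ant version in use.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class BzHeader {

    // Consumes the leading 'B' 'Z' bytes and returns the same stream,
    // now positioned at the byte that follows the magic.
    static InputStream skipMagic(InputStream in) throws IOException {
        int first = in.read();
        int second = in.read();
        if (first != 'B' || second != 'Z') {
            throw new IOException("Not a bzip2 stream: missing BZ magic");
        }
        return in;
    }

    public static void main(String[] args) throws IOException {
        // Fake header bytes for the demo; a real file would be opened with FileInputStream.
        InputStream raw = new ByteArrayInputStream(new byte[] {'B', 'Z', 'h', '9'});
        InputStream body = skipMagic(raw);
        // With Ant on the classpath (assumption, not compiled here):
        //   InputStream text = new org.apache.tools.bzip2.CBZip2InputStream(body);
        System.out.println((char) body.read()); // prints h
    }
}
```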