Java GZIPInputStream: reading line by line

Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/1080381/

Time: 2020-08-11 23:18:03  Source: igfitidea

GZIPInputStream reading line by line

Tags: java, file-io, filereader, gzipinputstream

Asked by Kapil D

I have a file in .gz format. The Java class for reading this file is GZIPInputStream. However, this class doesn't extend Java's BufferedReader class, so I am not able to read the file line by line. I need something like this:

reader  = new MyGZInputStream( some constructor of GZInputStream) 
reader.readLine()...

I thought of creating my own class that extends Java's Reader or BufferedReader class and uses a GZIPInputStream as one of its fields.

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.io.Reader;
import java.util.zip.GZIPInputStream;

public class MyGZFilReader extends Reader {

    private GZIPInputStream gzipInputStream = null;
    char[] buf = new char[1024];

    @Override
    public void close() throws IOException {
        gzipInputStream.close();
    }

    public MyGZFilReader(String filename)
               throws FileNotFoundException, IOException {
        gzipInputStream = new GZIPInputStream(new FileInputStream(filename));
    }

    @Override
    public int read(char[] cbuf, int off, int len) throws IOException {
        // TODO Auto-generated method stub
        return gzipInputStream.read((byte[])buf, off, len);
    }

}

But this doesn't work when I use it like this:

BufferedReader in = new BufferedReader(
    new MyGZFilReader("F:/gawiki-20090614-stub-meta-history.xml.gz"));
System.out.println(in.readLine());

Can someone advise how to proceed?

Accepted answer by erickson

The basic setup of decorators is like this:

InputStream fileStream = new FileInputStream(filename);
InputStream gzipStream = new GZIPInputStream(fileStream);
Reader decoder = new InputStreamReader(gzipStream, encoding);
BufferedReader buffered = new BufferedReader(decoder);

The key issue in this snippet is the value of encoding. This is the character encoding of the text in the file. Is it "US-ASCII", "UTF-8", "SHIFT-JIS", "ISO-8859-9", …? There are hundreds of possibilities, and the correct choice usually cannot be determined from the file itself. It must be specified through some out-of-band channel.

For example, maybe it's the platform default. In a networked environment, however, this is extremely fragile. The machine that wrote the file might sit in the neighboring cubicle, but have a different default file encoding.
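To make the fragility concrete, here is a small sketch that gzips a string with one charset and reads it back with another. The class name `EncodingMismatchDemo` and the method `roundTrip` are made up for this illustration, and everything happens in memory rather than on a real file.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of the original answer.
final class EncodingMismatchDemo {

    // Compresses `text` encoded with writeCharset, then decompresses it and
    // decodes the first line with readCharset.
    static String roundTrip(String text, Charset writeCharset, Charset readCharset)
            throws IOException {
        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(
                new GZIPOutputStream(compressed), writeCharset)) {
            writer.write(text);
        }
        // The writer is closed here, so the gzip trailer has been written.
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new ByteArrayInputStream(compressed.toByteArray())),
                readCharset))) {
            return reader.readLine();
        }
    }
}
```

Calling `roundTrip("caf\u00e9", StandardCharsets.UTF_8, StandardCharsets.UTF_8)` returns the original text, while reading the same bytes as `StandardCharsets.ISO_8859_1` returns "cafÃ©": the two UTF-8 bytes of "é" are decoded as two separate Latin-1 characters, and no exception is thrown to warn you.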

Most network protocols use a header or other metadata to explicitly note the character encoding.

In this case, it appears from the file extension that the content is XML. XML includes the "encoding" attribute in the XML declaration for this purpose. Furthermore, XML should really be processed with an XML parser, not as text. Reading XML line-by-line seems like a fragile, special case.
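As a rough sketch of the parser route, the JDK's StAX API can read directly from the decompressed byte stream, in which case the parser works out the encoding from the XML declaration itself. The class `GzipXmlCounter`, the method `countElements`, and the idea of counting elements are assumptions chosen only to keep the example small.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper, not part of the original answer.
final class GzipXmlCounter {

    // Counts start tags whose local name equals elementName in a gzipped XML file.
    static int countElements(Path file, String elementName)
            throws IOException, XMLStreamException {
        try (InputStream raw = Files.newInputStream(file);
             InputStream gzip = new GZIPInputStream(raw)) {
            // Passing an InputStream (not a Reader) lets the parser detect the encoding.
            XMLStreamReader xml = XMLInputFactory.newInstance().createXMLStreamReader(gzip);
            try {
                int count = 0;
                while (xml.hasNext()) {
                    if (xml.next() == XMLStreamConstants.START_ELEMENT
                            && elementName.equals(xml.getLocalName())) {
                        count++;
                    }
                }
                return count;
            } finally {
                xml.close(); // frees parser resources; the streams are closed by the outer try
            }
        }
    }
}
```

Because the parser pulls events one at a time, this style should also cope with large dumps without loading the whole document into memory.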

Failing to explicitly specify the encoding is against the second commandment. Use the default encoding at your peril!
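Putting the pieces of this answer together, one possible shape for the decorator chain with an explicit encoding is sketched below. The class `GzipLines` and its methods are hypothetical names, and which charset to pass remains something the caller has to know from outside the file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.GZIPInputStream;

// Hypothetical helper, not part of the original answer.
final class GzipLines {

    // Builds file -> gzip -> decoder -> buffer, with the charset chosen by the caller.
    static BufferedReader open(Path file, Charset charset) throws IOException {
        InputStream fileStream = Files.newInputStream(file);
        try {
            InputStream gzipStream = new GZIPInputStream(fileStream);
            return new BufferedReader(new InputStreamReader(gzipStream, charset));
        } catch (IOException e) {
            fileStream.close(); // avoid leaking the file handle if the gzip header is invalid
            throw e;
        }
    }

    // Returns the first line of the file, or null if the file has no content.
    static String firstLine(Path file, Charset charset) throws IOException {
        try (BufferedReader reader = open(file, charset)) {
            return reader.readLine();
        }
    }
}
```

Closing the BufferedReader closes the whole chain beneath it, so the try-with-resources in `firstLine` is enough to release the file.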

Answer by ChssPly76

GZIPInputStream gzip = new GZIPInputStream(new FileInputStream("F:/gawiki-20090614-stub-meta-history.xml.gz"));
BufferedReader br = new BufferedReader(new InputStreamReader(gzip));
br.readLine();

Answer by Arumugam Mathiazhagan

BufferedReader in = new BufferedReader(new InputStreamReader(
        new GZIPInputStream(new FileInputStream("F:/gawiki-20090614-stub-meta-history.xml.gz"))));

String content;
while ((content = in.readLine()) != null) {
    System.out.println(content);
}

Answer by Memin

You can put the following method in a util class and use it whenever necessary:

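import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

public static List<String> readLinesFromGZ(String filePath) {
    List<String> lines = new ArrayList<>();
    File file = new File(filePath);

    // try-with-resources closes the reader and the underlying streams.
    // No charset is given, so InputStreamReader uses the platform default encoding.
    try (GZIPInputStream gzip = new GZIPInputStream(new FileInputStream(file));
            BufferedReader br = new BufferedReader(new InputStreamReader(gzip))) {
        String line;
        while ((line = br.readLine()) != null) {
            lines.add(line);
        }
    } catch (IOException e) {
        // FileNotFoundException is a subclass of IOException, so one catch covers both
        e.printStackTrace(System.err);
    }
    return lines;
}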
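A possible variant of this utility, sketched under two assumptions that are not in the answer above: the caller knows the file's encoding and passes it in, and the caller would rather receive an IOException than a silently incomplete list. The class name `GzipUtil` is made up for the example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

// Hypothetical util class, not part of the original answer.
final class GzipUtil {

    // Reads every line of a gzipped text file, decoding with the given charset.
    static List<String> readLinesFromGZ(String filePath, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        try (InputStream raw = Files.newInputStream(Paths.get(filePath));
             BufferedReader br = new BufferedReader(
                     new InputStreamReader(new GZIPInputStream(raw), charset))) {
            String line;
            while ((line = br.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }
}
```

Note that collecting every line into a list keeps the whole decompressed file in memory, which may not suit a large dump; processing each line inside the loop avoids that.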

Answer by Tamer

Here it is as a single statement:

try (BufferedReader br = new BufferedReader(
        new InputStreamReader(
            new GZIPInputStream(
                new FileInputStream(
                    "F:/gawiki-20090614-stub-meta-history.xml.gz"))))) {
    br.readLine();
}