Java: batch decompression of .gz files

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/901003/

Time: 2020-10-29 14:17:12  Source: igfitidea

decompress .gz file in batch

java, gzip, compression

Asked by Kapil D

I have 100 .gz files which I need to decompress. I have a couple of questions.

a) I am using the code given at http://www.roseindia.net/java/beginners/JavaUncompress.shtml to decompress the .gz file. It's working fine. Question: is there a way to get the file name of the zipped file? I know that Java's Zip classes give an enumeration of the entries to work on, which can give me the file name, size, etc. stored in a .zip file. But do we have the same for .gz files, or is the file name the same as filename.gz with .gz removed?

b) Is there another elegant way to decompress a .gz file by calling a utility function from the Java code, like calling the 7-Zip application from a Java class? Then I wouldn't have to worry about input/output streams.

Thanks in advance. Kapil

Answered by fredarin

a) Zip is an archive format, while gzip is not. So an entry iterator does not make much sense unless (for example) your gz-files are compressed tar files. What you want is probably:

File outFile = new File(infile.getParent(), infile.getName().replaceAll("\\.gz$", ""));

b) Do you only want to uncompress the files? If not, you may be fine using GZIPInputStream to read the files directly, i.e. without an intermediate decompression step.
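
As a rough illustration of reading a .gz file directly, here is a minimal sketch that counts the lines of a gzip-compressed text file without writing an uncompressed copy to disk. The class name GzipLineCounter and the choice of UTF-8 are assumptions made for this example, not something from the original answer.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

final class GzipLineCounter {
    // Counts the lines of a gzip-compressed text file by decompressing on the fly.
    // UTF-8 is an assumption about the file content.
    static long countLines(File gzFile) throws IOException {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(new FileInputStream(gzFile)), StandardCharsets.UTF_8))) {
            long lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        }
    }
}
```

Any other stream-based processing can be layered on the GZIPInputStream in the same way.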

But OK, let's say you really only want to uncompress the files. If so, you could probably use this:

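import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;

public static File unGzip(File infile, boolean deleteGzipfileOnSuccess) throws IOException {
    // Output file: same directory, same name with the trailing ".gz" removed.
    File outFile = new File(infile.getParent(), infile.getName().replaceAll("\\.gz$", ""));
    // try-with-resources closes both streams, even if an exception is thrown.
    try (GZIPInputStream gin = new GZIPInputStream(new FileInputStream(infile));
         FileOutputStream fos = new FileOutputStream(outFile)) {
        byte[] buf = new byte[100000];
        int len;
        // read() returns -1 at the end of the stream.
        while ((len = gin.read(buf)) != -1) {
            fos.write(buf, 0, len);
        }
    }
    // Delete the source only after the copy succeeded and both streams are closed.
    if (deleteGzipfileOnSuccess) {
        infile.delete();
    }
    return outFile;
}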
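
Since the question is about a batch of files, the same idea can be applied to a whole directory. The following is only a sketch under a few assumptions: the class name BatchGunzip is made up, only files directly inside the directory are handled (no recursion), existing output files are overwritten, and the original .gz files are left in place.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;

final class BatchGunzip {
    // Decompresses every regular "*.gz" file directly inside dir and
    // returns the files that were written.
    static List<File> gunzipAll(File dir) throws IOException {
        File[] gzFiles = dir.listFiles((d, name) ->
                name.endsWith(".gz") && name.length() > ".gz".length()
                        && new File(d, name).isFile());
        if (gzFiles == null) {
            throw new IOException("Not a readable directory: " + dir);
        }
        List<File> written = new ArrayList<>();
        for (File gz : gzFiles) {
            String name = gz.getName();
            // Output name: the input name without the ".gz" suffix.
            File out = new File(dir, name.substring(0, name.length() - ".gz".length()));
            try (InputStream in = new GZIPInputStream(Files.newInputStream(gz.toPath()))) {
                Files.copy(in, out.toPath(), StandardCopyOption.REPLACE_EXISTING);
            }
            written.add(out);
        }
        return written;
    }
}
```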

Answered by Paul Morie

Regarding A, the gunzip command creates an uncompressed file with the original name minus the .gz suffix. See the man page.

Regarding B, do you need gunzip specifically, or will another compression algorithm do? There's a Java port of the LZMA compression algorithm used by 7zip to create .7z files, but it will not handle .gz files.

Answered by alamar

If you have a fixed number of files to decompress once, why don't you use existing tools for that? As Paul Morie noted, gunzip can do that:

for i in *.gz; do gunzip "$i"; done

It automatically names the output files, stripping the .gz suffix.

On Windows, try WinRAR, probably, or gunzip from http://unxutils.sf.net

Answered by BobMcGee

GZip is normally used only on single files, so it generally does not contain information about individual files. To bundle multiple files into one compressed archive, they are first combined into an uncompressed Tar file (with info about individual contents), and then compressed as a single file. This combination is called a tarball.

There are libraries to extract the individual file info from a Tar, just as with ZipEntries. One example. You will first have to extract the .gz file into a temporary file in order to use it, or at least feed the GZIPInputStream into the Tar library.

You may also call 7-Zip from the command line using Java. 7-Zip command-line syntax is here: 7-Zip Command Line Syntax. Example of calling the command shell from Java: Executing shell commands in Java. You will have to call 7-Zip twice: once to extract the Tar from the .tar.gz or .tgz file, and again to extract the individual files from the Tar.
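
Calling an external tool from Java might look roughly like the sketch below, which uses the standard ProcessBuilder class. The helper name ExternalTool.run is hypothetical, and the code makes no assumption about which tool is installed: the command is passed in by the caller.

```java
import java.io.File;
import java.io.IOException;

final class ExternalTool {
    // Runs an external command in workingDir, forwards its output to this
    // process's console, and returns the exit code (0 conventionally means success).
    static int run(File workingDir, String... command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command)
                .directory(workingDir)
                .inheritIO()
                .start();
        return process.waitFor();
    }
}
```

For the two-step tarball extraction described above, the calls could be ExternalTool.run(dir, "7z", "x", "backup.tar.gz") followed by ExternalTool.run(dir, "7z", "x", "backup.tar"). This assumes a 7z executable is on the PATH, and the file names are made up for illustration.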

Or, you could just do the easy thing and write a brief shell script or batch file to do your decompression. There's no reason to hammer a square peg in a round hole -- this is what batch files are made for. As a bonus, you can also feed them parameters, reducing the complexity of a java command line execution considerably, while still letting java control execution.

Answered by Peter Lawrey

Have you tried

gunzip *.gz

Answered by Garnet Ulrich

.gz files (gzipped) can store the filename of a compressed file. So for example FuBar.doc can be saved inside myDocument.gz and with appropriate uncompression, the file can be restored to the filename FuBar.doc. Unfortunately, java.util.zip.GZIPInputStream does not support any way of reading the filename even if it is stored inside the archive.
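
If the stored name is needed anyway, one option is to parse the gzip header by hand before decompressing. The sketch below follows the header layout from the gzip file format specification (RFC 1952): two magic bytes, compression method, flags, six further fixed bytes, an optional extra field, then an optional zero-terminated file name in ISO 8859-1. The class and method names are made up for this example, and it only reads the first gzip member of the stream.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

final class GzipHeader {
    private static final int FEXTRA = 0x04; // flag bit: optional extra field present
    private static final int FNAME = 0x08;  // flag bit: original file name present

    // Returns the file name stored in the gzip header, or null if none is stored.
    // The stream must be positioned at the first byte of the gzip data.
    static String storedName(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        if (in.readUnsignedByte() != 0x1f || in.readUnsignedByte() != 0x8b) {
            throw new IOException("Not a gzip stream");
        }
        in.readUnsignedByte();             // compression method
        int flags = in.readUnsignedByte();
        in.readFully(new byte[6]);         // mtime (4 bytes), extra flags (1), OS (1)
        if ((flags & FEXTRA) != 0) {
            // Two-byte little-endian length, followed by that many bytes.
            int extraLen = in.readUnsignedByte() | (in.readUnsignedByte() << 8);
            in.readFully(new byte[extraLen]);
        }
        if ((flags & FNAME) == 0) {
            return null;
        }
        ByteArrayOutputStream name = new ByteArrayOutputStream();
        for (int b = in.readUnsignedByte(); b != 0; b = in.readUnsignedByte()) {
            name.write(b);
        }
        return new String(name.toByteArray(), StandardCharsets.ISO_8859_1);
    }
}
```

Files written by java.util.zip.GZIPOutputStream carry no stored name, so for those the method returns null and the "strip .gz" convention is the only option.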
