Java: ZipInputStream(InputStream, Charset) decodes ZipEntry file names incorrectly

Source: Stack Overflow, http://stackoverflow.com/questions/11276343/ (content licensed under CC BY-SA 4.0; attribute the original authors when reusing it).

Tags: java, character-encoding, zipinputstream

Asked by kriegaex

Java 7 is supposed to fix an old problem with unpacking zip archives whose file names use character sets other than UTF-8, via the constructor ZipInputStream(InputStream, Charset). So far, so good: I can unpack a zip archive containing file names with umlauts when I explicitly set the ISO-8859-1 character set.

But here is the problem: when iterating over the stream using ZipInputStream.getNextEntry(), the entries have wrong special characters in their names. In my case the umlaut "ü" is replaced by a "?" character, which is obviously wrong. Does anybody know how to fix this? Apparently ZipEntry ignores the Charset of its underlying ZipInputStream. It looks like yet another zip-related JDK bug, but I might be doing something wrong as well.

...
zipStream = new ZipInputStream(
    new BufferedInputStream(new FileInputStream(archiveFile), BUFFER_SIZE),
    Charset.forName("ISO-8859-1")
);
while ((zipEntry = zipStream.getNextEntry()) != null) {
    // wrong name here, something like "M?nchen" instead of "München"
    System.out.println(zipEntry.getName());
    ...
}

Answered by kriegaex

OMG, I played around for two hours or so, but just five minutes after I finally posted the question here, I bumped into the answer: the file names in my zip file were not encoded with ISO-8859-1, but with Cp437. So the constructor call should be:

zipStream = new ZipInputStream(
    new BufferedInputStream(new FileInputStream(archiveFile), BUFFER_SIZE),
    Charset.forName("Cp437")
);

Now it works like a charm. Sorry for bothering you anyway. I hope this helps someone else facing similar problems.
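
For anyone who wants to reproduce the effect without a real archive, here is a minimal sketch. The class name ZipNameCharsetDemo, its helper methods, and the sample entry name are made up for illustration, and it assumes the JDK ships the Cp437 (IBM437) charset, which standard builds normally do. It writes a one-entry zip in memory with Cp437-encoded names, then reads the name back with both charsets. With ISO-8859-1 the byte for "ü" should come back as an unprintable control character, which is presumably what a console then shows as "?".

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ZipNameCharsetDemo {

    /** Builds a one-entry zip in memory whose entry name is encoded with the given charset. */
    public static byte[] zipWithName(String entryName, Charset nameCharset) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes, nameCharset)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write("hello".getBytes(StandardCharsets.US_ASCII));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    /** Returns the name of the first entry, decoded with the given charset (null if the zip is empty). */
    public static String firstEntryName(byte[] archive, Charset nameCharset) throws IOException {
        try (ZipInputStream zip =
                 new ZipInputStream(new ByteArrayInputStream(archive), nameCharset)) {
            ZipEntry entry = zip.getNextEntry();
            return entry == null ? null : entry.getName();
        }
    }

    public static void main(String[] args) throws IOException {
        Charset cp437 = Charset.forName("Cp437");
        String original = "M\u00fcnchen.txt"; // hypothetical sample name containing an umlaut

        byte[] archive = zipWithName(original, cp437);

        // Same bytes, two different interpretations of the entry name.
        String wrong = firstEntryName(archive, StandardCharsets.ISO_8859_1);
        String right = firstEntryName(archive, cp437);

        System.out.println("ISO-8859-1 matches original: " + wrong.equals(original));
        System.out.println("Cp437 matches original:      " + right.equals(original));
        System.out.printf("Second char under ISO-8859-1: U+%04X%n", (int) wrong.charAt(1));
    }
}
```

The charset passed to ZipInputStream has to match whatever the tool that created the archive used for the names. ISO-8859-1 never fails to decode, since every byte maps to some character, so a wrong guess shows up as garbled names rather than as an exception.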