Removing BOM characters using Java

Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. If you use or share it, you must follow the same license, cite the original URL, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/21891578/

Time: 2020-08-13 11:17:52  Source: igfitidea


Tags: java, vi, byte-order-mark

Asked by James Raitsev

What needs to happen to a string in Java to get the equivalent of vi's

:set nobomb

Assume that the BOM comes from the file I am reading.

Accepted answer by Christian Kuetbach

Java does not handle the BOM specially. When it decodes UTF-8 text, it treats the BOM like any other character, so the BOM shows up as U+FEFF at the start of the resulting string.
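To illustrate the point above, here is a minimal sketch that decodes a small byte array beginning with the UTF-8 BOM. The class name BomDecodeDemo and the sample bytes are made up for this example; only the standard library is used.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows that decoding keeps the BOM in the string.
final class BomDecodeDemo {
    // The UTF-8 BOM bytes followed by the ASCII text "hi" (sample data).
    static final byte[] BYTES_WITH_BOM = {
        (byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'
    };

    // Decodes the bytes as UTF-8 without any BOM handling.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = decode(BYTES_WITH_BOM);
        // The decoded string has three chars: U+FEFF, 'h', 'i'.
        System.out.println(s.length());
        System.out.println(s.charAt(0) == '\uFEFF');
    }
}
```

Because the BOM survives decoding as a single char, comparisons such as equals("hi") fail on the decoded string until that leading char is removed.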

Found this:

http://www.rgagnon.com/javadetails/java-handle-utf8-file-with-bom.html

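// U+FEFF is the character that the UTF-8 BOM bytes decode to in a Java String.
public static final String UTF8_BOM = "\uFEFF";

// Returns s without a leading BOM; strings without one are returned unchanged.
private static String removeUTF8BOM(String s) {
    if (s.startsWith(UTF8_BOM)) {
        s = s.substring(UTF8_BOM.length());
    }
    return s;
}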
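Since the question says the BOM comes from a file, the same check can be applied while reading, on the first line only. The following is a rough sketch under that assumption; BomLineReader and readLines are hypothetical names, and the caller is assumed to pass a Reader that already decodes the file as UTF-8.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: reads all lines and drops a BOM from the first one.
final class BomLineReader {
    static final String BOM = "\uFEFF";

    static List<String> readLines(Reader source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            boolean first = true;
            while ((line = reader.readLine()) != null) {
                // A BOM can only appear at the very start of the stream.
                if (first && line.startsWith(BOM)) {
                    line = line.substring(BOM.length());
                }
                first = false;
                lines.add(line);
            }
        } catch (IOException e) {
            // Wrapped so callers of this sketch need no checked exception.
            throw new UncheckedIOException(e);
        }
        return lines;
    }
}
```

For a real file, one option is to pass Files.newBufferedReader(path, StandardCharsets.UTF_8) as the source. Checking only the first line leaves any U+FEFF that appears later in the text untouched.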

Alternatively, I would probably use BOMInputStream from Apache Commons IO instead:

http://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/input/BOMInputStream.html

Answer by Theresia Sofia Snow

For UTF-8, the BOM is the byte sequence 0xEF, 0xBB, 0xBF.
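Those three bytes are the UTF-8 encoding of U+FEFF, which is why the string-based check compares against "\uFEFF". If the data is still a byte array, the BOM can be removed before decoding. A minimal sketch, with Utf8BomBytes and its method names invented for illustration:

```java
import java.util.Arrays;

// Hypothetical helper: detects and strips the UTF-8 BOM at the byte level.
final class Utf8BomBytes {
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    // True if data begins with the three UTF-8 BOM bytes.
    static boolean startsWithBom(byte[] data) {
        if (data.length < UTF8_BOM.length) {
            return false;
        }
        for (int i = 0; i < UTF8_BOM.length; i++) {
            if (data[i] != UTF8_BOM[i]) {
                return false;
            }
        }
        return true;
    }

    // Returns a copy without the BOM, or the same array if there is none.
    static byte[] stripBom(byte[] data) {
        return startsWithBom(data)
                ? Arrays.copyOfRange(data, UTF8_BOM.length, data.length)
                : data;
    }
}
```

The result of stripBom can then be decoded with new String(bytes, StandardCharsets.UTF_8). This sketch only covers UTF-8; the BOMs of UTF-16 and UTF-32 use different byte sequences.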