Removing BOM characters using Java
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/21891578/
Asked by James Raitsev
What needs to happen to a string in Java to get the equivalent of vi's :set nobomb?
Assume that the BOM comes from the file I am reading.
Accepted answer by Christian Kuetbach
Java does not give the BOM any special treatment when decoding UTF-8. It handles the BOM like any other character, so it ends up as U+FEFF at the start of the decoded string.
I found this:
http://www.rgagnon.com/javadetails/java-handle-utf8-file-with-bom.html
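// After decoding, the UTF-8 BOM bytes appear in the string as the single character U+FEFF.
public static final String UTF8_BOM = "\uFEFF";

// Returns s without a leading BOM character; strings without one are returned unchanged.
private static String removeUTF8BOM(String s) {
    if (s.startsWith(UTF8_BOM)) {
        s = s.substring(1);
    }
    return s;
}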
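To see why the check is against a single character rather than three bytes, here is a minimal sketch. The class name BomStringDemo and the method name removeBom are made up for this illustration, and the input is a hand-built byte array rather than a real file.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows the BOM surviving UTF-8 decoding, then being stripped.
final class BomStringDemo {
    // U+FEFF is what the UTF-8 BOM bytes EF BB BF decode to.
    static final String BOM = "\uFEFF";

    // Same idea as removeUTF8BOM above: drop one leading U+FEFF if present.
    static String removeBom(String s) {
        return s.startsWith(BOM) ? s.substring(1) : s;
    }

    public static void main(String[] args) {
        // Stand-in for file content: UTF-8 BOM followed by "hi".
        byte[] raw = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        String decoded = new String(raw, StandardCharsets.UTF_8);

        System.out.println(decoded.length());            // 3: BOM char + "hi"
        System.out.println(removeBom(decoded).length()); // 2: just "hi"
    }
}
```

The same call works on the first line returned by a reader, since the BOM can only appear at the very start of the file.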
Alternatively, I would probably use Apache Commons IO instead:
http://commons.apache.org/proper/commons-io/apidocs/org/apache/commons/io/input/BOMInputStream.html
Answer by Theresia Sofia Snow
For UTF-8, the BOM is the byte sequence 0xEF, 0xBB, 0xBF.
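Knowing those three bytes, the BOM can also be skipped before decoding, using only the standard library. The following is a rough sketch, not a definitive implementation: the names BomStreamDemo, skipUtf8Bom and decodeWithoutBom are invented here, it assumes Java 9 or later (for readNBytes and readAllBytes), and it only handles the UTF-8 BOM.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: skips a leading UTF-8 BOM at the byte level.
final class BomStreamDemo {
    // Wraps the stream; consumes EF BB BF if present, otherwise pushes the bytes back.
    static InputStream skipUtf8Bom(InputStream in) throws IOException {
        PushbackInputStream pb = new PushbackInputStream(in, 3);
        byte[] head = new byte[3];
        int n = pb.readNBytes(head, 0, 3); // may be fewer than 3 for short input
        boolean isBom = n == 3
                && head[0] == (byte) 0xEF
                && head[1] == (byte) 0xBB
                && head[2] == (byte) 0xBF;
        if (!isBom && n > 0) {
            pb.unread(head, 0, n); // not a BOM: put the bytes back for the caller
        }
        return pb;
    }

    // Convenience wrapper for in-memory data; decodes as UTF-8 after skipping the BOM.
    static String decodeWithoutBom(byte[] raw) {
        try (InputStream in = skipUtf8Bom(new ByteArrayInputStream(raw))) {
            return new String(in.readAllBytes(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(decodeWithoutBom(raw)); // hi
    }
}
```

For a real file, the ByteArrayInputStream would be replaced by a file input stream. Working on bytes keeps the BOM out of the decoded text entirely, so no string ever needs a substring call afterwards.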