Working with UTF-8 files in Eclipse

Notice: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/2905582/

Time: 2020-09-19 14:47:08  Source: igfitidea

Tags: eclipse, utf-8, byte-order-mark

Asked by Pablo Cabrera

Quite a straightforward question. Is there a way to configure Eclipse to work with UTF-8 encoded text files both with and without the BOM?

So far I've used Eclipse with UTF-8 encoding and it works. But when I try to edit a file generated by another editor that includes the BOM, Eclipse doesn't handle it properly: it shows an "invisible character" (the BOM) at the beginning of the file. Is there a way to make Eclipse understand UTF-8 encoded files with a BOM?

Accepted answer by VonC

Neither bug 78455 ("Provide an option to force writing a BOM to UTF-8 files") nor bug 136854 leaves much hope for such an option.

The support for encoding in the workspace is based on what is available from Java.
For any given resource in the workspace, it is possible to obtain a charset string that can be used with any Java APIs that take charset strings.
Examples are:

  • 'US-ASCII',
  • 'UTF-8',
  • 'Cp1252',
  • 'UTF-16' (Big Endian, BOM inserted automatically),
  • 'UTF-16BE' (Big Endian, BOM not inserted automatically),
  • 'UTF-16LE' (Little Endian, BOM not inserted automatically).
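
The BOM behaviour noted in the list can be checked directly against the JDK. The following is a minimal sketch using only `java.nio.charset.StandardCharsets`; the class name `BomEncodingDemo` and its helper methods are made up for illustration and are not part of Eclipse or the JDK.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows which standard Java charsets emit a BOM.
final class BomEncodingDemo {
    private static final String HEX_DIGITS = "0123456789ABCDEF";

    /** Formats bytes as space-separated upper-case hex pairs. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(HEX_DIGITS.charAt((b >> 4) & 0x0F));
            sb.append(HEX_DIGITS.charAt(b & 0x0F));
        }
        return sb.toString();
    }

    /** Encodes the text with the given charset and returns the bytes as hex. */
    static String encodeHex(String text, Charset charset) {
        return hex(text.getBytes(charset));
    }

    /** Decodes UTF-8 bytes with the stock JDK decoder (no BOM handling added). */
    static String decodeUtf8(byte[] data) {
        return new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Writing: only 'UTF-16' prepends a BOM (FE FF, big endian).
        System.out.println(encodeHex("a", StandardCharsets.UTF_8));    // 61
        System.out.println(encodeHex("a", StandardCharsets.UTF_16));   // FE FF 00 61
        System.out.println(encodeHex("a", StandardCharsets.UTF_16BE)); // 00 61
        System.out.println(encodeHex("a", StandardCharsets.UTF_16LE)); // 61 00

        // Reading: the UTF-8 decoder keeps a leading BOM as the character U+FEFF.
        byte[] withBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, (byte) 0x61};
        String decoded = decodeUtf8(withBom);
        System.out.println(decoded.length());             // 2
        System.out.println(decoded.charAt(0) == '\uFEFF'); // true
    }
}
```

The leading U+FEFF that survives decoding is the "invisible character" an editor ends up showing at the start of the file.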

For Java encodings, except for the 'UTF-16' encoding, BOMs are not inserted (when writing) or discarded (when reading) for free.
Even if this is puzzling to end users, this is how all Java applications work.
If applications want to support creating UTF-8 files with BOMs to match their users' expectations, they need to provide such capability on their own (as neither Java nor the Resources model will help with that).
Eclipse does provide some improvements towards detecting BOMs, but not with generating or skipping them.
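
As a rough illustration of what "providing such capability on their own" could look like, here is a minimal sketch of application-level UTF-8 BOM handling. It is not how Eclipse does it; the class `Utf8BomSupport` and its methods are hypothetical names chosen for this example, and it works on whole byte arrays for simplicity.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: skips a UTF-8 BOM when reading and can write one on request.
final class Utf8BomSupport {
    // The UTF-8 encoding of U+FEFF.
    private static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    /** Returns true if the data starts with the three-byte UTF-8 BOM. */
    static boolean hasBom(byte[] data) {
        if (data.length < UTF8_BOM.length) {
            return false;
        }
        for (int i = 0; i < UTF8_BOM.length; i++) {
            if (data[i] != UTF8_BOM[i]) {
                return false;
            }
        }
        return true;
    }

    /** Decodes UTF-8 bytes, dropping a leading BOM if one is present. */
    static String decode(byte[] data) {
        int offset = hasBom(data) ? UTF8_BOM.length : 0;
        return new String(data, offset, data.length - offset, StandardCharsets.UTF_8);
    }

    /** Encodes text as UTF-8, optionally prefixed with the BOM. */
    static byte[] encode(String text, boolean writeBom) {
        byte[] body = text.getBytes(StandardCharsets.UTF_8);
        if (!writeBom) {
            return body;
        }
        byte[] out = new byte[UTF8_BOM.length + body.length];
        System.arraycopy(UTF8_BOM, 0, out, 0, UTF8_BOM.length);
        System.arraycopy(body, 0, out, UTF8_BOM.length, body.length);
        return out;
    }

    public static void main(String[] args) {
        byte[] withBom = encode("h\u00e9llo", true);
        System.out.println(hasBom(withBom));                     // true
        System.out.println(decode(withBom).equals("h\u00e9llo")); // true
    }
}
```

An editor that wanted to preserve the original file's convention could remember the result of `hasBom` at load time and pass it back as `writeBom` when saving.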