How to declare character encoding in an INDIVIDUAL JS file?

Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same license and attribute it to the original authors (not me), citing the original URL. StackOverflow original: http://stackoverflow.com/questions/8833231/

Date: 2020-08-24 07:43:35  Source: igfitidea

Tags: javascript, encoding, character-encoding

Asked by weilou

We can declare the character encoding in an INDIVIDUAL CSS file with the code below:

@charset "UTF-8";

My question is:

How to declare character encoding in an INDIVIDUAL JS file?

If I send a JS file to my friend, I hope he (she) can tell the file's character encoding from the code itself when he (she) starts to browse or edit it.

Thank you!

Accepted answer by T.J. Crowder

You can't. You can, however, define it in the script tag that brings the file into the page, using the charset attribute. This must match the charset, if any, in the Content-Type that you serve the file with. Quoting:

The charset attribute gives the character encoding of the external script resource. The attribute must not be specified if the src attribute is not present. If the attribute is set, its value must be a valid character encoding name, must be an ASCII case-insensitive match for the preferred MIME name for that encoding, and must match the encoding given in the charset parameter of the Content-Type metadata of the external file, if any. [IANACHARSET]

Re your edit:

If I send a JS file to my friend, I hope he (she) can tell the file's character encoding from the code itself when he (she) starts to browse or edit it.

For that, you'll pretty much just have to tell him/her. If the file is in UTF-8 or Windows-1252 or ISO 8859-1, unfortunately there's no in-file indicator of the encoding available, so I'd include a comment at the beginning along the lines of:

// Encoding: UTF-8

If you're using UTF-16 or UTF-32, though, you should be able to tell your editor to use a BOM, which other editors should see and understand (if they're Unicode-aware editors). This would typically only apply if you were writing your comments in a text (language) requiring lots of multi-byte characters, and if you have a high ratio of comments to code (since the code is written with western text), although of course you're welcome to use any encoding you like. It's just that if the ratio of comments to code is low, you're probably better off sticking with UTF-8 even if the comments are in a text requiring lots of four-byte characters, because the code will only require one byte per character. (Whereas in UTF-16, you might have more two-byte instead of four-byte characters in your comments, but the code would always require two bytes per character; and in UTF-32, four bytes per character. So on the whole the file may well be larger even though the comments take less space. But here I'm probably telling you things you already know far better than I, if I'm guessing correctly about your reasons for the question.)
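
To make the size trade-off above concrete, here is a rough sketch that counts the bytes the same text would take in each encoding (BOM not counted). The function name `encodedSizes` and the sample line are made up for illustration; they are not from the answer.

```javascript
// Byte counts for the same text in three Unicode encodings (BOM not counted).
function encodedSizes(text) {
  const utf8 = new TextEncoder().encode(text).length; // 1 to 4 bytes per code point
  const utf16 = text.length * 2;                      // 2 bytes per UTF-16 code unit
  const utf32 = Array.from(text).length * 4;          // 4 bytes per code point
  return { utf8, utf16, utf32 };
}

// Hypothetical sample: mostly ASCII code with one short non-ASCII comment.
const sample = 'var total = 0; // 合计\n';
console.log(encodedSizes(sample));
```

With mostly ASCII code and only a short comment, UTF-8 should come out smallest; the balance only shifts when non-ASCII comments dominate the file.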

Answer by Jukka K. Korpela

There is no JavaScript construct for declaring the encoding in the file itself, the way you can do in CSS. The encoding should be communicated to the recipients when delivering the data. When sending files as e-mail attachments, your e-mail program might or might not include them with Content-Type headers that indicate the encoding (but it might have a hard time figuring out what the encoding is).

You can use a Byte Order Mark (BOM) at the start of a UTF-8 encoded file, too. Although there is no byte order issue in UTF-8, the BOM acts as a useful indicator: a file that starts with the bytes that constitute a BOM in UTF-8 encoding is most probably UTF-8 encoded. This is why programs may well infer the encoding from it, in the absence of other indication. This is of course not 100% reliable, but a useful thing.
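
As a rough illustration of how a program might infer the encoding from a BOM, here is a minimal sketch. The name `sniffBom` and the table layout are my own; `data` can be any array-like of bytes, for example a Buffer read from disk. Like any BOM check it is only a heuristic: a file without a BOM returns null, and a UTF-16LE file whose first character is U+0000 would be mistaken for UTF-32LE.

```javascript
// Known BOM byte sequences. UTF-32LE must be tested before UTF-16LE,
// because FF FE 00 00 also begins with the UTF-16LE BOM FF FE.
const BOMS = [
  { encoding: 'UTF-8', bytes: [0xef, 0xbb, 0xbf] },
  { encoding: 'UTF-32LE', bytes: [0xff, 0xfe, 0x00, 0x00] },
  { encoding: 'UTF-32BE', bytes: [0x00, 0x00, 0xfe, 0xff] },
  { encoding: 'UTF-16LE', bytes: [0xff, 0xfe] },
  { encoding: 'UTF-16BE', bytes: [0xfe, 0xff] },
];

// Returns the encoding name suggested by a leading BOM, or null if none is found.
function sniffBom(data) {
  for (const { encoding, bytes } of BOMS) {
    if (data.length >= bytes.length && bytes.every((b, i) => data[i] === b)) {
      return encoding;
    }
  }
  return null;
}
```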

Many text editors have the option of saving your file as “UTF-8 encoded with a BOM”.
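
If your editor lacks that option, the BOM is just three bytes (EF BB BF) in front of the UTF-8 data, so it can be added by hand. A minimal sketch follows; `encodeWithBom` is a hypothetical helper name, and the file content is only an example.

```javascript
// The UTF-8 encoding of U+FEFF, used as a BOM.
const UTF8_BOM = Uint8Array.of(0xef, 0xbb, 0xbf);

// Encodes text as UTF-8 and prepends the BOM.
function encodeWithBom(text) {
  const body = new TextEncoder().encode(text);
  const out = new Uint8Array(UTF8_BOM.length + body.length);
  out.set(UTF8_BOM, 0);
  out.set(body, UTF8_BOM.length);
  return out;
}

const bytes = encodeWithBom('// Encoding: UTF-8\n');

// TextDecoder drops a leading BOM by default (ignoreBOM is false),
// so the decoded text matches what was encoded.
const roundTrip = new TextDecoder('utf-8').decode(bytes);
```

The resulting bytes could then be written to disk with whatever file API you use.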

(On web pages, the BOM was once regarded as a risk, since browsers were observed to treat it as character data. These days, the BOM even in UTF-8 is useful rather than a risk.)

Answer by David Eldridge

If you are interested in indicating the file's encoding in a human-readable way, T.J. Crowder's idea (adding a comment to the file like // Encoding: UTF-8) is just the thing. And as Jukka K. Korpela pointed out, you can use the BOM as well.

But if you want a machine-readable way to indicate charset that is declared in the document there are a couple of other ways:

For instance, on an Apache httpd server you might use any of the following declarations:

  1. AddDefaultCharset UTF-8
  2. AddCharset UTF-8 .js
  3. AddType 'application/javascript; charset=UTF-8' js*

* I am not interested in making the case for using "application/javascript" over "text/javascript". But if you are interested in knowing why one or the other might be preferable, cf. https://stackoverflow.com/a/4101763/1070047. Given the topic, though, application/javascript seems quite appropriate (especially if you are intending to use a BOM, because it indicates that the code should be treated as a binary).

If the code will be interpreted/processed/compiled server-side (e.g. PHP), you can set headers in the document, for example:

header("Content-Type: application/javascript; charset=utf-8");

At least within PHP, be sure to add that header statement before any output takes place.
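
The same idea should carry over to other server-side environments. As a sketch only, assuming a Node.js server rather than the PHP used above, a request handler could set the header like this. The handler name `serveScript` and the script body are placeholders; the function is meant to be passed to Node's `http.createServer`.

```javascript
// Hypothetical Node.js request handler (an assumption; the answer uses PHP).
function serveScript(req, res) {
  // As with PHP's header(), set the header before any body output is sent.
  res.setHeader('Content-Type', 'application/javascript; charset=utf-8');
  // Send the script body encoded as UTF-8, matching the declared charset.
  res.end('console.log("héllo");\n', 'utf8');
}
```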

Lastly, when determining which declaration to use, consider that (when understood/honored, i.e. not in IE) the BOM has greater authority than document headers. And both take precedence over the linked/sourced charset declarations (like <script type="application/javascript" src="script.js" charset="utf-8"></script>).
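
The precedence described above can be written down as a tiny sketch. The function and parameter names are invented for illustration, and it only models the ordering stated here (BOM, then Content-Type header, then the charset attribute), not what any particular browser does.

```javascript
// Picks the encoding by the stated precedence: BOM, then header, then attribute.
// Each argument is an encoding name or null when that source is absent.
function effectiveEncoding({ bom = null, headerCharset = null, attributeCharset = null }) {
  return bom ?? headerCharset ?? attributeCharset ?? null;
}

// Example: the header wins over the script tag's charset attribute.
console.log(effectiveEncoding({ headerCharset: 'utf-8', attributeCharset: 'iso-8859-1' }));
```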
