How do I correctly insert Unicode in an HTML title using JavaScript?
Original question: http://stackoverflow.com/questions/12114477/
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): Stack Overflow
Asked by BenG
I'm seeing some weird behavior when I set the title of an HTML page using JavaScript. If I insert HTML character references directly into the title, the Unicode renders correctly, for instance:
<title>&#21543;&#20986;</title>
But if I attempt to use HTML character references via JavaScript, something seems to be converting the & to &amp; and thus breaking the encoding, so the title is rendered as the full coded string:
function execTitleChange() {
    document.title = "&#21543;&#20986;";
}
(I should note that this is a little bit of speculation; when I introspect the DOM using Firebug after executing this JavaScript function, that's where I see the &amp; instead of &.)
If I use \u encoded Unicode characters when setting the value from JavaScript then everything works correctly again:
function execTitleChange() {
document.title = "\u5427\u51fa";
}
The fact that \u encoded characters work kind of makes sense to me, since I think that's how JavaScript represents Unicode characters. But I'm stumped as to why the behavior would be different when using HTML character references.
Answered by Pointy
JavaScript string constants are parsed by the JavaScript parser. Text inside HTML tags is parsed by the HTML parser. The two languages (and, by extension, their parsers) are different, and in particular they have different ways of representing characters by character code.
Thus, what you've discovered is the way reality actually is :-) Use the \u escape notation in JavaScript, and use HTML entities (&#nnnn;) in HTML/XML.
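To make the difference concrete, here is a small sketch in plain JavaScript. The variable names are illustrative and not from the question, and the `\u{...}` form assumes an ES2015 or later engine.

```javascript
// JavaScript escapes are resolved by the JavaScript parser.
const viaEscape = "\u5427\u51fa";        // classic four-hex-digit escape
const viaCodePoint = "\u{5427}\u{51fa}"; // ES2015 code point escape (assumed available)
const literal = "吧出";                   // the characters themselves

console.log(viaEscape === literal);    // true
console.log(viaCodePoint === literal); // true

// An HTML character reference is not an escape in JavaScript:
// the string keeps its "&", "#", digits and ";" exactly as typed.
const viaReference = "&#21543;&#20986;";
console.log(viaReference === literal); // false
console.log(viaReference.length);      // 16
```

Only the first three strings are safe to assign to document.title as they are; the last one would show up in the title bar as the raw reference text.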
Edit: the situation can get even more confusing when you're talking about creating or inserting HTML from JavaScript. When you use .innerHTML to update the DOM from JavaScript, you are basically handing HTML source code to the HTML parser for interpretation. For that reason, you can use either JavaScript \u escapes or HTML entities, and things will work (barring painful issues such as character encoding mismatches).
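Unlike .innerHTML, document.title takes plain text, so nothing decodes the references for you. If a string containing numeric character references has to end up in the title, one option is to decode it first. The helper below is a hypothetical sketch, not part of any library: it handles only numeric references (decimal and hexadecimal) and leaves named references such as &amp; untouched, whereas a real HTML parser does considerably more.

```javascript
// Hypothetical helper: decodes only numeric references such as &#21543; or &#x5427;.
// Named references (&amp;, &lt;, ...) are deliberately left untouched.
function decodeNumericCharRefs(text) {
  return text.replace(/&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));/g, (match, hex, dec) => {
    const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave out-of-range values as they are instead of throwing a RangeError.
    if (codePoint > 0x10ffff) return match;
    return String.fromCodePoint(codePoint);
  });
}

const decoded = decodeNumericCharRefs("&#21543;&#x51fa;");
console.log(decoded === "\u5427\u51fa"); // true

// In a browser, the decoded string can then be assigned as plain text.
if (typeof document !== "undefined") {
  document.title = decoded;
}
```

The `typeof document` guard is only there so the sketch also runs outside a browser, for example under Node.js.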
Finally, note that JavaScript also provides the String.fromCharCode() function to construct strings from numeric character codes.
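As a brief illustration of String.fromCharCode(), the sketch below builds the two characters from the question out of their numeric codes. The use of String.fromCodePoint() for characters outside the Basic Multilingual Plane is an addition here, not something the answer mentions, and it assumes an ES2015 or later engine.

```javascript
// String.fromCharCode takes UTF-16 code units, written in decimal or hex.
const fromDecimal = String.fromCharCode(21543, 20986);
const fromHex = String.fromCharCode(0x5427, 0x51fa);
console.log(fromDecimal === "吧出");   // true
console.log(fromHex === fromDecimal); // true

// Code points above 0xFFFF do not fit in one code unit;
// String.fromCodePoint (ES2015) builds the surrogate pair for you.
const beyondBmp = String.fromCodePoint(0x1f600);
console.log(beyondBmp.length); // 2
```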
Answered by Jukka K. Korpela
The best way to work with Unicode characters in JavaScript is to use the characters themselves, using an editor or other tool that can store them in UTF-8 encoding. You will avoid a lot of confusion. Naturally, you need to properly declare the character encoding of your .js or .html file.
The construct &#21543; has no special meaning in JavaScript; it is just eight ASCII characters. But if your JavaScript code is embedded in an HTML document, it is processed by HTML rules before being passed to the JavaScript interpreter, and those rules vary by HTML version. That is yet another reason to avoid such constructs.
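The "eight ASCII characters" claim is easy to check. The following is a minimal sketch with illustrative variable names, assuming the file is saved and read as UTF-8:

```javascript
const reference = "&#21543;";
console.log(reference.length); // 8

// Every character of the reference text is plain ASCII (code below 128).
const allAscii = [...reference].every((ch) => ch.charCodeAt(0) < 128);
console.log(allAscii); // true

// The character the reference stands for is a single UTF-16 code unit.
const character = "吧";
console.log(character.length);         // 1
console.log(character.codePointAt(0)); // 21543
```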
So just write
document.title = "吧出";
(Of course, there are very few situations where you should change the title element content in JavaScript instead of setting it in HTML, since that content is crucial to search engines and many other purposes. But that's beside the point.)