Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/9187946/

Time: 2020-08-28 22:21:28  Source: igfitidea

escaping inside html tag attribute value

html escaping

Asked by Myforwik

I am having trouble understanding how escaping works inside html tag attribute values that are javascript.

I was led to believe that you should always escape & ' " < >. So for javascript as an attribute value I tried:

<a href="javascript:alert(&apos;Hello&apos;);"></a>

It doesn't work. However:

<a href="javascript:alert(&#39;Hello&#39;);"></a>

and

<a href="javascript:alert('Hello');"></a>

both work in all browsers!

Now I am totally confused. If all my attribute values are enclosed in double quotes, does this mean I do not have to escape single quotes? Or are &apos; and ASCII 39 technically different characters, such that javascript requires ASCII 39 but not &apos;?

Answered by Jukka K. Korpela

There are two types of “escapes” involved here, HTML and JavaScript. When interpreting an HTML document, the HTML escapes are parsed first.

As far as HTML is concerned, the rules within an attribute value are the same as elsewhere plus one additional rule:

  • The less-than character < should be escaped. Usually &lt; is used for this. Technically, depending on the HTML version, escaping is not always required, but it has always been good practice.
  • The ampersand & should be escaped. Usually &amp; is used for this. This, too, is not always obligatory, but it is simpler to do it always than to learn and remember when it is required.
  • The character that is used as the delimiter around the attribute value must be escaped inside it. If you use the ASCII quotation mark " as the delimiter, it is customary to escape its occurrences using &quot;, whereas for the ASCII apostrophe, the entity reference &apos; is defined in some HTML versions only, so it is safest to use the numeric reference &#39; (or &#x27;).
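
The three rules above can be sketched as a small helper. This is only an illustration, not code from the original answer: the name escapeAttr is hypothetical, and the function assumes the value will be placed inside an attribute delimited by either ASCII quote character.

```javascript
// Hypothetical helper (not from the original answer): escapes text for use
// inside an HTML attribute value, whichever ASCII quote delimits it.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")   // ampersand first, so the references added below are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");  // numeric reference rather than &apos;
}

const href = "javascript:alert('Hello');";
const tag = '<a href="' + escapeAttr(href) + '">link</a>';
console.log(tag);
// <a href="javascript:alert(&#39;Hello&#39;);">link</a>
```

Escaping the apostrophe is more than strictly needed when the delimiter is a double quote, but doing it unconditionally keeps the helper safe for both delimiters.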

You can escape > (or any other data character) if you like, but it is never needed.

On the JavaScript side, there are some escape mechanisms (with \) in string literals. But these are a different issue, and not relevant in your case.
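
To make the distinction between the two layers concrete, here is a minimal sketch (the variable names are made up for illustration). A backslash escape is resolved by the JavaScript parser, while an HTML character reference is just ordinary characters to JavaScript unless an HTML parser has already decoded it.

```javascript
// The backslash escape is resolved by the JavaScript parser.
const jsEscaped = 'It\'s';
console.log(jsEscaped);            // It's
console.log(jsEscaped.length);     // 4

// An HTML character reference means nothing to JavaScript on its own:
// &#39; stays as five literal characters inside the string.
const htmlReference = 'It&#39;s';
console.log(htmlReference.length); // 8
```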

In your example, on a browser that conforms to current specifications, the JavaScript interpreter sees exactly the same code in each case: alert('Hello');. The browser has “unescaped” &apos; or &#39; to '. I was somewhat surprised to hear that &apos; is not universally supported these days, but it's not an issue: there is seldom any need to escape the ASCII apostrophe in HTML (escaping is only needed within attribute values and only if you use the ASCII apostrophe as its delimiter), and when there is, you can use the &#39; reference.
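
The decoding step can be imitated with a deliberately tiny decoder. This is a sketch under stated assumptions, not how any browser is implemented: decodeAttr and the NAMED table are hypothetical and cover only the references discussed here. Removing apos from the table would roughly model a parser that does not know &apos;, in which case the JavaScript code would receive the undecoded text.

```javascript
// Hypothetical, deliberately tiny decoder: it knows only the references
// discussed here, not the full HTML entity table.
const NAMED = { amp: "&", lt: "<", gt: ">", quot: '"', apos: "'" };

function decodeAttr(raw) {
  return raw.replace(/&(?:#[xX]([0-9a-fA-F]+)|#([0-9]+)|([a-z]+));/g, (match, hex, dec, name) => {
    if (hex !== undefined) return String.fromCodePoint(parseInt(hex, 16));
    if (dec !== undefined) return String.fromCodePoint(parseInt(dec, 10));
    // Unknown names are left untouched.
    return Object.prototype.hasOwnProperty.call(NAMED, name) ? NAMED[name] : match;
  });
}

const variants = [
  "javascript:alert(&apos;Hello&apos;);",
  "javascript:alert(&#39;Hello&#39;);",
  "javascript:alert(&#x27;Hello&#x27;);",
  "javascript:alert('Hello');",
];

// Every variant decodes to the same text, which is what the
// JavaScript interpreter would then receive.
for (const v of variants) {
  console.log(decodeAttr(v)); // javascript:alert('Hello');
}
```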

Answered by Myforwik

&apos; is not a valid entity reference in HTML 4 (it is defined in XML, XHTML and HTML5). You should escape using &#39; instead.