Decode &amp; back to & in JavaScript

Notice: this page is a Chinese-English parallel translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/3700326/

Time: 2020-08-23 05:45:19  Source: igfitidea

Tags: javascript, html, text, decode

Asked by Art

I have strings like

var str = 'One &amp; two &amp; three';

rendered into HTML by the web server. I need to transform those strings into

'One & two & three'

Currently, that's what I am doing (with help of jQuery):

$(document.createElement('div')).html('{{ driver.person.name }}').text()

However, I have an unsettling feeling that I am doing it wrong. I have tried

unescape("&amp;")

but it doesn't seem to work, neither do decodeURI/decodeURIComponent.
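The reason these functions have no effect is that they reverse URL percent-encoding (sequences such as %26), not HTML entities (sequences such as &amp;). A minimal sketch of the difference, using made-up sample strings:

```javascript
// Percent-encoded input: this is what decodeURIComponent is designed for.
const percentEncoded = 'One%20%26%20two';
const fromPercent = decodeURIComponent(percentEncoded);

// Entity-encoded input: it contains no %XX sequences, so nothing is decoded.
const entityEncoded = 'One &amp; two';
const fromEntity = decodeURIComponent(entityEncoded);
const fromEntityUri = decodeURI(entityEncoded);

console.log(fromPercent);   // "One & two"
console.log(fromEntity);    // "One &amp; two"
console.log(fromEntityUri); // "One &amp; two"
```

The entity-encoded string comes back unchanged, which matches the behaviour described above.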

Are there any other, more native and elegant ways of doing so?

Accepted answer by Facebook Staff are Complicit

A more modern option for interpreting HTML (text and otherwise) from JavaScript is the HTML support in the DOMParser API (see the DOMParser page on MDN). This allows you to use the browser's native HTML parser to convert a string to an HTML document. It has been supported in new versions of all major browsers since late 2014.

If we just want to decode some text content, we can put it as the sole content in a document body, parse the document, and pull out its .body.textContent.

var encodedStr = 'hello &amp; world';

var parser = new DOMParser;
var dom = parser.parseFromString(
    '<!doctype html><body>' + encodedStr,
    'text/html');
var decodedString = dom.body.textContent;

console.log(decodedString);

We can see in the draft specification for DOMParser that JavaScript is not enabled for the parsed document, so we can perform this text conversion without security concerns.

The parseFromString(str, type) method must run these steps, depending on type:

  • "text/html"

    Parse str with an HTML parser, and return the newly created Document.

    The scripting flag must be set to "disabled".

    NOTE

    script elements get marked unexecutable and the contents of noscript get parsed as markup.

It's beyond the scope of this question, but please note that if you're taking the parsed DOM nodes themselves (not just their text content) and moving them to the live document DOM, it's possible that their scripting would be re-enabled, and there could be security concerns. I haven't researched it, so please exercise caution.

Answer by LukeH

Do you need to decode all encoded HTML entities or just &amp; itself?

If you only need to handle &amp; then you can do this:

var decoded = encoded.replace(/&amp;/g, '&');
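As a small illustration of what that one-line replace does and does not touch, here is a sketch that wraps it in a function. The name decodeAmpersands and the sample strings are made up for this example.

```javascript
// Hypothetical helper name; it wraps the one-line replace shown above.
function decodeAmpersands(encoded) {
    return String(encoded).replace(/&amp;/g, '&');
}

// Every &amp; is decoded.
console.log(decodeAmpersands('One &amp; two &amp; three')); // "One & two & three"

// Other entities are left exactly as they were.
console.log(decodeAmpersands('1 &lt; 2 &amp; 3 &gt; 2'));   // "1 &lt; 2 & 3 &gt; 2"

// A single pass decodes one level only: "&amp;lt;" becomes "&lt;", not "<".
console.log(decodeAmpersands('&amp;lt;'));                  // "&lt;"
```

The last case is worth keeping in mind: running the replacement twice on the same string would decode a second level of escaping that the author of the text may have intended to keep.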

If you need to decode all HTML entities then you can do it without jQuery:

var elem = document.createElement('textarea');
elem.innerHTML = encoded;
var decoded = elem.value;

Please take note of Mark's comments below, which highlight security holes in an earlier version of this answer and recommend using textarea rather than div to mitigate potential XSS vulnerabilities. These vulnerabilities exist whether you use jQuery or plain JavaScript.

Answer by Mark Amery

Mathias Bynens has a library for this: https://github.com/mathiasbynens/he

Example:

console.log(
    he.decode("J&#246;rg &amp J&#xFC;rgen rocked to &amp; fro ")
);
// Logs "Jörg & Jürgen rocked to & fro"

I suggest favouring it over hacks involving setting an element's HTML content and then reading back its text content. Such approaches can work, but are deceptively dangerous and present XSS opportunities if used on untrusted user input.

If you really can't bear to load in a library, you can use the textarea hack described in this answer to a near-duplicate question, which, unlike various similar approaches that have been suggested, has no security holes that I know of:

function decodeEntities(encodedString) {
    var textArea = document.createElement('textarea');
    textArea.innerHTML = encodedString;
    return textArea.value;
}

console.log(decodeEntities('1 &amp; 2')); // '1 & 2'

But take note of the security issues, affecting similar approaches to this one, that I list in the linked answer! This approach is a hack, and future changes to the permissible content of a textarea (or bugs in particular browsers) could lead to code that relies upon it suddenly having an XSS hole one day.

Answer by WaiKit Kung

var htmlEnDeCode = (function() {
    var charToEntityRegex,
        entityToCharRegex,
        charToEntity,
        entityToChar;

    function resetCharacterEntities() {
        charToEntity = {};
        entityToChar = {};
        // add the default set
        addCharacterEntities({
            '&amp;'     :   '&',
            '&gt;'      :   '>',
            '&lt;'      :   '<',
            '&quot;'    :   '"',
            '&#39;'     :   "'"
        });
    }

    function addCharacterEntities(newEntities) {
        var charKeys = [],
            entityKeys = [],
            key, echar;
        for (key in newEntities) {
            echar = newEntities[key];
            entityToChar[key] = echar;
            charToEntity[echar] = key;
            charKeys.push(echar);
            entityKeys.push(key);
        }
        charToEntityRegex = new RegExp('(' + charKeys.join('|') + ')', 'g');
        entityToCharRegex = new RegExp('(' + entityKeys.join('|') + '|&#[0-9]{1,5};' + ')', 'g');
    }

    function htmlEncode(value){
        var htmlEncodeReplaceFn = function(match, capture) {
            return charToEntity[capture];
        };

        return (!value) ? value : String(value).replace(charToEntityRegex, htmlEncodeReplaceFn);
    }

    function htmlDecode(value) {
        var htmlDecodeReplaceFn = function(match, capture) {
            return (capture in entityToChar) ? entityToChar[capture] : String.fromCharCode(parseInt(capture.substr(2), 10));
        };

        return (!value) ? value : String(value).replace(entityToCharRegex, htmlDecodeReplaceFn);
    }

    resetCharacterEntities();

    return {
        htmlEncode: htmlEncode,
        htmlDecode: htmlDecode
    };
})();

This is from ExtJS source code.
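As far as I can tell from the code above, its numeric handling only matches decimal references of up to five digits (the `&#[0-9]{1,5};` part of the pattern) and converts them with String.fromCharCode. The following is a separate sketch, not part of ExtJS, showing one way to also accept hexadecimal references and code points above U+FFFF. The function name decodeNumericEntities is hypothetical, and the sketch does not reproduce the HTML parser's special cases (for example, how it treats &#0; or surrogate values).

```javascript
// Hypothetical helper (not from ExtJS): decodes numeric character references only.
function decodeNumericEntities(value) {
    return String(value).replace(/&#([xX][0-9a-fA-F]+|[0-9]+);/g, function (match, body) {
        var isHex = body.charAt(0) === 'x' || body.charAt(0) === 'X';
        var codePoint = parseInt(isHex ? body.slice(1) : body, isHex ? 16 : 10);
        // Leave out-of-range references untouched rather than throwing a RangeError.
        if (codePoint > 0x10FFFF) {
            return match;
        }
        return String.fromCodePoint(codePoint);
    });
}

console.log(decodeNumericEntities('J&#246;rg'));   // "Jörg"
console.log(decodeNumericEntities('J&#xFC;rgen')); // "Jürgen"
// A six-digit decimal reference, above U+FFFF:
console.log(decodeNumericEntities('&#128512;').codePointAt(0) === 0x1F600); // true
// Named entities are deliberately not handled here:
console.log(decodeNumericEntities('&amp;'));       // "&amp;"
```

String.fromCodePoint is used instead of String.fromCharCode because the latter truncates values above 0xFFFF.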

Answer by avg_joe

element.innerText also does the trick.

Answer by I am L

You can use Lodash's unescape / escape functions: https://lodash.com/docs/4.17.5#unescape

import unescape from 'lodash/unescape';

const str = unescape('fred, barney, &amp; pebbles');

str will become 'fred, barney, & pebbles'

Answer by cslotty

In case you're looking for one, as I was: there is now a nice and safe jQuery method.

https://api.jquery.com/jquery.parsehtml/

For example, you can type this in your console:

var x = "test &amp;";
> undefined
$.parseHTML(x)[0].textContent
> "test &"

So $.parseHTML(x) returns an array, and if you have HTML markup within your text, the array.length will be greater than 1.

Answer by Jason Williams

jQuery will encode and decode for you. However, you need to use a textarea tag, not a div.

var str1 = 'One & two & three';
var str2 = "One &amp; two &amp; three";
  
$(document).ready(function() {
   $("#encoded").text(htmlEncode(str1)); 
   $("#decoded").text(htmlDecode(str2));
});

function htmlDecode(value) {
  return $("<textarea/>").html(value).text();
}

function htmlEncode(value) {
  return $('<textarea/>').text(value).html();
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>

<div id="encoded"></div>
<div id="decoded"></div>

Answer by Infoglaze.com

First, create a <span id="decodeIt" style="display:none;"></span> somewhere in the body.

Next, assign the string to be decoded to this element's innerHTML:

document.getElementById("decodeIt").innerHTML=stringtodecode

Finally,

stringtodecode=document.getElementById("decodeIt").innerText

Here is the overall code:

var stringtodecode="<B>Hello</B> world<br>";
document.getElementById("decodeIt").innerHTML=stringtodecode;
stringtodecode=document.getElementById("decodeIt").innerText

Answer by Peter Brandt

A JavaScript solution that catches the common ones:

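// Map of entity names (without the leading & and trailing ;) to their characters.
var map = {amp: '&', lt: '<', gt: '>', quot: '"', '#039': "'"};
// Entities that are not in the map are left unchanged instead of becoming "undefined".
str = str.replace(/&(#?\w+);/g, (m, c) => map.hasOwnProperty(c) ? map[c] : m);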

this is the reverse of https://stackoverflow.com/a/4835406/2738039
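To show what "reverse" means in practice, here is a minimal round-trip sketch. The escapeHtml function is a hypothetical stand-in written for this example, not a copy of the linked answer, and unescapeHtml uses a slightly stricter pattern than `[^;]+` so that a stray & without a closing semicolon cannot swallow a later entity.

```javascript
// Both maps cover the same five characters; names are made up for this sketch.
var entityToChar = {amp: '&', lt: '<', gt: '>', quot: '"', '#039': "'"};
var charToEntity = {'&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#039;'};

// Hypothetical stand-in for the escaping direction.
function escapeHtml(text) {
    return String(text).replace(/[&<>"']/g, function (ch) {
        return charToEntity[ch];
    });
}

// Decoding direction: unknown entities are returned unchanged.
function unescapeHtml(text) {
    return String(text).replace(/&(#?\w+);/g, function (match, name) {
        return Object.prototype.hasOwnProperty.call(entityToChar, name) ? entityToChar[name] : match;
    });
}

var original = '<a href="x">Tom & Jerry\'s</a>';
var escaped = escapeHtml(original);

console.log(escaped); // &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#039;s&lt;/a&gt;
console.log(unescapeHtml(escaped) === original); // true
console.log(unescapeHtml('a & b &amp; c'));      // "a & b & c"
```

Because both directions run in a single pass, escaping and then unescaping returns the original string for these five characters.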
