JavaScript HTML entity decoding
原文地址: http://stackoverflow.com/questions/5796718/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
HTML Entity Decode
Asked by chris
How do I encode and decode HTML entities using JavaScript or jQuery?
var varTitle = "Chris&apos; corner";
I want it to be:
var varTitle = "Chris' corner";
Accepted answer by David says reinstate Monica
You could try something like:
var Title = $('<textarea />').html("Chris&apos; corner").text();
console.log(Title);
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
A more interactive version:
$('form').submit(function() {
  var theString = $('#string').val();
  var varTitle = $('<textarea />').html(theString).text();
  $('#output').text(varTitle);
  return false;
});
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<form action="#" method="post">
  <fieldset>
    <label for="string">Enter a html-encoded string to decode</label>
    <input type="text" name="string" id="string" />
  </fieldset>
  <fieldset>
    <input type="submit" value="decode" />
  </fieldset>
</form>
<div id="output"></div>
Answer by Robert K
I recommend against using the jQuery code that was accepted as the answer. While it does not insert the string to decode into the page, it does cause things such as scripts and HTML elements to get created. This is way more code than we need. Instead, I suggest using a safer, more optimized function.
var decodeEntities = (function() {
  // this prevents any overhead from creating the object each time
  var element = document.createElement('div');
  function decodeHTMLEntities (str) {
    if(str && typeof str === 'string') {
      // strip script/html tags
      str = str.replace(/<script[^>]*>([\S\s]*?)<\/script>/gmi, '');
      str = str.replace(/<\/?\w(?:[^"'>]|"[^"]*"|'[^']*')*>/gmi, '');
      element.innerHTML = str;
      str = element.textContent;
      element.textContent = '';
    }
    return str;
  }
  return decodeHTMLEntities;
})();
To use this function, just call decodeEntities("&amp;") and it will use the same underlying techniques as the jQuery version, but without jQuery's overhead, and after sanitizing the HTML tags in the input. See Mike Samuel's comment on the accepted answer for how to filter out HTML tags.
This function can be easily used as a jQuery plugin by adding the following line in your project.
jQuery.decodeEntities = decodeEntities;
Answer by Alan Hamlett
Like Robert K said, don't use jQuery.html().text() to decode HTML entities: it is unsafe, because user input should never have access to the DOM. Read about XSS for why this is unsafe.
Instead, try the Underscore.js utility-belt library, which comes with escape and unescape methods:
Escapes a string for insertion into HTML, replacing &, <, >, ", `, and ' characters.
_.escape('Curly, Larry & Moe');
=> "Curly, Larry &amp; Moe"
The opposite of escape, replaces &amp;, &lt;, &gt;, &quot;, &#96; and &#x27; with their unescaped counterparts.
_.unescape('Curly, Larry &amp; Moe');
=> "Curly, Larry & Moe"
To support decoding more characters, just copy the Underscore unescape method and add more characters to the map.
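The idea of an extendable entity map can be sketched without Underscore at all. The following is a minimal illustration, not Underscore's actual source: the names entityMap, entityPattern and unescapeWithMap are hypothetical, and the extra entries (&nbsp;, &copy;) are examples of what one might add.

```javascript
// Hypothetical entity map (not Underscore's own); extend it as needed.
const entityMap = {
  '&amp;': '&',
  '&lt;': '<',
  '&gt;': '>',
  '&quot;': '"',
  '&#x27;': "'",
  '&#96;': '`',
  '&#x60;': '`',
  // Extra entries added purely for illustration:
  '&nbsp;': '\u00A0',
  '&copy;': '\u00A9'
};

// One alternation over all keys, so every entity is replaced in a single pass.
// The keys contain no regex metacharacters, so they can be joined as-is.
const entityPattern = new RegExp(Object.keys(entityMap).join('|'), 'g');

function unescapeWithMap(text) {
  return String(text).replace(entityPattern, (match) => entityMap[match]);
}

console.log(unescapeWithMap('Curly, Larry &amp; Moe &copy; 1934'));
```

Because the replacement happens in one pass, a double-encoded input such as &amp;lt; comes out as &lt; rather than being decoded twice.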
Answer by William Lahti
Here's a quick method that doesn't require creating a div, and decodes the "most common" HTML escaped chars:
function decodeHTMLEntities(text) {
  var entities = [
    ['apos', '\''],
    ['#x27', '\''],
    ['#x2F', '/'],
    ['#39', '\''],
    ['#47', '/'],
    ['lt', '<'],
    ['gt', '>'],
    ['nbsp', ' '],
    ['quot', '"'],
    // 'amp' goes last so that '&amp;lt;' decodes to '&lt;' and is not decoded twice
    ['amp', '&']
  ];
  for (var i = 0, max = entities.length; i < max; ++i) {
    text = text.replace(new RegExp('&' + entities[i][0] + ';', 'g'), entities[i][1]);
  }
  return text;
}
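A fixed table like the one above covers only the entities it lists and runs one replacement pass per entry. As a rough alternative sketch (not part of the original answer), a single regular expression can handle any decimal or hexadecimal numeric reference plus a small named table in one pass. The names namedEntities and decodeEntitiesSinglePass are hypothetical, and the named table is deliberately tiny.

```javascript
// Hypothetical named-entity table; only a handful of names are listed.
const namedEntities = {
  amp: '&', apos: "'", lt: '<', gt: '>', nbsp: '\u00A0', quot: '"'
};

function decodeEntitiesSinglePass(text) {
  // Matches &#xHH; (hex), &#NNN; (decimal) and &name; references.
  return text.replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z][a-z0-9]*);/gi, (match, body) => {
    if (body[0] === '#') {
      const isHex = body[1] === 'x' || body[1] === 'X';
      const codePoint = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range references untouched instead of throwing.
      return codePoint <= 0x10FFFF ? String.fromCodePoint(codePoint) : match;
    }
    // Unknown names are left as they are.
    return Object.prototype.hasOwnProperty.call(namedEntities, body)
      ? namedEntities[body]
      : match;
  });
}

console.log(decodeEntitiesSinglePass('Large &lt; &#163; 500'));
```

Each match is replaced exactly once, so the order of table entries does not matter and &amp;lt; yields &lt; rather than <.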
Answer by insign
This is my favourite way of decoding HTML characters. The advantage of using this code is that tags are also preserved.
function decodeHtml(html) {
  var txt = document.createElement("textarea");
  txt.innerHTML = html;
  return txt.value;
}
Example: http://jsfiddle.net/k65s3/
Input:
Entity:&nbsp;Bad attempt at XSS:<script>alert('new\nline?')</script><br>
Output:
Entity: Bad attempt at XSS:<script>alert('new\nline?')</script><br>
Answer by mattcasey
Inspired by Robert K's solution, this version does not strip HTML tags, and is just as secure.
var decode_entities = (function() {
  // Remove HTML Entities
  var element = document.createElement('div');
  function decode_HTML_entities (str) {
    if(str && typeof str === 'string') {
      // Escape HTML before decoding for HTML Entities
      str = escape(str).replace(/%26/g,'&').replace(/%23/g,'#').replace(/%3B/g,';');
      element.innerHTML = str;
      if(element.innerText){
        str = element.innerText;
        element.innerText = '';
      }else{
        // Firefox support
        str = element.textContent;
        element.textContent = '';
      }
    }
    return unescape(str);
  }
  return decode_HTML_entities;
})();
Answer by Mirodil
Here is another version:
function convertHTMLEntity(text){
  // one span is reused to decode each matched entity
  const span = document.createElement('span');
  return text
    .replace(/&[#A-Za-z0-9]+;/gi, (entity)=> {
      span.innerHTML = entity;
      return span.innerText;
    });
}
console.log(convertHTMLEntity('Large &lt; &#163; 500'));
Answer by Jason Williams
jQuery provides a way to encode and decode HTML entities.
If you use a "<div/>" tag, it will strip out all the HTML.
function htmlDecode(value) {
  return $("<div/>").html(value).text();
}
function htmlEncode(value) {
  return $('<div/>').text(value).html();
}
If you use a "<textarea/>" tag, it will preserve the html tags.
function htmlDecode(value) {
  return $("<textarea/>").html(value).text();
}
function htmlEncode(value) {
  return $('<textarea/>').text(value).html();
}
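The helpers above need both jQuery and a DOM. Where neither is available, the encoding direction can be approximated with a plain string replacement. This is only a sketch under that assumption: htmlEncodeNoDom and escapeMap are hypothetical names, and the output is not guaranteed to match jQuery's, since this version also escapes quote characters, which DOM serialization of text content generally leaves alone.

```javascript
// Characters that are significant in HTML text and attribute values.
const escapeMap = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;'
};

// Hypothetical DOM-free encoder: replaces each special character once.
function htmlEncodeNoDom(value) {
  return String(value).replace(/[&<>"']/g, (ch) => escapeMap[ch]);
}

console.log(htmlEncodeNoDom('<a href="x">Tom & Jerry\'s</a>'));
```

Escaping the quotes as well makes the result usable inside quoted attribute values, not just in element content.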
Answer by Tyler Johnson
To add yet another "inspired by Robert K" to the list, here is another safe version which does not strip HTML tags. Instead of running the whole string through the HTML parser, it pulls out only the entities and converts those.
var decodeEntities = (function() {
  // this prevents any overhead from creating the object each time
  var element = document.createElement('div');
  // regular expression matching HTML entities
  var entity = /&(?:#x[a-f0-9]+|#[0-9]+|[a-z0-9]+);?/ig;
  return function decodeHTMLEntities(str) {
    // find and replace all the html entities
    str = str.replace(entity, function(m) {
      element.innerHTML = m;
      return element.textContent;
    });
    // reset the value
    element.textContent = '';
    return str;
  }
})();
Answer by VyvIT
Inspired by Robert K's solution, this version strips HTML tags and prevents scripts and event handlers from executing, for example: <img src=fake onerror="prompt(1)">
Tested on the latest Chrome, Firefox and IE (it should work from IE9 on, but I haven't tested that).
var decodeEntities = (function () {
  //create a new html document (doesn't execute script tags in child elements)
  var doc = document.implementation.createHTMLDocument("");
  var element = doc.createElement('div');
  function getText(str) {
    element.innerHTML = str;
    str = element.textContent;
    element.textContent = '';
    return str;
  }
  function decodeHTMLEntities(str) {
    if (str && typeof str === 'string') {
      // decode repeatedly until the text stops changing
      var x = getText(str);
      while (str !== x) {
        str = x;
        x = getText(x);
      }
      return x;
    }
    // empty or non-string input is returned unchanged
    return str;
  }
  return decodeHTMLEntities;
})();
Simply call:
decodeEntities('<img src=fake onerror="prompt(1)">');
decodeEntities("<script>alert('aaa!')</script>");
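The loop in decodeHTMLEntities above keeps decoding until the text stops changing. That fixed-point idea can be shown in isolation without a DOM. In the sketch below, decodeOnce is a hypothetical stand-in that knows only three entities, and decodeUntilStable with its maxPasses guard is an illustrative name and design, not part of the original answer.

```javascript
// Hypothetical single-step decoder covering only three entities.
function decodeOnce(text) {
  const table = { '&amp;': '&', '&lt;': '<', '&gt;': '>' };
  return text.replace(/&(?:amp|lt|gt);/g, (m) => table[m]);
}

// Re-apply the decoder until a pass changes nothing (or the cap is reached).
function decodeUntilStable(text, maxPasses = 10) {
  let current = text;
  for (let pass = 0; pass < maxPasses; pass++) {
    const next = decodeOnce(current);
    if (next === current) break;
    current = next;
  }
  return current;
}

console.log(decodeUntilStable('&amp;amp;lt;b&amp;amp;gt;'));
```

Repeated decoding unwraps multiply-encoded input, which is the opposite trade-off from a single-pass decoder; which behaviour is wanted depends on where the string came from.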