What's the right way to decode a string that has special HTML entities in it?
Note: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/7394748/
Asked by Dan Tao
Say I get some JSON back from a service request that looks like this:
{
  "message": "We&#39;re unable to complete your request at this time."
}
I'm not sure why that apostrophe is encoded like that (&#39;); all I know is that I want to decode it.
Here's one approach using jQuery that popped into my head:
function decodeHtml(html) {
  return $('<div>').html(html).text();
}
That seems (very) hacky, though. What's a better way? Is there a "right" way?
Answered by Rob W
This is my favourite way of decoding HTML characters. The advantage of using this code is that tags are also preserved.
function decodeHtml(html) {
  var txt = document.createElement("textarea");
  txt.innerHTML = html;
  return txt.value;
}
Example: http://jsfiddle.net/k65s3/
Input:
Entity:&nbsp;Bad attempt at XSS:<script>alert('new\nline?')</script><br>
Output:
Entity: Bad attempt at XSS:<script>alert('new\nline?')</script><br>
Answered by Mathias Bynens
Don't use the DOM to do this. Using the DOM to decode HTML entities (as suggested in the currently accepted answer) leads to differences in cross-browser results.
For a robust & deterministic solution that decodes character references according to the algorithm in the HTML Standard, use the he library. From its README:
he (for "HTML entities") is a robust HTML entity encoder/decoder written in JavaScript. It supports all standardized named character references as per HTML, handles ambiguous ampersands and other edge cases just like a browser would, has an extensive test suite, and, contrary to many other JavaScript solutions, he handles astral Unicode symbols just fine. An online demo is available.
Here's how you'd use it:
he.decode("We&#39;re unable to complete your request at this time.");
// → "We're unable to complete your request at this time."
Disclaimer: I'm the author of the he library.
See this Stack Overflow answer for some more info.
Answered by Alxandr
If you don't want to use html/dom, you could use regex. I haven't tested this; but something along the lines of:
function parseHtmlEntities(str) {
  return str.replace(/&#([0-9]{1,3});/gi, function(match, numStr) {
    var num = parseInt(numStr, 10); // read num as normal number
    return String.fromCharCode(num);
  });
}
[Edit]
Note: this only works for numeric HTML entities, and not for named ones like &oring;.
[Edit 2]
Fixed the function (some typos), test here: http://jsfiddle.net/Be2Bd/1/
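The regex in parseHtmlEntities matches only one to three decimal digits and builds characters with String.fromCharCode, so it skips hexadecimal references such as &#x27; and anything above U+FFFF. Below is a minimal sketch of a wider version. The name decodeCharRefs and the small NAMED table are illustrative assumptions, not part of the original answer. The table covers only a handful of named entities, and the function does not implement the edge-case handling of the HTML Standard.

```javascript
// Hypothetical helper: decodes decimal, hex, and a few named character references.
// NAMED is a deliberately tiny illustrative table, not the full HTML entity list.
var NAMED = { amp: '&', lt: '<', gt: '>', quot: '"', apos: "'", nbsp: '\u00A0' };

function decodeCharRefs(str) {
  return str.replace(/&(#x[0-9a-f]+|#[0-9]+|[a-z][a-z0-9]*);/gi, function (match, body) {
    if (body.charAt(0) === '#') {
      var isHex = body.charAt(1) === 'x' || body.charAt(1) === 'X';
      var code = parseInt(body.slice(isHex ? 2 : 1), isHex ? 16 : 10);
      // Leave out-of-range or surrogate values untouched instead of throwing.
      if (code > 0x10FFFF || (code >= 0xD800 && code <= 0xDFFF)) return match;
      return String.fromCodePoint(code); // handles code points above U+FFFF
    }
    // Named reference: decode only if it is in the small table.
    return Object.prototype.hasOwnProperty.call(NAMED, body) ? NAMED[body] : match;
  });
}

console.log(decodeCharRefs('We&#39;re &lt;b&gt; &unknown;'));
// → We're <b> &unknown;
```

Unknown names are returned unchanged, so a missing table entry never corrupts the input. For full coverage of named references a complete table or a dedicated library would still be needed.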
Answered by Jason Williams
jQuery will encode and decode for you.
function htmlDecode(value) {
  return $("<textarea/>").html(value).text();
}
function htmlEncode(value) {
  return $('<textarea/>').text(value).html();
}
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.9.1/jquery.min.js"></script>
<script>
  $(document).ready(function() {
    $("#encoded")
      .text(htmlEncode("<img src onerror='alert(0)'>"));
    $("#decoded")
      .text(htmlDecode("&lt;img src onerror='alert(0)'&gt;"));
  });
</script>
<span>htmlEncode() result:</span><br/>
<div id="encoded"></div>
<br/>
<span>htmlDecode() result:</span><br/>
<div id="decoded"></div>
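Outside a browser (for example in Node.js) there is no DOM for jQuery to lean on. Below is a minimal sketch of a DOM-free encoder that escapes only the characters that are significant in HTML text and attribute values. The names escapeHtml and ESCAPES are assumptions for illustration, not jQuery APIs.

```javascript
// Hypothetical DOM-free counterpart to htmlEncode above.
var ESCAPES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

function escapeHtml(value) {
  // Replace each significant character with its character reference.
  return String(value).replace(/[&<>"']/g, function (ch) {
    return ESCAPES[ch];
  });
}

console.log(escapeHtml("<img src onerror='alert(0)'>"));
// → &lt;img src onerror=&#39;alert(0)&#39;&gt;
```

Unlike the textarea-based htmlEncode, this version also escapes quotes, which matters when the result is placed inside an attribute value.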
Answered by hypers
There's a JS function to deal with &#xxxx-styled entities (available on GitHub):
// encode(decode) html text into html entity
var decodeHtmlEntity = function(str) {
  return str.replace(/&#(\d+);/g, function(match, dec) {
    return String.fromCharCode(dec);
  });
};
var encodeHtmlEntity = function(str) {
  var buf = [];
  for (var i = str.length - 1; i >= 0; i--) {
    buf.unshift(['&#', str[i].charCodeAt(), ';'].join(''));
  }
  return buf.join('');
};
var entity = '&#39640;&#32423;&#31243;&#24207;&#35774;&#35745;';
var str = '高级程序设计';
console.log(decodeHtmlEntity(entity) === str);
console.log(encodeHtmlEntity(str) === entity);
// output:
// true
// true
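The encoder above walks the string one UTF-16 code unit at a time, so a character outside the Basic Multilingual Plane would be written as two surrogate references. Below is a minimal sketch of a code-point-aware variant, assuming an ES2015 environment. The names encodeByCodePoint and decodeByCodePoint are hypothetical and do not come from the GitHub function.

```javascript
// Hypothetical variants of the functions above that work on whole code points.
function encodeByCodePoint(str) {
  // Array.from iterates by code point, so a surrogate pair stays together.
  return Array.from(str, function (ch) {
    return '&#' + ch.codePointAt(0) + ';';
  }).join('');
}

function decodeByCodePoint(str) {
  return str.replace(/&#(\d+);/g, function (match, dec) {
    var code = Number(dec);
    // Leave values outside the Unicode range untouched instead of throwing.
    return code <= 0x10FFFF ? String.fromCodePoint(code) : match;
  });
}

var sample = '高级 \uD83D\uDE00'; // the last character is U+1F600
console.log(encodeByCodePoint(sample));
// → &#39640;&#32423;&#32;&#128512;
console.log(decodeByCodePoint(encodeByCodePoint(sample)) === sample);
// → true
```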
Answered by tldr
Answered by kodmanyagha
This is such a good answer. You can use it with Angular like this:
moduleDefinitions.filter('sanitize', ['$sce', function($sce) {
  return function(htmlCode) {
    var txt = document.createElement("textarea");
    txt.innerHTML = htmlCode;
    return $sce.trustAsHtml(txt.value);
  };
}]);