Original source: http://stackoverflow.com/questions/5499078/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
Fastest method to escape HTML tags as HTML entities?
Asked by callum
I'm writing a Chrome extension that involves doing a lot of the following job: sanitizing strings that might contain HTML tags, by converting <, > and & to &lt;, &gt; and &amp;, respectively.
(In other words, the same as PHP's htmlspecialchars(str, ENT_NOQUOTES); I don't think there's any real need to convert double-quote characters.)
This is the fastest function I have found so far:
function safe_tags(str) {
    // Escape & first so the entities produced below are not escaped again.
    return str.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}
But there's still a big lag when I have to run a few thousand strings through it in one go.
Can anyone improve on this? It's mostly for strings between 10 and 150 characters, if that makes a difference.
(One idea I had was not to bother encoding the greater-than sign – would there be any real danger with that?)
Accepted answer by Martijn
You could try passing a callback function to perform the replacement:
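// Lookup table from each special character to its HTML entity.
var tagsToReplace = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;'
};

function replaceTag(tag) {
    return tagsToReplace[tag] || tag;
}

function safe_tags_replace(str) {
    // A single pass over the string; replaceTag is called once per match.
    return str.replace(/[&<>]/g, replaceTag);
}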
Here is a performance test: http://jsperf.com/encode-html-entities to compare with calling the replace function repeatedly, and using the DOM method proposed by Dmitrij.
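To reproduce this kind of comparison locally, a rough timing harness could look like the sketch below. The names chainedReplace, callbackReplace and timeIt, the sample strings, and the round count are all assumptions made for illustration. Results depend heavily on the JavaScript engine and the input, so treat any numbers as indicative only.

```javascript
// Two variants under test: chained replace calls vs. a single replace with a callback.
function chainedReplace(str) {
  return str.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

var entityMap = { '&': '&amp;', '<': '&lt;', '>': '&gt;' };
function callbackReplace(str) {
  return str.replace(/[&<>]/g, function (ch) { return entityMap[ch]; });
}

// Hypothetical helper: runs fn over every sample `rounds` times and returns elapsed ms.
function timeIt(fn, samples, rounds) {
  var start = Date.now();
  for (var r = 0; r < rounds; r++) {
    for (var i = 0; i < samples.length; i++) {
      fn(samples[i]);
    }
  }
  return Date.now() - start;
}

// Made-up sample data in the 10 to 150 character range mentioned in the question.
var samples = [
  'plain text, nothing to escape',
  '<b>bold</b> & <i>italic</i>',
  'a < b && b > c, repeated: ' + new Array(6).join('<p>x & y</p>')
];

console.log('chained replace: ' + timeIt(chainedReplace, samples, 20000) + ' ms');
console.log('callback replace: ' + timeIt(callbackReplace, samples, 20000) + ' ms');
```

The DOM-based variants are left out here because they need a browser document; they could be added as further functions passed to timeIt.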
Your way seems to be faster...
Why do you need it, though?
Answer by Web_Designer
Here's one way you can do this:
// One detached textarea is reused for every call.
var escape = document.createElement('textarea');

function escapeHTML(html) {
    escape.textContent = html;
    return escape.innerHTML;
}

function unescapeHTML(html) {
    escape.innerHTML = html;
    return escape.textContent;
}
Answer by Aram Kocharyan
Martijn's method as a prototype function:
String.prototype.escape = function() {
    var tagsToReplace = {
        '&': '&amp;',
        '<': '&lt;',
        '>': '&gt;'
    };
    return this.replace(/[&<>]/g, function(tag) {
        return tagsToReplace[tag] || tag;
    });
};

var a = "<abc>";
var b = a.escape(); // "&lt;abc&gt;"
Answer by Julien Kronegg
The fastest method is:
function escapeHTML(html) {
    return document.createElement('div').appendChild(document.createTextNode(html)).parentNode.innerHTML;
}
This method is about twice as fast as the methods based on 'replace'; see http://jsperf.com/htmlencoderegex/35.
Answer by Todd
An even quicker/shorter solution is:
escaped = new Option(html).innerHTML
This is related to some weird vestige of JavaScript whereby the Option element retains a constructor that does this sort of escaping automatically.
Answer by Kevin Hakanson
The AngularJS source code also has a version inside of angular-sanitize.js.
var SURROGATE_PAIR_REGEXP = /[\uD800-\uDBFF][\uDC00-\uDFFF]/g,
    // Match everything outside of normal chars and " (quote character)
    NON_ALPHANUMERIC_REGEXP = /([^\#-~| |!])/g;

/**
 * Escapes all potentially dangerous characters, so that the
 * resulting string can be safely inserted into attribute or
 * element text.
 * @param value
 * @returns {string} escaped text
 */
function encodeEntities(value) {
  return value.
    replace(/&/g, '&amp;').
    replace(SURROGATE_PAIR_REGEXP, function(value) {
      var hi = value.charCodeAt(0);
      var low = value.charCodeAt(1);
      return '&#' + (((hi - 0xD800) * 0x400) + (low - 0xDC00) + 0x10000) + ';';
    }).
    replace(NON_ALPHANUMERIC_REGEXP, function(value) {
      return '&#' + value.charCodeAt(0) + ';';
    }).
    replace(/</g, '&lt;').
    replace(/>/g, '&gt;');
}
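The surrogate-pair arithmetic in encodeEntities can be checked in isolation. The sketch below pulls that expression into a helper, here called surrogatePairToCodePoint (a name chosen for illustration, not part of AngularJS), and compares its result with the built-in String.prototype.codePointAt from ES2015.

```javascript
// Hypothetical helper mirroring the expression used inside encodeEntities.
function surrogatePairToCodePoint(pair) {
  var hi = pair.charCodeAt(0);   // high surrogate, 0xD800..0xDBFF
  var low = pair.charCodeAt(1);  // low surrogate, 0xDC00..0xDFFF
  return ((hi - 0xD800) * 0x400) + (low - 0xDC00) + 0x10000;
}

var smiley = '\uD83D\uDE00'; // U+1F600 written as a UTF-16 surrogate pair
var codePoint = surrogatePairToCodePoint(smiley);

console.log(codePoint);                            // 128512 (0x1F600)
console.log(codePoint === smiley.codePointAt(0));  // true
console.log('&#' + codePoint + ';');               // &#128512;
```

Working it through by hand: 0xD83D - 0xD800 is 0x3D, times 0x400 gives 0xF400; 0xDE00 - 0xDC00 is 0x200; adding both to 0x10000 yields 0x1F600.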
Answer by baptx
All-in-one script:
// HTML entities Encode/Decode
function htmlspecialchars(str) {
    var map = {
        "&": "&amp;",
        "<": "&lt;",
        ">": "&gt;",
        "\"": "&quot;",
        "'": "&#39;" // ' -> &apos; for XML only
    };
    return str.replace(/[&<>"']/g, function(m) { return map[m]; });
}

function htmlspecialchars_decode(str) {
    var map = {
        "&amp;": "&",
        "&lt;": "<",
        "&gt;": ">",
        "&quot;": "\"",
        "&#39;": "'"
    };
    return str.replace(/(&amp;|&lt;|&gt;|&quot;|&#39;)/g, function(m) { return map[m]; });
}

function htmlentities(str) {
    var textarea = document.createElement("textarea");
    textarea.innerHTML = str;
    return textarea.innerHTML;
}

function htmlentities_decode(str) {
    var textarea = document.createElement("textarea");
    textarea.innerHTML = str;
    return textarea.value;
}
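A quick round trip shows how an encode/decode pair of this kind is meant to behave. The sketch below is a standalone variant that derives the decode table from the encode table instead of writing both by hand. The names ENCODE_MAP, DECODE_MAP, encodeSpecialChars and decodeSpecialChars are assumptions for illustration and are not part of the answer above.

```javascript
var ENCODE_MAP = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };

// Build the inverse table so the two directions cannot drift apart.
var DECODE_MAP = {};
Object.keys(ENCODE_MAP).forEach(function (ch) {
  DECODE_MAP[ENCODE_MAP[ch]] = ch;
});

function encodeSpecialChars(str) {
  return str.replace(/[&<>"']/g, function (ch) { return ENCODE_MAP[ch]; });
}

function decodeSpecialChars(str) {
  // One pass over the input, so "&amp;lt;" decodes to "&lt;" and not to "<".
  return str.replace(/&(?:amp|lt|gt|quot|#39);/g, function (entity) {
    return DECODE_MAP[entity];
  });
}

var input = '<a href="x">Tom & Jerry\'s</a>';
var encoded = encodeSpecialChars(input);

console.log(encoded);
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#39;s&lt;/a&gt;
console.log(decodeSpecialChars(encoded) === input); // true
console.log(decodeSpecialChars('&amp;lt;'));        // &lt;
```

Decoding in a single regex pass matters: chaining separate replace calls with &amp; handled first would turn "&amp;lt;" into "<", which is not what was originally encoded.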
Answer by Dave Brown
function encode(r) {
    return r.replace(/[\x26\x0A\x3c\x3e\x22\x27]/g, function(r) {
        return "&#" + r.charCodeAt(0) + ";";
    });
}
test.value = encode('How to encode\nonly html tags &<>\'" nice & fast!');
/*
\x26 is & (ampersand; it has to be first),
\x0A is newline,
\x22 is ",
\x27 is ',
\x3c is <,
\x3e is >
*/
<textarea id=test rows=11 cols=55>www.WHAK.com</textarea>
Answer by iman
Martijn's method as a single function that also handles the " mark (for use in JavaScript):
function escapeHTML(html) {
    var fn = function(tag) {
        var charsToReplace = {
            '&': '&amp;',
            '<': '&lt;',
            '>': '&gt;',
            '"': '&quot;'
        };
        return charsToReplace[tag] || tag;
    };
    return html.replace(/[&<>"]/g, fn);
}
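Since the question is about speed, note that the function above, as written, builds charsToReplace again for every matched character. One possible variation, sketched here on the assumption that the lookup table never changes, creates the table and the callback once. The names CHARS_TO_REPLACE, replaceChar and escapeHTMLHoisted are made up for this example, and whether the change is measurably faster depends on the engine.

```javascript
// Created once at load time instead of once per matched character.
var CHARS_TO_REPLACE = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;' };

function replaceChar(ch) {
  return CHARS_TO_REPLACE[ch] || ch;
}

function escapeHTMLHoisted(html) {
  return html.replace(/[&<>"]/g, replaceChar);
}

console.log(escapeHTMLHoisted('<a title="x & y">'));
// &lt;a title=&quot;x &amp; y&quot;&gt;
```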