What is the fastest way to escape HTML tags as HTML entities in JavaScript?

Question

Asked by callum

I'm writing a Chrome extension that involves doing a lot of the following job: sanitizing strings that might contain HTML tags, by converting <, > and & to &lt;, &gt; and &amp;, respectively.

(In other words, the same as PHP's htmlspecialchars(str, ENT_NOQUOTES); I don't think there's any real need to convert double-quote characters.)

This is the fastest function I have found so far:

function safe_tags(str) {
    return str.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

But there's still a big lag when I have to run a few thousand strings through it in one go.

Can anyone improve on this? It's mostly for strings between 10 and 150 characters, if that makes a difference.

(One idea I had was not to bother encoding the greater-than sign – would there be any real danger with that?)

Answer 1

Accepted answer by Martijn

You could try passing a callback function to perform the replacement:

var tagsToReplace = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;'
};

function replaceTag(tag) {
    return tagsToReplace[tag] || tag;
}

function safe_tags_replace(str) {
    return str.replace(/[&<>]/g, replaceTag);
}

Here is a performance test: http://jsperf.com/encode-html-entities to compare this with calling the replace function repeatedly, and with the DOM method proposed by Dmitrij.

Your way seems to be faster...
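
For readers who want to repeat the comparison locally, here is a rough timing sketch. It is not the jsPerf test itself: the sample strings, the round count, and the names safeTagsChained, safeTagsCallback and timeIt are assumptions made for illustration, and the numbers will vary by engine and input.

```javascript
// Chained replace, as in the question.
function safeTagsChained(str) {
  return str.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Single pass with a callback, as in this answer.
var tagsToReplace = { '&': '&amp;', '<': '&lt;', '>': '&gt;' };
function safeTagsCallback(str) {
  return str.replace(/[&<>]/g, function (tag) {
    return tagsToReplace[tag] || tag;
  });
}

// Hypothetical test data: a few thousand short strings, as described in the question.
var samples = [];
for (var i = 0; i < 5000; i++) {
  samples.push('<b>item ' + i + '</b> & <i>more text</i>');
}

// Run fn over all samples for a fixed number of rounds; returns elapsed milliseconds.
function timeIt(fn) {
  var start = Date.now();
  for (var round = 0; round < 20; round++) {
    for (var j = 0; j < samples.length; j++) {
      fn(samples[j]);
    }
  }
  return Date.now() - start;
}

console.log('chained replace: ' + timeIt(safeTagsChained) + ' ms');
console.log('callback replace: ' + timeIt(safeTagsCallback) + ' ms');
```

Both functions return the same output for the same input, so only the timings differ. A single run like this is noisy; repeating it a few times gives a fairer picture.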

Why do you need it, though?

Answer 2

Answered by Web_Designer

Here's one way you can do this:

var escape = document.createElement('textarea');
function escapeHTML(html) {
    escape.textContent = html;
    return escape.innerHTML;
}

function unescapeHTML(html) {
    escape.innerHTML = html;
    return escape.textContent;
}

Here's a demo.

Answer 3

Answered by Aram Kocharyan

Martijn's method as a prototype function:

String.prototype.escape = function() {
    var tagsToReplace = {
        '&': '&amp;',
        '<': '&lt;',
        '>': '&gt;'
    };
    return this.replace(/[&<>]/g, function(tag) {
        return tagsToReplace[tag] || tag;
    });
};

var a = "<abc>";
var b = a.escape(); // "&lt;abc&gt;"

Answer 4

Answered by Julien Kronegg

The fastest method is:

function escapeHTML(html) {
    return document.createElement('div').appendChild(document.createTextNode(html)).parentNode.innerHTML;
}

This method is about twice as fast as the methods based on 'replace'; see http://jsperf.com/htmlencoderegex/35.

Source: https://stackoverflow.com/a/17546215/698168

Answer 5

Answered by Todd

An even quicker/shorter solution is:

escaped = new Option(html).innerHTML

This is related to some weird vestige of JavaScript whereby the Option element retains a constructor that does this sort of escaping automatically.

Credit to https://github.com/jasonmoo/t.js/blob/master/t.js

Answer 6

Answered by Kevin Hakanson

The AngularJS source code also has a version inside of angular-sanitize.js.

var SURROGATE_PAIR_REGEXP = /[\uD800-\uDBFF][\uDC00-\uDFFF]/g,
    // Match everything outside of normal chars and " (quote character)
    NON_ALPHANUMERIC_REGEXP = /([^\#-~| |!])/g;
/**
 * Escapes all potentially dangerous characters, so that the
 * resulting string can be safely inserted into attribute or
 * element text.
 * @param value
 * @returns {string} escaped text
 */
function encodeEntities(value) {
  return value.
    replace(/&/g, '&amp;').
    replace(SURROGATE_PAIR_REGEXP, function(value) {
      var hi = value.charCodeAt(0);
      var low = value.charCodeAt(1);
      return '&#' + (((hi - 0xD800) * 0x400) + (low - 0xDC00) + 0x10000) + ';';
    }).
    replace(NON_ALPHANUMERIC_REGEXP, function(value) {
      return '&#' + value.charCodeAt(0) + ';';
    }).
    replace(/</g, '&lt;').
    replace(/>/g, '&gt;');
}
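
The surrogate-pair step is the least obvious part of the code above, so here is a small worked example of that arithmetic on its own. The helper name surrogatePairToCodePoint is made up for illustration and is not part of AngularJS; the check against the built-in codePointAt assumes an ES2015 or later engine.

```javascript
// Same arithmetic as the SURROGATE_PAIR_REGEXP callback above.
function surrogatePairToCodePoint(pair) {
  var hi = pair.charCodeAt(0);   // high surrogate, 0xD800-0xDBFF
  var low = pair.charCodeAt(1);  // low surrogate, 0xDC00-0xDFFF
  return ((hi - 0xD800) * 0x400) + (low - 0xDC00) + 0x10000;
}

var emoji = '\uD83D\uDE00'; // U+1F600 written as a surrogate pair
var codePoint = surrogatePairToCodePoint(emoji);

console.log(codePoint === emoji.codePointAt(0)); // true
console.log('&#' + codePoint + ';');             // &#128512;
```

The pair becomes a single numeric character reference for the full code point, rather than two references for the individual surrogate halves.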

Answer 7

Answered by baptx

All-in-one script:

// HTML entities Encode/Decode

function htmlspecialchars(str) {
    var map = {
        "&": "&amp;",
        "<": "&lt;",
        ">": "&gt;",
        "\"": "&quot;",
        "'": "&#39;" // ' -> &apos; for XML only
    };
    return str.replace(/[&<>"']/g, function(m) { return map[m]; });
}
function htmlspecialchars_decode(str) {
    var map = {
        "&amp;": "&",
        "&lt;": "<",
        "&gt;": ">",
        "&quot;": "\"",
        "&#39;": "'"
    };
    return str.replace(/(&amp;|&lt;|&gt;|&quot;|&#39;)/g, function(m) { return map[m]; });
}
function htmlentities(str) {
    var textarea = document.createElement("textarea");
    textarea.textContent = str; // textContent (not innerHTML) so entities already present in str are escaped too
    return textarea.innerHTML;
}
function htmlentities_decode(str) {
    var textarea = document.createElement("textarea");
    textarea.innerHTML = str;
    return textarea.value;
}

http://pastebin.com/JGCVs0Ts
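
One detail of the decode function above is worth spelling out: it handles all entities in a single pass, so text that was escaped twice is only unescaped once. The sketch below shows a round trip and, for comparison, a chained decode that handles &amp; first and therefore decodes too much. The function names here are hypothetical and the maps are copied from the script above.

```javascript
var encodeMap = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
var decodeMap = { '&amp;': '&', '&lt;': '<', '&gt;': '>', '&quot;': '"', '&#39;': "'" };

function encodeSpecialChars(str) {
  return str.replace(/[&<>"']/g, function (m) { return encodeMap[m]; });
}

// Single pass: each entity is decoded once and the result is not rescanned.
function decodeSinglePass(str) {
  return str.replace(/&amp;|&lt;|&gt;|&quot;|&#39;/g, function (m) { return decodeMap[m]; });
}

// Chained decode that handles &amp; first (for comparison only):
// the '&' it produces can form a new entity that a later replace decodes again.
function decodeChainedAmpFirst(str) {
  return str.replace(/&amp;/g, '&').replace(/&lt;/g, '<').replace(/&gt;/g, '>')
            .replace(/&quot;/g, '"').replace(/&#39;/g, "'");
}

var original = 'Write &lt; to show a "<" sign';
var encoded = encodeSpecialChars(original);

console.log(encoded);
console.log(decodeSinglePass(encoded) === original);      // true
console.log(decodeChainedAmpFirst(encoded) === original); // false
```

If a chained decode is preferred anyway, moving the &amp; replacement to the end avoids the double decoding.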

Answer 8

Answered by Dave Brown

function encode(r) {
  return r.replace(/[\x26\x0A\x3c\x3e\x22\x27]/g, function(r) {
    return "&#" + r.charCodeAt(0) + ";";
  });
}

test.value=encode('How to encode\nonly html tags &<>\'" nice & fast!');

/*
 \x26 is & (ampersand; its position does not matter inside a single-pass character class),
 \x0A is newline,
 \x22 is ",
 \x27 is ',
 \x3c is <,
 \x3e is >
*/

<textarea id=test rows=11 cols=55>www.WHAK.com</textarea>

Answer 9

Answered by iman

Martijn's method as a single function that also handles the " mark (used in JavaScript):

function escapeHTML(html) {
    var fn=function(tag) {
        var charsToReplace = {
            '&': '&amp;',
            '<': '&lt;',
            '>': '&gt;',
            '"': '&#34;'
        };
        return charsToReplace[tag] || tag;
    }
    return html.replace(/[&<>"]/g, fn);
}
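
To illustrate why escaping the " mark can matter, here is a small sketch of escaped text placed inside a double-quoted attribute value. The input string and the escapeNoQuotes helper are hypothetical, and the lookup table is hoisted out of the callback so it is not rebuilt on every match; treat it as an illustration, not a complete sanitizer.

```javascript
var charsToReplace = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&#34;' };

// Same idea as the function above, with the table created once.
function escapeHTML(html) {
  return html.replace(/[&<>"]/g, function (ch) {
    return charsToReplace[ch] || ch;
  });
}

// Escapes only & < > and leaves quotes alone, as in the question.
function escapeNoQuotes(html) {
  return html.replace(/&/g, '&amp;').replace(/</g, '&lt;').replace(/>/g, '&gt;');
}

// Hypothetical untrusted input that tries to break out of an attribute value.
var userInput = 'x" onmouseover="alert(1)';

var unsafe = '<input value="' + escapeNoQuotes(userInput) + '">';
var safe = '<input value="' + escapeHTML(userInput) + '">';

console.log(unsafe); // <input value="x" onmouseover="alert(1)">
console.log(safe);   // <input value="x&#34; onmouseover=&#34;alert(1)">
```

In the first result the quote ends the attribute early and the rest is parsed as a new attribute; in the second it stays inside the value. This only covers double-quoted attributes; single-quoted or unquoted attributes would need more characters escaped.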

Answer 10

Answered by gilmatic

I'm not entirely sure about speed, but if you are looking for simplicity I would suggest using the lodash/underscore escape function.
