Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/7918868/

Time: 2020-08-24 04:03:57  Source: igfitidea

How to escape XML entities in JavaScript?

javascript

Asked by Zo72

In JavaScript (server-side Node.js) I'm writing a program which generates XML as output.

I am building the XML by concatenating strings:

str += '<' + key + '>';
str += value;
str += '</' + key + '>';

The problem is: what if value contains characters like '&', '>' or '<'? What's the best way to escape those characters?

Or is there a JavaScript library that can escape XML entities?

Answered by zzzzBov

HTML encoding is simply replacing the &, ", ', < and > chars with their entity equivalents. Order matters: if you don't replace the & chars first, you'll double-encode some of the entities:

if (!String.prototype.encodeHTML) {
  String.prototype.encodeHTML = function () {
    return this.replace(/&/g, '&amp;')
               .replace(/</g, '&lt;')
               .replace(/>/g, '&gt;')
               .replace(/"/g, '&quot;')
               .replace(/'/g, '&apos;');
  };
}

As @Johan B.W. de Vries pointed out, this will have issues with the tag names. I would like to clarify that I assumed this was being used for the value only.
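To see why the order matters, here is a minimal sketch that compares the two orderings. The helper names `encodeAmpFirst` and `encodeAmpLast` are made up for this illustration, and they only handle `&` and `<`.

```javascript
// Hypothetical helpers for illustration; only '&' and '<' are handled.
function encodeAmpFirst(s) {
  return s.replace(/&/g, '&amp;').replace(/</g, '&lt;');
}

function encodeAmpLast(s) {
  // Wrong order: the '&' introduced by '&lt;' gets encoded a second time.
  return s.replace(/</g, '&lt;').replace(/&/g, '&amp;');
}

console.log(encodeAmpFirst('a < b & c')); // a &lt; b &amp; c
console.log(encodeAmpLast('a < b & c'));  // a &amp;lt; b &amp; c
```

The second result would be read back by an XML parser as the literal text `a &lt; b & c`, which is not what was written.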

Conversely, if you want to decode HTML entities¹, make sure you decode &amp; to & after everything else so that you don't double-decode any entities:

if (!String.prototype.decodeHTML) {
  String.prototype.decodeHTML = function () {
    return this.replace(/&apos;/g, "'")
               .replace(/&quot;/g, '"')
               .replace(/&gt;/g, '>')
               .replace(/&lt;/g, '<')
               .replace(/&amp;/g, '&');
  };
}

¹ Just the basics, not including &copy; to © or other such things.
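The same ordering issue can be sketched for decoding. The helper names below are hypothetical and cover only `&amp;` and `&lt;`; the input `&amp;lt;` is the encoded form of the literal text `&lt;`.

```javascript
// Hypothetical helpers for illustration; only '&amp;' and '&lt;' are handled.
function decodeAmpLast(s) {
  return s.replace(/&lt;/g, '<').replace(/&amp;/g, '&');
}

function decodeAmpFirst(s) {
  // Wrong order: '&amp;lt;' first becomes '&lt;' and is then decoded again.
  return s.replace(/&amp;/g, '&').replace(/&lt;/g, '<');
}

console.log(decodeAmpLast('&amp;lt;'));  // &lt;
console.log(decodeAmpFirst('&amp;lt;')); // <
```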



As far as libraries are concerned, Underscore.js (or Lodash, if you prefer) provides an _.escape method to perform this functionality.

Answered by hgoebl

This might be a bit more efficient, with the same outcome:

function escapeXml(unsafe) {
    return unsafe.replace(/[<>&'"]/g, function (c) {
        switch (c) {
            case '<': return '&lt;';
            case '>': return '&gt;';
            case '&': return '&amp;';
            case '\'': return '&apos;';
            case '"': return '&quot;';
        }
    });
}
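A minimal sketch of how such a function could be plugged into the string concatenation from the question. The `element` helper and the lookup-table form of `escapeXml` are assumptions made for this example, and `key` is assumed to be a trusted, valid tag name.

```javascript
// Lookup table with the five predefined XML entities.
const XML_ENTITIES = { '<': '&lt;', '>': '&gt;', '&': '&amp;', "'": '&apos;', '"': '&quot;' };

// Variant of escapeXml that also accepts non-string values such as numbers.
function escapeXml(unsafe) {
  return String(unsafe).replace(/[<>&'"]/g, (c) => XML_ENTITIES[c]);
}

// Hypothetical helper: wraps an escaped value in <key>...</key>.
// `key` is NOT escaped here and is assumed to be a valid tag name.
function element(key, value) {
  return '<' + key + '>' + escapeXml(value) + '</' + key + '>';
}

console.log(element('title', 'Tom & Jerry <1940>'));
// <title>Tom &amp; Jerry &lt;1940&gt;</title>
console.log(element('count', 42));
// <count>42</count>
```

Coercing with `String(...)` is a design choice for this sketch: it avoids a TypeError when the value is a number or boolean rather than a string.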

Answered by lambshaanxy

If you have jQuery, here's a simple solution:

  String.prototype.htmlEscape = function() {
    return $('<div/>').text(this.toString()).html();
  };

Use it like this:

"<foo&bar>".htmlEscape(); // -> "&lt;foo&amp;bar&gt;"

Answered by sudhAnsu63

You can use the method below. I have added it to the String prototype for easier access. I have also used a negative look-ahead so it won't mess things up if you call the method twice or more.

Usage:

 var original = "Hi&there";
 var escaped = original.EncodeXMLEscapeChars();  //Hi&amp;there

Decoding is handled automatically by the XML parser.

Method:

//String extension to format a string for XML content.
//Replaces XML special characters with their equivalent entity notation.
String.prototype.EncodeXMLEscapeChars = function () {
    var OutPut = this;
    if (String(OutPut).trim() !== "") {  //native trim, so jQuery is not needed
        OutPut = OutPut.replace(/</g, "&lt;").replace(/>/g, "&gt;").replace(/"/g, "&quot;").replace(/'/g, "&#39;");
        OutPut = OutPut.replace(/&(?!(amp;)|(lt;)|(gt;)|(quot;)|(#39;)|(apos;))/g, "&amp;");
        OutPut = OutPut.replace(/([^\\])((\\\\)*)\\(?![\\\/{])/g, "$1\\\\$2");  //replaces an odd run of backslashes (\) with an even one.
    }
    else {
        OutPut = "";
    }
    return OutPut;
};
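The look-ahead makes the method safe to call twice, but it also means that text which already looks like an entity is left alone. A minimal sketch of that trade-off, using a hypothetical `escapeAmpOnce` function that only handles `&`:

```javascript
// Hypothetical helper: escapes '&' unless it already starts a known entity.
function escapeAmpOnce(s) {
  return s.replace(/&(?!(?:amp|lt|gt|quot|apos|#39);)/g, '&amp;');
}

const once = escapeAmpOnce('Hi&there');
console.log(once);                // Hi&amp;there
console.log(escapeAmpOnce(once)); // Hi&amp;there (unchanged on the second call)

// Trade-off: a value whose literal text is "&lt;" is not escaped at all,
// so an XML parser will later read it back as "<".
console.log(escapeAmpOnce('Type &lt; for a less-than sign'));
// Type &lt; for a less-than sign
```

Whether this behaviour is acceptable depends on whether the input can ever legitimately contain entity-like text.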

Answered by jordancpaul

I originally used the accepted answer in production code and found that it was actually really slow when used heavily. Here is a much faster solution (runs at over twice the speed):

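// Browser only: relies on the DOM (document, XMLSerializer), which plain Node.js does not provide.
var escapeXml = (function () {
    var doc = document.implementation.createDocument("", "", null);
    var el = doc.createElement("temp");
    el.textContent = "temp";
    el = el.firstChild;  // keep a single text node and reuse it for every call
    var ser = new XMLSerializer();
    return function (text) {
        el.nodeValue = text;
        return ser.serializeToString(el);
    };
})();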

console.log(escapeXml("<>&")); //&lt;&gt;&amp;

Answered by crown

Maybe you can try this:

function encodeXML(s) {
  const dom = document.createElement('div')
  dom.textContent = s
  return dom.innerHTML
}

Answered by Stefan Steiger

Caution: all the regexing isn't good if you have XML inside XML.
Instead, loop over the string once and substitute all escape characters.
That way, you can't run over the same character twice.

function _xmlAttributeEscape(inputString)
{
    var output = [];

    for (var i = 0; i < inputString.length; ++i)
    {
        switch (inputString[i])
        {
            case '&':
                output.push("&amp;");
                break;
            case '"':
                output.push("&quot;");
                break;
            case "<":
                output.push("&lt;");
                break;
            case ">":
                output.push("&gt;");
                break;
            default:
                output.push(inputString[i]);
        }


    }

    return output.join("");
}
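When an XML fragment is embedded inside another XML document, each level of nesting needs exactly one escaping pass. A minimal sketch of that idea, assuming a hypothetical single-pass helper `escapeOnce` built on a lookup table:

```javascript
const ENTITIES = { '&': '&amp;', '"': '&quot;', '<': '&lt;', '>': '&gt;' };

// Hypothetical single-pass escaper: each input character is visited once.
function escapeOnce(input) {
  let output = '';
  for (const ch of input) {
    output += ENTITIES[ch] || ch;
  }
  return output;
}

const inner = '<b>Tom &amp; Jerry</b>';
const level1 = escapeOnce(inner);  // embedded one level deep
const level2 = escapeOnce(level1); // embedded two levels deep

console.log(level1); // &lt;b&gt;Tom &amp;amp; Jerry&lt;/b&gt;
console.log(level2); // &amp;lt;b&amp;gt;Tom &amp;amp;amp; Jerry&amp;lt;/b&amp;gt;
```

Here the repeated `&amp;` is intentional: each parser that unwraps one level of nesting removes one level of escaping.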

Answered by Lostfields

If something was already escaped before, you could try this, since it will not double-escape like many of the others:

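function escape(text) {
    // A special character directly followed by word characters and ';'
    // (e.g. '&amp;') is treated as already escaped and left untouched.
    return String(text).replace(/(['"<>&])(\w+;)?/g, (match, char, escaped) => {
        if (escaped)
            return match

        switch (char) {
            case '\'': return '&apos;'
            case '"': return '&quot;'
            case '<': return '&lt;'
            case '>': return '&gt;'
            case '&': return '&amp;'
        }
    })
}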

Answered by Johan B.W. de Vries

Technically, &, < and > aren't valid characters in an XML name (such as an element name). If you can't trust the key variable, you should filter them out.

If you want them escaped as HTML entities, you could use something like http://www.strictly-software.com/htmlencode.

Answered by Per Ghosh

This is simple:

sText = ("" + sText).split("&").join("&amp;").split("<").join("&lt;").split(">").join("&gt;").split('"').join("&#34;").split("'").join("&#39;");