Note: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/12360268/

Time: 2020-10-26 16:01:18  Source: igfitidea

A regex to remove id, style, class attributes from HTML tags in JS

Tags: javascript, regex

Asked by Jimmy Page

I have an HTML string in JavaScript, and I want to use a regex to remove the id, style and class attributes from its HTML tags. For example, I have:

New York City.<div style="padding:20px" id="upp" class="upper"><div style="background:#F2F2F2; color:black; font-size:90%; padding:10px 10px; width:500px;">This message is.</div></div>

I want this string to become:

New York City.<div><div>This message is.</div></div>

Answered by You

Instead of parsing the HTML using regular expressions, which is a bad idea, you could take advantage of the DOM functionality that is available in all browsers. We need to be able to walk the DOM tree first:

var walk_the_DOM = function walk(node, func) {
    func(node);
    node = node.firstChild;
    while (node) {
        walk(node, func);
        node = node.nextSibling;
    }
};

Now parse the string and manipulate the DOM:

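var wrapper = document.createElement('div');
wrapper.innerHTML = '<!-- your HTML here -->';
// Start at the wrapper itself so every top-level node is visited,
// not only the first child and its descendants.
walk_the_DOM(wrapper, function(element) {
    // Text and comment nodes have no removeAttribute method.
    if (element.removeAttribute) {
        element.removeAttribute('id');
        element.removeAttribute('style');
        element.removeAttribute('class');
    }
});
var result = wrapper.innerHTML;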

See also this JSFiddle.

Answered by kennebec

If you are willing to remove everything but the div tag names:

string=string.replace(/<(div)[^>]+>/ig,'<$1>');

This will return <DIV> if the HTML is upper case.

Answered by Alejandro Salamanca Mazuelo

Use a regular expression. It is fast (at production time) and easy (at development time).

htmlCode = htmlCode.replace(/<([^ >]+)[^>]*>/ig,'<$1>');
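
The replace calls above discard every attribute, while the question asks to drop only id, style and class. A minimal sketch of a narrower regex approach might look like the following. The helper name stripAttrsRegex is made up for illustration, and the sketch assumes that attribute values are quoted and contain no > character, so it shares the usual fragility of regex-based HTML handling.

```javascript
// Hypothetical helper: removes only the listed attributes from opening tags.
// Assumes attribute values are quoted and contain no '>' characters.
function stripAttrsRegex(html, names) {
  // Matches e.g. ` style="..."` or ` id='...'` for the given attribute names.
  var attr = new RegExp(
    '\\s+(?:' + names.join('|') + ')\\s*=\\s*(?:"[^"]*"|\'[^\']*\')', 'gi');
  // Only rewrite text inside opening tags, so ordinary text is left alone.
  return html.replace(/<[a-z][^>]*>/gi, function (tag) {
    return tag.replace(attr, '');
  });
}

var input = 'New York City.<div style="padding:20px" id="upp" class="upper">' +
  '<div style="background:#F2F2F2; color:black;" title="kept">This message is.</div></div>';
var output = stripAttrsRegex(input, ['id', 'style', 'class']);
console.log(output);
// prints: New York City.<div><div title="kept">This message is.</div></div>
```

Attributes that are not listed, such as title here, are kept, which is the main difference from the one-line replacements.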

Answered by David says reinstate Monica

If you just want to remove the attributes, then regex is the wrong tool. I'd suggest, instead:

function stripAttributes(elem){
    if (!elem) {
        return false;
    }
    else {
        var attrs = elem.attributes;
        while (attrs.length) {
            elem.removeAttribute(attrs[0].name);
        }
    }
}

var div = document.getElementById('test');

stripAttributes(div);

JS Fiddle demo.

Answered by Cem Yıldız

I used this:

var html = 'New York City.<div style="padding:20px" id="upp" class="upper"><div style="background:#F2F2F2; color:black; font-size:90%; padding:10px 10px; width:500px;">This message is.</div></div>';

function clear_attr(str,attrs){
    var reg2 = /\s*(\w+)=\"[^\"]+\"/gm;
    var reg = /<\s*(\w+).*?>/gm;
    str = str.replace(reg,function(match, i) {
        var r_ = match.replace(reg2,function(match_, i) {
            var reg2_ = /\s*(\w+)=\"[^\"]+\"/gm;
            var m = reg2_.exec(match_);
            if(m!=null){
                if(attrs.indexOf(m[1])>=0){
                    return match_;
                }
            }
            return '';
        });        
        return r_;
    });
    return str;
}
// attrs lists the attribute names to keep; an empty array removes them all.
var cleaned = clear_attr(html,[]);

Answered by Devplex

I don't know about RegEx, but I sure as hell know about jQuery.

Convert the given HTML string into a DOM element, parse it, and return its contents.

function cleanStyles(html){
    var temp = $(document.createElement('div'));
    temp.html(html);

    temp.find('*').removeAttr('style');
    return temp.html();
}

Answered by Elias Zamaria

Trying to parse HTML with regexes will cause problems. This answer may be helpful in explaining them. If you are using jQuery, you may be able to do something like this:

var transformedHtml = $(html).find("*").removeAttr("id").removeAttr("style").removeAttr("class").outerHTML()

For this to work, you need to be using the outerHTML plugin described here.

If you don't want to use jQuery, it will be trickier. These questions may have some helpful answers on how to convert the string to a collection of DOM elements: "Converting HTML string into DOM elements?" and "Creating a new DOM element from an HTML string using built-in DOM methods or prototype". You may be able to loop through the elements and remove the attributes using the built-in removeAttribute function. I don't have the time or motivation to figure out all the details for you.
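
One possible way to fill in those details without jQuery is sketched below. It is not taken from the linked questions: parseHtml and stripNamedAttributes are hypothetical names, and the sketch assumes a browser that provides DOMParser and querySelectorAll.

```javascript
// Sketch only: parseHtml and stripNamedAttributes are hypothetical names.

// Turns an HTML string into a container element using the browser's DOMParser.
function parseHtml(html) {
  return new DOMParser().parseFromString(html, 'text/html').body;
}

// Removes the named attributes from every descendant element of `container`.
function stripNamedAttributes(container, names) {
  // Build a selector such as "[id],[style],[class]" so that only elements
  // carrying at least one of the attributes are visited.
  var selector = names.map(function (n) { return '[' + n + ']'; }).join(',');
  var elements = container.querySelectorAll(selector);
  for (var i = 0; i < elements.length; i++) {
    for (var j = 0; j < names.length; j++) {
      elements[i].removeAttribute(names[j]);
    }
  }
  return container;
}

// Browser-only usage; skipped where no DOM is available.
if (typeof DOMParser !== 'undefined') {
  var body = stripNamedAttributes(
    parseHtml('<p id="a" class="b">Hi</p>'), ['id', 'style', 'class']);
  console.log(body.innerHTML);
}
```

Calling removeAttribute for a name the element does not have is harmless, so the inner loop does not need to check first.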

Answered by RobG

A plain script solution would be something like:

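function removeProperties(markup) {
  var div = document.createElement('div');
  div.innerHTML = markup;
  var el, els = div.getElementsByTagName('*');

  for (var i=0, iLen=els.length; i<iLen; i++) {
    el = els[i];
    // removeAttribute drops the attribute entirely instead of leaving id="" behind
    el.removeAttribute('id');
    el.removeAttribute('style');
    el.removeAttribute('class');
  }
  // Return the cleaned markup. To add the elements to the DOM instead, move them
  // with someElement.appendChild(div.firstChild) while div.firstChild exists.
  return div.innerHTML;
}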

A more general solution would get the property names as extra arguments, or say a space separated string, then iterate over the names to remove them.
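
A rough sketch of that generalisation is shown below. The function name removeAttributesFrom is hypothetical and not part of the answer. It accepts the names as extra arguments, as a space-separated string, or as a mix of both, and works on any array-like collection of elements.

```javascript
// Hypothetical generalisation: removeAttributesFrom is an illustrative name.
// Names may be passed as separate arguments, as one space-separated string, or both.
function removeAttributesFrom(elements /*, ...names */) {
  // Flatten ('id style', 'class') into ['id', 'style', 'class'].
  var names = Array.prototype.slice.call(arguments, 1)
    .join(' ')
    .split(/\s+/)
    .filter(Boolean);

  for (var i = 0; i < elements.length; i++) {
    for (var j = 0; j < names.length; j++) {
      elements[i].removeAttribute(names[j]);
    }
  }
  return elements;
}

// Browser-only usage with a wrapper div; skipped where no DOM is available.
if (typeof document !== 'undefined') {
  var div = document.createElement('div');
  div.innerHTML = '<p id="a" class="b" title="t">Hi</p>';
  removeAttributesFrom(div.getElementsByTagName('*'), 'id style', 'class');
  console.log(div.innerHTML);
}
```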