Regex to find tag id and content in JavaScript

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/3271061/

Time: 2020-10-25 00:42:55  Source: igfitidea

regex to find tag id and content JavaScript

Tags: javascript, regex, elements

Asked by Thomas

Hey, I'm trying to do something quite specific with regex in JavaScript, and my regex-fu is shaky at best. I wondered if there were any pros out there who could point me in the right direction. So I have some text...

<item id="myid1">myitem1</item>
<item id="myid2">myitem2</item>

...etc

And I would like to extract it into an array that reads myid1, myitem1, myid2, myitem2, ... etc.

There will never be nested elements so there is no recursive nesting problem. Anyone able to bash this out quickly? Thanks for your help!

Answered by Chris

Here's a regex that will:

  • Match the starting and ending tag element names
  • Extract the value of the id attribute
  • Extract the inner html contents of the tag

Note: I am being lazy in matching the attribute value here. It needs to be enclosed in double quotes, and there needs to be no spaces between the attribute name and its value.

<([^\s]+).*?id="([^"]*?)".*?>(.+?)</\1>

In JavaScript, the regex would be run like so:

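var search = '<item id="item1">firstItem</item><item id="item2">secondItem</item>';
// \1 refers back to the tag name captured by the first group.
var regex = /<([^\s]+).*?id="([^"]*?)".*?>(.+?)<\/\1>/gi;
var results = {};
var parts;
// With the g flag, each exec() call resumes from regex.lastIndex,
// so looping over the whole string visits every match in turn.
while ((parts = regex.exec(search)) !== null) {
    results[parts[2]] = parts[3];
}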

At the end of this, results would be an object that looks like:

{
    "item1": "firstItem",
    "item2": "secondItem"
}

YMMV if the <item> elements contain nested HTML.
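
The question asked for a flat array (myid1, myitem1, myid2, myitem2, ...) rather than an object. A minimal sketch of that variant follows. It assumes the answer's pattern ends with a \1 backreference to the captured tag name, and that String.prototype.matchAll (ES2020) is available; the function name itemsToFlatArray and the sample string are hypothetical, not part of the original answer.

```javascript
// Hypothetical helper: collect [id, content, id, content, ...] from flat markup.
function itemsToFlatArray(text) {
    // Same idea as the answer's pattern; \1 refers back to the captured tag name.
    const pattern = /<([^\s]+).*?id="([^"]*?)".*?>(.+?)<\/\1>/gi;
    const flat = [];
    // matchAll() yields every match together with its capture groups.
    for (const match of text.matchAll(pattern)) {
        flat.push(match[2], match[3]); // group 2 = id, group 3 = inner content
    }
    return flat;
}

const sample = '<item id="myid1">myitem1</item>\n<item id="myid2">myitem2</item>';
console.log(itemsToFlatArray(sample));
// logs [ 'myid1', 'myitem1', 'myid2', 'myitem2' ]
```

As with the object version, this only holds for non-nested elements whose id is double-quoted.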

Answered by Roy Shoa

If someone really likes or needs to use regex to get an HTML tag by id (as in the question subject), they can use my code:

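function GetTagByIdUsingRegex(tag, id, html) {
    // Backslashes are doubled because the pattern is built from a string literal;
    // a single "\s" inside a string would collapse to a plain "s".
    return new RegExp("<" + tag + "[^>]*id\\s?=\\s?['\"]" + id + "['\"][\\s\\S]*?</" + tag + ">").exec(html);
}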

I also made one to get an element by class name:

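function GetTagByClassUsingRegex(tag, cls, html) {
    // Matches a class attribute that starts with cls; backslashes are doubled
    // because the pattern is built from a string literal.
    return new RegExp("<" + tag + "[^>]*class\\s?=\\s?['\"]" + cls + "[^'\"]*['\"][\\s\\S]*?</" + tag + ">").exec(html);
}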
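
Because these functions splice tag and id straight into the pattern, an id containing regex metacharacters (for example a dot) would be interpreted as pattern syntax. The sketch below shows one way to guard against that. The helper names escapeRegExp and getTagByIdEscaped, and the sample markup, are hypothetical additions rather than part of the answer.

```javascript
// Hypothetical helper: backslash-escape characters that are special in a regex.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical variant of GetTagByIdUsingRegex that escapes its inputs first.
function getTagByIdEscaped(tag, id, html) {
    const pattern = new RegExp(
        "<" + escapeRegExp(tag) + "[^>]*id\\s?=\\s?['\"]" + escapeRegExp(id) +
        "['\"][\\s\\S]*?</" + escapeRegExp(tag) + ">"
    );
    return pattern.exec(html);
}

const html = '<div id="a.b">dotted</div><div id="axb">plain</div>';
const found = getTagByIdEscaped("div", "a.b", html);
console.log(found && found[0]);
// logs <div id="a.b">dotted</div>
```

Without the escaping step, the id "a.b" would also match an element whose id is "axb", since an unescaped dot matches any character.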

Answered by Sjuul Janssen

I always use this site to build my regexes:

http://www.pagecolumn.com/tool/regtest.htm

This is the regex I came up with:

(<[^>]+>)([^<]+)(<[^>]+>)

And this is the result that the page gives me for JavaScript:

Using RegExp object:

var str = '<item id="myid1">myitem1</item><item id="myid2">myitem2</item><ssdad<sdasda><>dfsf';
var re = new RegExp("(<[^>]+>)([^<]+)(<[^>]+>)", "g");
var myArray = str.match(re);

Using literal:

var myArray = str.match(/(<[^>]+>)([^<]+)(<[^>]+>)/g);

if (myArray != null) {
    for (var i = 0; i < myArray.length; i++) {
        var result = "myArray[" + i + "] = " + myArray[i];
    }
}
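
One thing to keep in mind with this approach: when the g flag is set, match() returns only the whole matched substrings, so the three capture groups are not available in myArray. The sketch below contrasts that with matchAll() (ES2020), which keeps the groups. The variable names whole and contents are illustrative only, and the sample string is shortened to well-formed input.

```javascript
const str = '<item id="myid1">myitem1</item><item id="myid2">myitem2</item>';
const re = /(<[^>]+>)([^<]+)(<[^>]+>)/g;

// With the g flag, match() returns whole matches only; the groups are dropped.
const whole = str.match(re);

// matchAll() yields one result per match with the capture groups intact;
// group 2 is the text between the opening and closing tags.
const contents = Array.from(str.matchAll(re), (m) => m[2]);

console.log(whole);    // the two complete <item>...</item> substrings
console.log(contents); // logs [ 'myitem1', 'myitem2' ]
```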

Answered by FK82

This is an XML string, so in my opinion an XML parser is best suited to this kind of task. Do the following:

var items = document.getElementsByTagName("item") ; //<> use the parent element if document is not
var dataArray = [ ] ;

for (var n = 0 ; n < items.length ; n++) {

    var id = items[n].getAttribute("id") ; // value of the id attribute
    var text = items[n].childNodes[0].nodeValue ; // text of the first child node

    dataArray.push(id, text) ;

}


If your problem is that you cannot convert the XML string to an XML object, you will have to use a DOM parser beforehand:

var xmlString = "" ; //!! your xml string
var xmlDoc = null ; // named xmlDoc so it does not shadow the browser's global document

if (window.ActiveXObject) { //!! for internet explorer

    xmlDoc = new ActiveXObject("Microsoft.XMLDOM") ;
    xmlDoc.async = false ; // load synchronously
    xmlDoc.loadXML(xmlString) ;

} else { //!! for everything else

    var parser = new DOMParser() ;
    xmlDoc = parser.parseFromString(xmlString, "text/xml") ;

}

Then use the above script, calling getElementsByTagName on xmlDoc instead of document.