How to get html tag attribute values using JavaScript Regular Expressions?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/21323721/

Time: 2020-10-27 20:38:50  Source: igfitidea

Tags: javascript, regex, node.js

Asked by Obay

Suppose I have this HTML in a string:

<meta http-equiv="Set-Cookie" content="COOKIE1_VALUE_HERE">
<meta http-equiv="Set-Cookie" content="COOKIE2_VALUE_HERE">
<meta http-equiv="Set-Cookie" content="COOKIE3_VALUE_HERE">

And I have this regular expression, to get the values inside the content attributes:

/<meta http-equiv=[\"']?set-cookie[\"']? content=[\"'](.*)[\"'].*>/ig

How do I, in JavaScript, get all three content values?

I've tried:

var setCookieMetaRegExp = /<meta http-equiv=[\"']?set-cookie[\"']? content=[\"'](.*)[\"'].*>/ig;
var match = setCookieMetaRegExp.exec(htmlstring);

but match doesn't contain the values I need. Help?

Note: the regular expression is already correct (see here). I just need to match it to the string. Note: I'm using NodeJS

Accepted answer by Sean Johnson

You were so close! All that needs to be done now is a simple loop:

var htmlString = '<meta http-equiv="Set-Cookie" content="COOKIE1_VALUE_HERE">\n'+
'<meta http-equiv="Set-Cookie" content="COOKIE2_VALUE_HERE">\n'+
'<meta http-equiv="Set-Cookie" content="COOKIE3_VALUE_HERE">\n';

var setCookieMetaRegExp = /<meta http-equiv=[\"']?set-cookie[\"']? content=[\"'](.*)[\"'].*>/ig;

var matches = [];
while (setCookieMetaRegExp.exec(htmlString)) {
  // RegExp.$1 is the legacy static property holding the first capture group of the last match
  matches.push(RegExp.$1);
}

//contains all cookie values
console.log(matches);

JSBIN: http://jsbin.com/OpepUjeW/1/edit?js,console
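
On newer Node.js versions (12 and later), the same loop could also be written with String.prototype.matchAll, which avoids the legacy RegExp.$1 property. This is only a sketch and not part of the original answer: extractCookieValues is a hypothetical helper name, and the pattern is tightened with [^"']* and [^>]* as an assumption so that each match stays inside one tag.

```javascript
// Hypothetical helper (not from the original answer): collects the content
// value of every Set-Cookie meta tag found in an HTML string.
function extractCookieValues(html) {
  // [^"']* and [^>]* keep each match inside a single tag,
  // even when several tags share one line.
  const pattern = /<meta http-equiv=["']?set-cookie["']? content=["']([^"']*)["'][^>]*>/gi;
  // matchAll requires the g flag and yields one match array per tag;
  // index 1 of each array is the first capture group.
  return Array.from(html.matchAll(pattern), (m) => m[1]);
}

const htmlString =
  '<meta http-equiv="Set-Cookie" content="COOKIE1_VALUE_HERE">\n' +
  '<meta http-equiv="Set-Cookie" content="COOKIE2_VALUE_HERE">\n' +
  '<meta http-equiv="Set-Cookie" content="COOKIE3_VALUE_HERE">\n';

const cookieValues = extractCookieValues(htmlString);
console.log(cookieValues);
```

Because matchAll works on a fresh iterator, there is no lastIndex state to reset between calls.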

Answer by Floris

Keep it simple:

/content=\"(.*?)\">/gi

demo: http://regex101.com/r/dF9cD8

Update (based on your comment):

/<meta http-equiv=\"Set-Cookie\" content=\"(.*?)\">/gi

runs only on this exact string. Demo: http://regex101.com/r/pT0fC2

You really need the (.*?) with the question mark, or the regex will keep going until the last > it finds (or newline). The ? makes the search stop at the first " (you can change this to [\"'] if you want to match either single or double quote).
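
The difference between the greedy and the lazy quantifier is easiest to see when two tags share one line. The snippet below is an illustrative sketch only; the sample string and variable names are made up for the demonstration.

```javascript
// Two tags on one line, so a greedy match can run across both of them.
const line = '<meta content="A"><meta content="B">';

// Greedy: (.*) grabs as much as possible, then backs up to the LAST ">.
const greedy = /content="(.*)">/.exec(line)[1];

// Lazy: (.*?) grabs as little as possible, stopping at the FIRST ">.
const lazy = /content="(.*?)">/.exec(line)[1];

console.log(greedy); // A"><meta content="B
console.log(lazy);   // A
```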

Answer by Patrick Evans

There is no need for regular expressions; just do some DOM work:

var head = document.createElement("head");
head.innerHTML = '<meta http-equiv="Set-Cookie" content="COOKIE1_VALUE_HERE"><meta http-equiv="Set-Cookie" content="COOKIE2_VALUE_HERE"><meta http-equiv="Set-Cookie" content="COOKIE3_VALUE_HERE">';

var metaNodes = head.childNodes;
for(var i=0; i<metaNodes.length; i++){
   var contentValue = metaNodes[i].attributes.getNamedItem("content").value;
}

As you are using nodejs, and BlackSheep mentions using cheerio, you could use its syntax if you wish to use that lib:

//Assume htmlString contains the html
var cheerio = require('cheerio'),
$ = cheerio.load(htmlString);
var values=[];
$("meta").each(function(i, elem) {
  values[i] = $(this).attr("content");
});

Answer by i-bob

Try this:

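// Use a regex literal, not a quoted string; with the g flag, match() returns the full matched tags rather than the capture groups
var setCookieMetaRegExp = /<meta http-equiv=[\"']?set-cookie[\"']? content=[\"'](.*)[\"'].*>/ig;
var match = stringToFindPartFrom.match(setCookieMetaRegExp);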

Answer by i-bob

Try this:

var myString = '<meta http-equiv="Set-Cookie" content="COOKIE2_VALUE_HERE">';
var myRegexp = /<meta http-equiv=[\"']?set-cookie[\"']? content=[\"'](.*)[\"'].*>/ig;
var match = myRegexp.exec(myString);
alert(match[1]); // should show you the part

Answer by AhbapAldirmaz

Try this:

(?:class|href)([\s='"./]+)([\w-./?=&\#"]+)((['#\&?=/".\w\d]+|[\w)('-."\s]+)['"]|)

Example:

function getTagAttribute(tag, attribute){
    // Backslashes are doubled because the patterns are built from string literals
    var regKey = '(?:' + attribute + ')([\\s=\'"./]+)([\\w-./?=\\#"]+)(([\'#\\&?=/".\\w\\d]+|[\\w)(\'-."\\s]+)[\'"]|)'
    var regExp = new RegExp(regKey,'g');
    var regResult = regExp.exec(tag);
    if(regResult && regResult.length>0){
        // Strip the leading attribute name, "=", and quotes, plus any trailing quote
        var splitKey = '(?:(' + attribute + ')+(|\\s)+([=])+(|\\s|[\'"])+)|(?:([\\s\'"]+)$)'
        return regResult[0].replace(new RegExp(splitKey,'g'),'');
    }else{
        return '';
    }
}


getTagAttribute('<a href  =   "./test.html#bir/deneme/?k=1&v=1"    class=   "xyz_bir-ahmet abc">','href');

// returns "./test.html#bir/deneme/?k=1&v=1"

Live Regexp101

Live JS Script Example
