PHP: finding HTML tags in a string

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/18800807/

Time: 2020-08-25 18:14:49  Source: igfitidea

Finding HTML tags in string

php html regex preg-match

Asked by Sven van Zoelen

I know this question has been asked around SO, but I can't find the right one and I still suck at regex :/

I have a string, and that string is valid HTML. Now I want to find all the tags with a certain name and attribute.

I tried this regex (i.e. div with type): /(<div type="my_special_type" src="(.*?)<\/div>)/.

Example string:

<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>

If I use preg_match, I only get <div type="special_type" src="bla"> match me</div>, which makes sense because the other one has its attributes in a different order.

What regex do I need to get the following array when using preg_match on the example string?

array(0 => '<div type="special_type" src="bla"> match me</div>',
      1 => '<div src="blaw" type="special_type" > match me too</div>')

Answered by hek2mgl

General advice: don't use regex to parse HTML. It will get messy if the HTML changes.
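To make the point concrete, here is a minimal sketch of an order-dependent pattern run against the sample markup from the question. It assumes the pattern is written with "special_type" (rather than the "my_special_type" in the asker's attempt) so that it can match the sample at all; the variable names are only illustrative.

```php
<?php
// Sample markup taken from the question.
$html = <<<HTML
<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>
HTML;

// Order-dependent pattern: it requires type="..." to come directly before src="...".
// Assumption: "special_type" is used here so the pattern fits the sample markup.
$pattern = '/<div type="special_type" src="(.*?)<\/div>/';

// preg_match_all returns the number of full matches and fills $matches.
$count = preg_match_all($pattern, $html, $matches);

echo $count, PHP_EOL;   // number of divs the pattern found
print_r($matches[0]);   // the full matched strings
```

Only the div whose attributes happen to appear in the order the pattern expects is found; the one with src before type is skipped, which is the kind of fragility a DOM parser avoids.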

Use DOMDocument instead:

$str = <<<EOF
<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>
EOF;

$doc = new DOMDocument();
$doc->loadHTML($str);    
$selector = new DOMXPath($doc);

$result = $selector->query('//div[@type="special_type"]');

// loop through all found items and print each src attribute on its own line
foreach ($result as $node) {
    echo $node->getAttribute('src'), PHP_EOL;
}
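The loop above prints only the src values, while the question asked for an array of the matching elements themselves. A possible sketch of that, assuming it is acceptable for the markup to be re-serialized by DOMDocument (so whitespace inside a tag may differ slightly from the input), passes each node to saveHTML(). The libxml_use_internal_errors() call is an addition of this sketch, used to keep parser warnings quiet.

```php
<?php
$str = <<<EOF
<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>
EOF;

// Collect libxml warnings internally instead of printing them
// (assumption: the input may not always be perfectly valid HTML).
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($str);
libxml_clear_errors();

$selector = new DOMXPath($doc);

$matches = [];
foreach ($selector->query('//div[@type="special_type"]') as $node) {
    // saveHTML($node) serializes just this element, including its own tags.
    $matches[] = $doc->saveHTML($node);
}

print_r($matches);
```

Because the XPath expression only tests for the type attribute, the order in which attributes appear in the source does not matter.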

Answered by Kilise

As hek2mgl said, you'd better use DOMDocument:

$html = '
<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>';

$matches = get_matched($html);

/**
 * Return the src attribute of every <div> whose type is "special_type".
 */
function get_matched($html) {
    $matched = array();

    $dom = new DOMDocument();
    // @ suppresses parser warnings about imperfect markup
    @$dom->loadHTML($html);

    // Fetch the node list once instead of re-querying it on every access
    foreach ($dom->getElementsByTagName('div') as $div) {
        if ($div->getAttribute('type') != 'special_type') {
            continue;
        }

        $matched[] = $div->getAttribute('src');
        // or: $matched[] = $div->nodeValue;
    }

    return $matched;
}
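Since the question asks about tags with "a certain name and attribute" in general, the same idea can be sketched with the tag name, attribute name and value passed in as parameters. The helper name find_elements_by_attribute and the shape of its return value are hypothetical, not part of either answer, and the sketch assumes a non-empty HTML string.

```php
<?php
/**
 * Hypothetical helper (not from the original answers): find every $tag element
 * whose $attribute equals $value, returning plain arrays rather than DOM nodes.
 */
function find_elements_by_attribute(string $html, string $tag, string $attribute, string $value): array
{
    // Collect parser warnings internally, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $found = [];
    foreach ($dom->getElementsByTagName($tag) as $element) {
        if ($element->getAttribute($attribute) !== $value) {
            continue;
        }

        // Copy all attributes of the element into a name => value map.
        $attributes = [];
        foreach ($element->attributes as $attr) {
            $attributes[$attr->name] = $attr->value;
        }

        $found[] = ['attributes' => $attributes, 'text' => $element->textContent];
    }

    return $found;
}

$html = '
<div>Do not match me</div>
<div type="special_type" src="bla"> match me</div>
<a>not me</a>
<div src="blaw" type="special_type" > match me too</div>';

$result = find_elements_by_attribute($html, 'div', 'type', 'special_type');
print_r($result);
```

Comparing attribute values in PHP rather than building an XPath string sidesteps any quoting issues when the value comes from user input.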