PHP: preg_match text between HTML tags

Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/1586779/

Time: 2020-08-25 03:11:28  Source: igfitidea

Tags: php, parsing, preg-match

Asked by David Willis

Hello, I would like to use preg_match in PHP to parse the "Desired text" out of the following snippet from an HTML document:

<p class="review"> Desired text </p>

Ordinarily I would use simple_html_dom for such things, but on this occasion it cannot be used (the element above doesn't appear in every desired div tag, so I'm forced to use this approach to keep track of exactly when it doesn't appear and then adjust my array from simple_html_dom accordingly).

Anyway, this would solve my problem.

Thanks so much.

Answered by serg

// Non-greedy (.*?) stops at the first </p>; "s" lets "." match newlines, "i" ignores case.
if (preg_match("'<p class=\"review\">(.*?)</p>'si", $source, $match)) {
    echo "result=" . $match[1];
}
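
Since the question is about noticing when the element is absent, the same pattern can be wrapped in a small function that reports a miss explicitly. This is only a sketch: the function name extractReview and the sample strings are made up for illustration, and it uses ~ as the regex delimiter simply to avoid escaping the quotes.

```php
<?php
// Hypothetical helper (not part of the original answer): returns the trimmed
// text of the first <p class="review"> element, or null if there is none.
function extractReview(string $html): ?string
{
    // preg_match returns 1 on a match, 0 on no match, and false on error.
    if (preg_match('~<p class="review">(.*?)</p>~si', $html, $match) === 1) {
        return trim($match[1]);
    }
    return null;
}

// Assumed sample inputs: one with the element, one without it.
var_dump(extractReview('<div><p class="review"> Desired text </p></div>'));
var_dump(extractReview('<div><p class="other"> Something else </p></div>'));
```

The first call should dump the string "Desired text" and the second should dump NULL, which the caller can test for with a strict comparison.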

Answered by Andy N

If you want to return multiple matches, you need to use preg_match_all(). You then loop through the second element of the result array ($match[1]), which holds the first capture group, to get just the content between the tags.

$source = "<p class=\"review\"> Desired text1 </p>".
          "<p class=\"review\"> Desired text2 </p>".
          "<p class=\"review\"> Desired text3 </p>";

preg_match_all("'<p class=\"review\">(.*?)</p>'si", $source, $match);

foreach ($match[1] as $val) {
    echo $val . "<br>";
}

Outputs:

Desired text1
Desired text2
Desired text3 
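
The question also asks how to tell which div is missing the element, which a flat list of matches cannot show. One possible approach, sketched below, is to match each div first and then look for the review inside it. The div class "item" and the sample markup are assumptions made for this example, and the outer pattern only works if the divs are not nested.

```php
<?php
// Assumed sample input: three div blocks, the second one has no review.
$source = '<div class="item"><p class="review"> Desired text1 </p></div>'
        . '<div class="item"><p class="title"> No review here </p></div>'
        . '<div class="item"><p class="review"> Desired text3 </p></div>';

// Capture the body of each div. Assumes the divs are not nested.
preg_match_all('~<div class="item">(.*?)</div>~si', $source, $divs);

$reviews = [];
foreach ($divs[1] as $i => $body) {
    // Keep the review text, or null when this div has no review,
    // so the array indexes stay aligned with the divs.
    $reviews[$i] = preg_match('~<p class="review">(.*?)</p>~si', $body, $m) === 1
        ? trim($m[1])
        : null;
}

var_dump($reviews);
```

Here $reviews keeps one slot per div, with null at index 1, so it can be lined up with an array built elsewhere.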

Answered by Ross Snyder

What if the string you're matching has multiple lines and is:

<p class="review"> Desired text1 </p>
<p class="review"> Desired text2 </p>
<p class="review"> Desired text3 </p>

With a greedy quantifier, i.e. (.*) instead of (.*?), that pattern would match once, and the match would span everything from the first <p class="review"> to the last </p>.

I think a better pattern is:

"'<p class=\"review\">([^<]*)</p>'si"