PHP/regex: How to get the string value of an HTML tag?

Question

Asked by marknt15

I need help with regex or preg_match, because I am not yet experienced with either. Here is my problem.

I need to get the value "get me", but I think my function has an error. The number of HTML tags is dynamic, and the string can contain many nested HTML tags, such as a bold tag. The "get me" value is also dynamic.

<?php
function getTextBetweenTags($string, $tagname) {
    $pattern = "/<$tagname>(.*?)<\/$tagname>/";
    preg_match($pattern, $string, $matches);
    return $matches[1];
}

$str = '<textformat leading="2"><p align="left"><font size="10">get me</font></p></textformat>';
$txt = getTextBetweenTags($str, "font");
echo $txt;
?>

Answer 1

Answered by takete.dk

<?php
function getTextBetweenTags($string, $tagname) {
    $pattern = "/<$tagname ?.*>(.*)<\/$tagname>/";
    preg_match($pattern, $string, $matches);
    return $matches[1];
}

$str = '<textformat leading="2"><p align="left"><font size="10">get me</font></p></textformat>';
$txt = getTextBetweenTags($str, "font");
echo $txt;
?>

That should do the trick.

Answer 2

Answered by pkwebmarket

Try this:

$str = '<option value="123">abc</option>
        <option value="123">aabbcc</option>';

preg_match_all("#<option.*?>([^<]+)</option>#", $str, $foo);

print_r($foo[1]);

Answer 3

Answered by Tomas Aschan

In your pattern, you simply want to match all text between the two tags. Thus, you could use, for example, [\w\W] to match all characters, including newlines.

function getTextBetweenTags($string, $tagname) {
    $pattern = "/<$tagname>([\w\W]*?)<\/$tagname>/";
    preg_match($pattern, $string, $matches);
    return $matches[1];
}
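
As a small illustration of why [\w\W] helps, the sketch below compares it with a plain dot on a string that contains a newline. The sample string and variable names are made up for this example; the s modifier shown last is a standard PCRE alternative that is not mentioned in the answer itself.

```php
<?php
// Hypothetical sample input with a newline inside the tag.
$html = "<b>line one\nline two</b>";

// Without the s modifier, the dot does not match "\n", so nothing is found.
$plainDotMatches = preg_match('#<b>(.*?)</b>#', $html);

// [\w\W] matches any character at all, including newlines.
preg_match('#<b>([\w\W]*?)</b>#', $html, $m1);

// The s (PCRE_DOTALL) modifier makes the dot match newlines as well.
preg_match('#<b>(.*?)</b>#s', $html, $m2);

var_dump($plainDotMatches);   // 0: no match
var_dump($m1[1] === $m2[1]);  // both approaches capture the same text
```

Note that this only addresses the content between the tags; an opening tag with attributes, such as the font tag in the question, still needs a pattern that allows for them.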

Answer 4

Answered by Gumbo

Since attribute values may contain a plain > character, try this regular expression:

$pattern = '/<'.preg_quote($tagname, '/').'(?:[^"\'>]*|"[^"]*"|\'[^\']*\')*>(.*?)<\/'.preg_quote($tagname, '/').'>/s';

But regular expressions are not suitable for parsing non-regular languages like HTML. You are better off using a parser such as SimpleXML or DOMDocument.
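
A minimal sketch of the parser route, assuming the dom extension is available (it is enabled by default in most PHP builds). The helper name getTextByTagName is hypothetical and not part of the original answer; it returns the text of the first matching element, or null if there is none.

```php
<?php
// Hypothetical helper: extract text with DOMDocument instead of a regex.
function getTextByTagName(string $html, string $tagname): ?string {
    $doc = new DOMDocument();

    // Collect parser warnings internally; non-standard tags such as
    // <textformat> would otherwise emit PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $nodes = $doc->getElementsByTagName($tagname);
    if ($nodes->length === 0) {
        return null;
    }

    // textContent concatenates the text of all descendants,
    // so nested tags like <b> are flattened away.
    return $nodes->item(0)->textContent;
}

$str = '<textformat leading="2"><p align="left"><font size="10">get me</font></p></textformat>';
echo getTextByTagName($str, 'font'), "\n";
```

Because the parser handles attributes and nesting itself, the same call also works when the text is wrapped in further tags, for example `<font size="10">get <b>me</b></font>`.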

Answer 5

Answered by Gumbo

The following PHP snippet returns the text between HTML tags/elements.

A regex of the form "/tagname(.*)endtag/" captures the text between the tags.

For example:

// Square brackets and the slash are escaped so they match literally.
$regex = '/\[start_tag_name\](.*)\[\/end_tag_name\]/';
$content = "[start_tag_name]SOME TEXT[/end_tag_name]";
preg_match($regex, $content, $matches);
echo $matches[1];

It will print "SOME TEXT".

Regards,

Web-Farmer @letsnurture.com

Answer 6

Answered by Xman Classical

$userinput = "http://www.example.vn/";
//$url = urlencode($userinput);
$input = @file_get_contents($userinput) or die("Could not access file: $userinput");
$regexp = "<tagname\s[^>]*>(.*)<\/tagname>";
//==Example:
//$regexp = "<div\s[^>]*>(.*)<\/div>";

if(preg_match_all("/$regexp/siU", $input, $matches, PREG_SET_ORDER)) {
    foreach($matches as $match) {
        // $match[0] = the whole matched element
        // $match[1] = the text between the opening and closing tag
    }
}

Answer 7

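Answered by Darren Li

Try $pattern = "#<($tagname)\\b.*?>(.*?)</\\1>#" and return $matches[2]. The pattern needs delimiters (# here), and inside a double-quoted PHP string the backslashes must be doubled; otherwise "\1" is read as an octal character escape rather than a backreference.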
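
The backreference idea can be sketched as a complete function. This is an illustration under a few assumptions rather than the answer's exact code: the function name getTextBetweenTagsBackref is hypothetical, the tag name is passed through preg_quote, the attribute part uses [^>]* instead of .*?, and the s modifier is added so the content may span lines.

```php
<?php
// Hypothetical helper illustrating the backreference approach.
function getTextBetweenTagsBackref(string $string, string $tagname): ?string {
    // Escape the tag name so it is safe inside the pattern.
    $tag = preg_quote($tagname, '#');

    // Group 1 captures the tag name; \1 in the closing tag refers back to it.
    // [^>]* skips any attributes; group 2 is the text we want.
    // Single quotes keep \b and \1 intact for the regex engine.
    $pattern = '#<(' . $tag . ')\b[^>]*>(.*?)</\1>#s';

    if (preg_match($pattern, $string, $matches) === 1) {
        return $matches[2];
    }
    return null;
}

$str = '<textformat leading="2"><p align="left"><font size="10">get me</font></p></textformat>';
echo getTextBetweenTagsBackref($str, 'font'), "\n";
```

Returning null when nothing matches avoids the undefined-index notice that a bare `return $matches[1];` raises on a failed match.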
