Get everything between <tag> and </tag> with PHP

Question

Asked by Nate

I'm trying to grab a string within a string using regex.

I've looked and looked, but I can't get any of the examples I've found to work.

I need to grab the HTML tags <code> and </code> and everything in between them.

Then I need to pull the matched string from the parent string, do operations on both, and put the matched string back into the parent string.

Here's my code:

$content = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. <code>Donec sed erat vel diam ultricies commodo. Nunc venenatis tellus eu quam suscipit quis fermentum dolor vehicula.</code>";
$regex = '';
$code = preg_match($regex, $content, $matches);

I've already tried these without success:

$regex = "/<code\s*(.*)\>(.*)<\/code>/";
$regex = "/<code>(.*)<\/code>/";

Answer 1

Answered by piouPiouM

You can use the following:

$regex = '#<\s*?code\b[^>]*>(.*?)</code\b[^>]*>#s';

\b ensures that a typo (like <codeS>) is not captured.
The first pattern [^>]* captures the content of a tag with attributes (e.g. a class).
Finally, the s flag makes . match newlines too, so content that spans several lines is captured.

See the result here: http://lumadis.be/regex/test_regex.php?id=1081
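
As a quick check of the three points above, here is a minimal sketch that runs the pattern from this answer against a made-up string. The sample markup and the variable names other than $regex are assumptions for illustration only.

```php
<?php
// Pattern copied from the answer above.
$regex = '#<\s*?code\b[^>]*>(.*?)</code\b[^>]*>#s';

// Made-up sample: one real <code> tag with an attribute and a newline inside,
// plus a <codeS> "typo" tag that should be ignored.
$html = "Intro <code class=\"php\">echo 'hi';\necho 'bye';</code> outro <codeS>not me</codeS>";

$found = preg_match($regex, $html, $matches);

// $matches[0] is the whole tag pair, $matches[1] is only the inner content.
$whole = $matches[0];
$inner = $matches[1];

// preg_match_all() confirms that <codeS> is not counted as a match.
$count = preg_match_all($regex, $html, $all);

echo $inner, PHP_EOL;
```

Because of \b, only the real <code> element is matched, and because of the s flag the captured content keeps its line break.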

Answer 2

Answered by Joe

$regex = '#<code>(.*?)</code>#';

I'm using # as the delimiter instead of / because then we don't need to escape the / in </code>.
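
To illustrate the delimiter point, this small sketch writes the same pattern both ways and compares the results. The sample string and variable names are made up for the example.

```php
<?php
// Two spellings of the same pattern: only the delimiter (and the escaping it forces) differs.
$withSlash = '/<code>(.*?)<\/code>/'; // "/" delimiter: the "/" in </code> must be escaped
$withHash  = '#<code>(.*?)</code>#';  // "#" delimiter: no escaping needed

$sample = 'before <code>inside</code> after'; // made-up sample string

preg_match($withSlash, $sample, $a);
preg_match($withHash, $sample, $b);

// Both patterns produce the same match array.
var_dump($a === $b);
```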

As Phoenix posted below, .*? is used to make the .* ("anything") match as few characters as possible before it comes across a </code> (this is known as a "non-greedy quantifier"). That way, if your string is

<code>hello</code> something <code>again</code>

you'll match hello and again instead of just matching hello</code> something <code>again.
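
The difference can be seen side by side with preg_match_all(). This is a minimal sketch using the example string from this answer; the variable names are assumptions.

```php
<?php
$string = '<code>hello</code> something <code>again</code>';

// Greedy: .* runs on to the last </code> in the string.
preg_match_all('#<code>(.*)</code>#', $string, $greedy);

// Non-greedy: .*? stops at the first </code> after each <code>.
preg_match_all('#<code>(.*?)</code>#', $string, $lazy);

print_r($greedy[1]); // one element: "hello</code> something <code>again"
print_r($lazy[1]);   // two elements: "hello" and "again"
```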

Answer 3

Answered by moni as

This function worked for me:

<?php

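function everything_in_tags($string, $tagname)
{
    // Escape the tag name so it is treated literally inside the pattern.
    $tag = preg_quote($tagname, '#');
    $pattern = "#<\s*?$tag\b[^>]*>(.*?)</$tag\b[^>]*>#s";
    // Return the inner content of the first match, or null when nothing matches.
    return preg_match($pattern, $string, $matches) ? $matches[1] : null;
}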

?>

Answer 4

Answered by Alberto

You can use /<code>([\s\S]*)<\/code>/msU. This catches newlines too: [\s\S] matches any character, including line breaks, and the U modifier makes the * quantifier non-greedy.
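
The following sketch shows what the U modifier changes for this pattern. The sample text is made up, and the m and s modifiers are left out because this particular pattern has no ^, $ or . for them to affect.

```php
<?php
$text = "<code>line one\nline two</code> and <code>more</code>"; // made-up sample

// With U, the plain * behaves lazily, so each <code> pair is matched separately.
preg_match_all('/<code>([\s\S]*)<\/code>/U', $text, $withU);

// Without U, the same * is greedy and swallows everything up to the last </code>.
preg_match_all('/<code>([\s\S]*)<\/code>/', $text, $withoutU);

print_r($withU[1]);
print_r($withoutU[1]);
```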

Answer 5

Answered by Nate

function contentDisplay($text)
{
    //replace UTF-8
    $convertUT8 = array("\xe2\x80\x98", "\xe2\x80\x99", "\xe2\x80\x9c", "\xe2\x80\x9d", "\xe2\x80\x93", "\xe2\x80\x94", "\xe2\x80\xa6");
    $to = array("'", "'", '"', '"', '-', '--', '...');
    $text = str_replace($convertUT8,$to,$text);

    //replace Windows-1252
    $convertWin1252 = array(chr(145), chr(146), chr(147), chr(148), chr(150), chr(151), chr(133));
    $to = array("'", "'", '"', '"', '-', '--', '...');
    $text = str_replace($convertWin1252,$to,$text);

    //replace accents
    $convertAccents = array('À', 'Á', 'Â', 'Ã', 'Ä', 'Å', 'Æ', 'Ç', 'È', 'É', 'Ê', 'Ë', 'Ì', 'Í', 'Î', 'Ï', 'Ð', 'Ñ', 'Ò', 'Ó', 'Ô', 'Õ', 'Ö', 'Ø', 'Ù', 'Ú', 'Û', 'Ü', 'Ý', 'ß', 'à', 'á', 'â', 'ã', 'ä', 'å', 'æ', 'ç', 'è', 'é', 'ê', 'ë', 'ì', 'í', 'î', 'ï', 'ñ', 'ò', 'ó', 'ô', 'õ', 'ö', 'ø', 'ù', 'ú', 'û', 'ü', 'ý', 'ÿ', 'Ā', 'ā', 'Ă', 'ă', 'Ą', 'ą', 'Ć', 'ć', 'Ĉ', 'ĉ', 'Ċ', 'ċ', 'Č', 'č', 'Ď', 'ď', 'Đ', 'đ', 'Ē', 'ē', 'Ĕ', 'ĕ', 'Ė', 'ė', 'Ę', 'ę', 'Ě', 'ě', 'Ĝ', 'ĝ', 'Ğ', 'ğ', 'Ġ', 'ġ', 'Ģ', 'ģ', 'Ĥ', 'ĥ', 'Ħ', 'ħ', 'Ĩ', 'ĩ', 'Ī', 'ī', 'Ĭ', 'ĭ', 'Į', 'į', 'İ', 'ı', 'Ĳ', 'ĳ', 'Ĵ', 'ĵ', 'Ķ', 'ķ', 'Ĺ', 'ĺ', 'Ļ', 'ļ', 'Ľ', 'ľ', 'Ŀ', 'ŀ', 'Ł', 'ł', 'Ń', 'ń', 'Ņ', 'ņ', 'Ň', 'ň', 'ŉ', 'Ō', 'ō', 'Ŏ', 'ŏ', 'Ő', 'ő', 'Œ', 'œ', 'Ŕ', 'ŕ', 'Ŗ', 'ŗ', 'Ř', 'ř', 'Ś', 'ś', 'Ŝ', 'ŝ', 'Ş', 'ş', 'Š', 'š', 'Ţ', 'ţ', 'Ť', 'ť', 'Ŧ', 'ŧ', 'Ũ', 'ũ', 'Ū', 'ū', 'Ŭ', 'ŭ', 'Ů', 'ů', 'Ű', 'ű', 'Ų', 'ų', 'Ŵ', 'ŵ', 'Ŷ', 'ŷ', 'Ÿ', 'Ź', 'ź', 'Ż', 'ż', 'Ž', 'ž', 'ſ', 'ƒ', 'Ơ', 'ơ', 'Ư', 'ư', 'Ǎ', 'ǎ', 'Ǐ', 'ǐ', 'Ǒ', 'ǒ', 'Ǔ', 'ǔ', 'Ǖ', 'ǖ', 'Ǘ', 'ǘ', 'Ǚ', 'ǚ', 'Ǜ', 'ǜ', 'Ǻ', 'ǻ', 'Ǽ', 'ǽ', 'Ǿ', 'ǿ');
    $to = array('A', 'A', 'A', 'A', 'A', 'A', 'AE', 'C', 'E', 'E', 'E', 'E', 'I', 'I', 'I', 'I', 'D', 'N', 'O', 'O', 'O', 'O', 'O', 'O', 'U', 'U', 'U', 'U', 'Y', 's', 'a', 'a', 'a', 'a', 'a', 'a', 'ae', 'c', 'e', 'e', 'e', 'e', 'i', 'i', 'i', 'i', 'n', 'o', 'o', 'o', 'o', 'o', 'o', 'u', 'u', 'u', 'u', 'y', 'y', 'A', 'a', 'A', 'a', 'A', 'a', 'C', 'c', 'C', 'c', 'C', 'c', 'C', 'c', 'D', 'd', 'D', 'd', 'E', 'e', 'E', 'e', 'E', 'e', 'E', 'e', 'E', 'e', 'G', 'g', 'G', 'g', 'G', 'g', 'G', 'g', 'H', 'h', 'H', 'h', 'I', 'i', 'I', 'i', 'I', 'i', 'I', 'i', 'I', 'i', 'IJ', 'ij', 'J', 'j', 'K', 'k', 'L', 'l', 'L', 'l', 'L', 'l', 'L', 'l', 'l', 'l', 'N', 'n', 'N', 'n', 'N', 'n', 'n', 'O', 'o', 'O', 'o', 'O', 'o', 'OE', 'oe', 'R', 'r', 'R', 'r', 'R', 'r', 'S', 's', 'S', 's', 'S', 's', 'S', 's', 'T', 't', 'T', 't', 'T', 't', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'W', 'w', 'Y', 'y', 'Y', 'Z', 'z', 'Z', 'z', 'Z', 'z', 's', 'f', 'O', 'o', 'U', 'u', 'A', 'a', 'I', 'i', 'O', 'o', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'U', 'u', 'A', 'a', 'AE', 'ae', 'O', 'o');
    $text = str_replace($convertAccents,$to,$text);

    //Encode the characters
    $text = htmlentities($text);

    //normalize the line breaks (here because it applies to all text)
    $text = str_replace("\r\n", "\n", $text);
    $text = str_replace("\r", "\n", $text);

    //decode the <code> tags (strpos() can return 0 for a match at the very start, so compare against false)
    $codeOpen = htmlentities('<').'code'.htmlentities('>');
    if (strpos($text, $codeOpen) !== false)
    {
        $text = str_replace($codeOpen, '<code>', $text);
    }
    $codeClose = htmlentities('<').'/code'.htmlentities('>');
    if (strpos($text, $codeClose) !== false)
    {
        $text = str_replace($codeClose, '</code>', $text);
    }

    //match everything between <code> and </code>, the msU is what makes this work here, ADD this to REGEX archive
    $regex = '/<code>(.*)<\/code>/msU';
    $code = preg_match($regex, $text, $matches);
    if ($code == 1)
    {
        //a successful match always fills $matches[1]; keep its line breaks as <br /> tags
        $newcode = nl2br($matches[1]);

        //remove <code>and this</code> from $text;
        $text = str_replace('<code>' . $matches[1] . '</code>', 'PLACEHOLDERCODE1', $text);

        //convert the line breaks to paragraphs
        $text = '<p>' . str_replace("\n\n", '</p><p>', $text) . '</p>';
        $text = str_replace("\n" , '<br />', $text);
        $text = str_replace('</p><p>', '</p>' . "\n\n" . '<p>', $text);

        //put the code block back in place of the placeholder
        $text = str_replace('PLACEHOLDERCODE1', '<code>'.$newcode.'</code>', $text);
    }
    else
    {
        $code = false;
    }

    if ($code == false)
    {
        //convert the line breaks to paragraphs
        $text = '<p>' . str_replace("\n\n", '</p><p>', $text) . '</p>';
        $text = str_replace("\n" , '<br />', $text);
        $text = str_replace('</p><p>', '</p>' . "\n\n" . '<p>', $text);
    }

    return $text;
}
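
The function above handles one code block by swapping it for a placeholder and swapping it back. A different technique, sketched here and not taken from the answer, is to split the text with preg_split() and PREG_SPLIT_DELIM_CAPTURE, so that any number of <code> blocks and the text around them can be processed separately and then rejoined. The helper name processWithCodeBlocks, its parameters, and the sample input are all hypothetical.

```php
<?php
/**
 * Hypothetical helper (not from the answer): apply one callback to the text
 * outside <code> blocks and another to the content inside them, then reassemble.
 */
function processWithCodeBlocks(string $text, callable $outside, callable $inside): string
{
    // The capture group keeps each <code>...</code> block in the result,
    // so the array alternates: outside, block, outside, block, ...
    $parts = preg_split('#(<code>.*?</code>)#s', $text, -1, PREG_SPLIT_DELIM_CAPTURE);

    foreach ($parts as $i => $part) {
        if ($i % 2 === 1) {
            // Odd index: a whole <code>...</code> block. Strip the tags, transform, re-wrap.
            $content = substr($part, strlen('<code>'), -strlen('</code>'));
            $parts[$i] = '<code>' . $inside($content) . '</code>';
        } else {
            // Even index: text outside any code block.
            $parts[$i] = $outside($part);
        }
    }

    return implode('', $parts);
}

// Made-up usage: upper-case the surrounding text, keep line breaks inside the code.
$input  = "first line\nsecond <code>x = 1;\ny = 2;</code> tail";
$output = processWithCodeBlocks($input, 'strtoupper', 'nl2br');
echo $output;
```

The callbacks here are only placeholders; in the setting of this answer, the outside callback would do the paragraph conversion and the inside callback would call nl2br().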

Answer 6

Answered by Milind Singh

You can also try:

function getTagValue($string, $tag)
{
    $pattern = "/<{$tag}>(.*?)<\/{$tag}>/s";
    preg_match($pattern, $string, $matches);
    return isset($matches[1]) ? $matches[1] : '';
}

It returns an empty string if there is no match.
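
getTagValue() returns only the first match. If every occurrence is needed, a variant along the following lines might work. The name getAllTagValues is hypothetical, and the use of preg_quote() is an addition that is not part of the answer.

```php
<?php
/**
 * Hypothetical companion to getTagValue(): returns the content of every
 * <$tag>...</$tag> pair, or an empty array when there is none.
 */
function getAllTagValues(string $string, string $tag): array
{
    // preg_quote() keeps characters in $tag from being read as regex syntax.
    $quoted  = preg_quote($tag, '/');
    $pattern = "/<{$quoted}>(.*?)<\/{$quoted}>/s";
    preg_match_all($pattern, $string, $matches);

    // $matches[1] holds the captured content of each pair.
    return $matches[1];
}

$values = getAllTagValues('<b>one</b> and <b>two</b>', 'b');
$none   = getAllTagValues('no tags here', 'b');

print_r($values);
print_r($none);
```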
