Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/1364974/

Time: 2020-08-25 02:13:00  Source: igfitidea

PHP regular expression to remove tags in HTML document

php regex preg-replace html-parsing

Asked by Señor Reginold Francis

Say I have the following text

..(content).............
<A HREF="http://foo.com/content" >blah blah blah </A>
...(continue content)...

I want to delete the link: that is, remove the <a> tag while keeping the text in between. How do I do this with a regular expression (since the URLs will all be different)?

Much thanks

Answered by nickf

This will remove all tags:

preg_replace("/<.*?>/", "", $string);

This will remove just the <a> tags:

// The "i" modifier makes the match case-insensitive, so <A HREF=...> is removed too.
$string = preg_replace('/<\/?a(\s+.*?>|>)/i', '', $string);
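
As a quick check, the sketch below applies both patterns to a sample modeled on the question. The `i` modifier is an addition here (an assumption about the intent), since the question's markup uses uppercase `<A HREF=...>`, which a case-sensitive `a` would not match. The sample string and variable names are illustrative.

```php
<?php
// Sample input modeled on the question (uppercase tag and attribute names).
$string = 'see <A HREF="http://foo.com/content" >blah blah blah </A>and <b>more</b>';

// Remove every tag: a lazy match from "<" to the next ">".
$noTags = preg_replace('/<.*?>/', '', $string);

// Remove only <a ...> and </a>; the "i" modifier is added so <A ...> matches too.
$noAnchors = preg_replace('/<\/?a(\s+.*?>|>)/i', '', $string);

echo $noTags, "\n";    // see blah blah blah and more
echo $noAnchors, "\n"; // see blah blah blah and <b>more</b>
```

Both patterns assume no attribute value contains a literal `>`; markup that does will be cut short at that character.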

Answered by soulmerge

Avoid regular expressions whenever you can, especially when processing XML. In this case you can use strip_tags() or SimpleXML, depending on your string.
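
A minimal sketch of the strip_tags() route, assuming the input is an HTML fragment held in a string (the sample and variable names are illustrative). By default strip_tags() removes all tags; its optional second argument lists tags to keep, so removing only anchors means allowing every other tag you expect.

```php
<?php
$html = '<p>see <A HREF="http://foo.com/content" >blah blah blah </A>and <b>more</b></p>';

// Strip every tag, keeping only the text.
$textOnly = strip_tags($html);

// Keep <p> and <b>; anything not listed (including <a>) is removed.
$withoutAnchors = strip_tags($html, '<p><b>');

echo $textOnly, "\n";       // see blah blah blah and more
echo $withoutAnchors, "\n"; // <p>see blah blah blah and <b>more</b></p>
```

The allow-list approach only fits when the set of tags to keep is known in advance.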

Answered by karim79

<?php
//example to extract the innerText from all anchors in a string
include('simple_html_dom.php');

$html = str_get_html('<A HREF="http://foo.com/content" >blah blah blah </A><A HREF="http://foo.com/content" >blah blah blah </A>');

//print the text of each anchor    
foreach($html->find('a') as $e) {
    echo $e->innertext;
}
?>

See PHP Simple DOM Parser.
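
For comparison, here is a sketch that uses PHP's bundled DOM extension (DOMDocument) instead of the Simple HTML DOM library, and that replaces each anchor with its text rather than printing it. This is a swapped-in technique, not the answer's own code; the sample markup and variable names are assumptions.

```php
<?php
// Assumption: the input is an HTML fragment with a single root element.
$html = '<p>see <A HREF="http://foo.com/content" >blah blah blah </A>and more</p>';

$doc = new DOMDocument();
// These flags keep libxml from wrapping the fragment in <html>/<body> and adding a doctype.
$doc->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

// Copy the live node list first; replacing nodes while iterating it would skip elements.
$anchors = iterator_to_array($doc->getElementsByTagName('a'));

foreach ($anchors as $a) {
    // Swap each anchor for a text node holding its text content.
    $a->parentNode->replaceChild($doc->createTextNode($a->textContent), $a);
}

$result = trim($doc->saveHTML());
echo $result, "\n";
```

Note that using textContent also drops any markup nested inside the anchor; moving the child nodes instead would preserve it.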

Answered by Rufinus

Not pretty but does the job:

// Case-insensitive variants, so </A> and <A HREF=...> are handled as well.
$data = str_ireplace('</a>', '', $data);
$data = preg_replace('/<a[^>]+href[^>]+>/i', '', $data);

Answered by MIV1987

strip_tags() can also be used.

Please see examples here.

Answered by Paulo Peres Junior

$pattern = '/href="([^"]*)"/';
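
The answer gives only the pattern. One plausible use (an assumption, since the answer does not say) is capturing the href values with preg_match_all(), as sketched below with illustrative sample markup. The pattern is case-sensitive and expects double quotes, so an uppercase `HREF` as in the question would need an `i` modifier added.

```php
<?php
$pattern = '/href="([^"]*)"/';

// Illustrative input with lowercase, double-quoted href attributes.
$html = '<a href="http://foo.com/one">first</a> and <a href="http://foo.com/two">second</a>';

// $matches[1] collects the text captured by the first group for every match.
preg_match_all($pattern, $html, $matches);
$urls = $matches[1];

print_r($urls);
```

This extracts the URLs but does not remove anything; stripping the tags still needs one of the other approaches.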

Answered by SoN9ne

I use this to replace the anchors with a text string...

function replaceAnchorsWithText($data) {
    $regex  = '/<a\s+';                // Start of anchor tag (whitespace required, so <abbr> is not matched)
    $regex .= '[^>]*?';                // Any attributes that may or may not exist before href
    $regex .= 'href\s*=\s*[\'"]\s*(?P<link>[^\'"]*?)\s*[\'"]'; // Grab the link
    $regex .= '[^>]*>\s*';             // Any remaining attributes, then the end of the opening tag
    $regex .= '(?P<name>.*?)';         // Grab the name (may contain spaces)
    $regex .= '\s*<\/a\s*>/is';        // Closing anchor tag (case insensitive, dot matches newlines)

    // A closure is used as the callback, so this also works as a plain function (PHP 5.3+).
    return preg_replace_callback($regex, function (array $match) {
        // This is what will replace the link (modify to your liking)
        return "{$match['name']}({$match['link']})";
    }, $data);
}

Answered by nandocurty

Use str_replace.
