Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/758806/

Date: 2020-08-24 23:46:54  Source: igfitidea

php regex to remove HTML

Tags: php, html, regex

Asked by Me1000

Before we start, strip_tags() doesn't work.

Now, I've got some data that needs to be parsed. The problem is that I need to get rid of all the HTML, which has been formatted very strangely. The tags look like this (notice the spaces):

< p > blah blah blah < / p > < a href= " link.html " > blah blah blah < /a >

None of the regexes I've tried are working, and I don't know enough about regex syntax to fix them. I don't care about preserving anything inside the tags, and I would prefer to get rid of the text inside a link if I could.

Anyone have any idea?

(I really need to just sit down and learn regular expressions one day)

Answered by chaos

Does

preg_replace('/<[^>]*>/', '', $content)

work?
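
As a rough check, the pattern can be run against the sample from the question. This is only a sketch: the variable names, the optional link-removal step, and the whitespace clean-up at the end are assumptions added for illustration, not part of the answer.

```php
<?php
// Sample input copied from the question.
$content = '< p > blah blah blah < / p > < a href= " link.html " > blah blah blah < /a >';

// Optional step (assumption): drop whole <a>...</a> elements, link text included,
// since the asker would prefer to lose it. \s* tolerates the stray spaces.
$withoutLinks = preg_replace('/<\s*a\b[^>]*>.*?<\s*\/\s*a\s*>/is', '', $content);

// Remove every remaining "<...>" run, using the pattern from the answer.
$stripped = preg_replace('/<[^>]*>/', '', $withoutLinks);

// Collapse the leftover whitespace (extra step, not in the answer).
$clean = trim(preg_replace('/\s+/', ' ', $stripped));

echo $clean, "\n"; // prints "blah blah blah"
```

Skipping the link-removal step keeps the link text, so the same input would end up as "blah blah blah blah blah blah".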

Answered by Slobodan

strip_tags() will work if you call html_entity_decode() on the variable before passing it to strip_tags():

<?php
$text = '< p > blah blah blah < / p > < a href= " link.html " > blah blah blah< /a >';
echo strip_tags(html_entity_decode($text));
?>

Answered by strager

Solution which isn't fool-proof, but will work for what you posted:

s/<[^>]*>//g
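
The substitution above is written in sed/Perl style. A possible PHP spelling, together with one input where it is not fool-proof, might look like the sketch below. The helper name removeTags and the second test string are hypothetical, chosen only to illustrate the limitation.

```php
<?php
// PHP equivalent of s/<[^>]*>//g; preg_replace() replaces all matches by default.
// removeTags is a hypothetical helper name, not something from the answer.
function removeTags(string $html): string
{
    return preg_replace('/<[^>]*>/', '', $html);
}

// Works for the kind of markup posted in the question.
$posted = '< p > blah blah blah < / p >';
echo removeTags($posted), "\n"; // prints " blah blah blah "

// Not fool-proof: a ">" inside an attribute value ends the match too early.
$tricky = '<a title="a > b">link</a>';
echo removeTags($tricky), "\n"; // prints ' b">link'
```

The second case is why a real HTML parser is usually the safer choice when the input is not under your control.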

Answered by cletus

Formatted strangely? That is valid HTML though, right? In that case I wouldn't touch it with regular expressions. Examples of how this can go wrong and why it's a bad idea are legion. Instead I'd use HTML Tidy on it to, for example, clean up unnecessary white-space.

Answered by Ian

http://ca3.php.net/strip_tags is probably what you need.

Answered by Srikar Doddi

Try this out and let me know.

<?php
$text = '< p > blah blah blah < / p > < a href= " link.html " > blah blah blah< /a >';
echo strip_tags($text);
echo "\n";
echo strip_tags($text, '<p><a>');
?>