PHP RegExp to strip HTML comments

Disclaimer: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/1084741/

Date: 2020-08-25 00:53:46  Source: igfitidea

RegExp to strip HTML comments

php html regex

Asked by James Brooks

Looking for a regexp sequence of matches and replaces (preferably PHP but doesn't matter) to change this (the start and end is just random text that needs to be preserved).

IN:

fkdshfks khh fdsfsk 
<!--g1-->
<div class='codetop'>CODE: AutoIt</div>
<div class='geshimain'>
    <!--eg1-->
    <div class="autoit" style="font-family:monospace;">
        <span class="kw3">msgbox</span>
    </div>
    <!--gc2-->
    <!--bXNnYm94-->
    <!--egc2-->
    <!--g2-->
</div>
<!--eg2-->
fdsfdskh

to this OUT:

fkdshfks khh fdsfsk 
<div class='codetop'>CODE: AutoIt</div>
<div class='geshimain'>
    <div class="autoit" style="font-family:monospace;">
        <span class="kw3">msgbox</span>
    </div>
</div>
fdsfdskh

Thanks.

Answered by Paul Tomblin

Are you just trying to remove the comments? How about

s/<!--[^>]*-->//g

or the slightly better (suggested by the questioner himself):

<!--(.*?)-->

But remember, HTML is not regular, so using regular expressions to parse it will lead you into a world of hurt when somebody throws bizarre edge cases at it.
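One way around that caveat is to let an HTML parser find the comment nodes instead of a regex. The following is only a minimal sketch, assuming PHP's bundled DOM extension is available and that the input is a complete document rather than a fragment; the function name strip_comments_dom is made up for illustration.

```php
<?php
// Hypothetical helper: removes comment nodes with a parser instead of a regex.
function strip_comments_dom(string $html): string
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);

    // Copy the node list into an array before changing the tree.
    foreach (iterator_to_array($xpath->query('//comment()')) as $comment) {
        $comment->parentNode->removeChild($comment);
    }

    return $doc->saveHTML();
}

echo strip_comments_dom('<html><body><p>a<!-- x > y -->b</p></body></html>');
```

Two things to keep in mind with this approach: saveHTML() re-serializes the whole document, so the markup outside the comments may not come back byte-for-byte identical, and conditional comments are ordinary comment nodes to the parser, so they are removed as well.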

Answered by Benoit Villière

preg_replace('/<!--(.*)-->/Uis', '', $html)

This PHP code removes all HTML comments from the $html string.
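In that call, the U modifier inverts greediness so that .* stops at the first -->, the s modifier lets . match newlines, and the i modifier (case-insensitive) has no effect on this pattern. A short sketch that puts the same call next to the more common lazy-quantifier spelling; the wrapper names and the sample string are illustrative only:

```php
<?php
// Same call as in the answer: U makes ".*" lazy, s lets "." match "\n".
function strip_with_ungreedy_flag($html)
{
    return preg_replace('/<!--(.*)-->/Uis', '', $html);
}

// Equivalent spelling without U: the quantifier itself is written lazy.
function strip_with_lazy_quantifier($html)
{
    return preg_replace('/<!--.*?-->/s', '', $html);
}

$sample = "a<!-- one -->b<!--\ntwo\n-->c";

echo strip_with_ungreedy_flag($sample), "\n";   // abc
echo strip_with_lazy_quantifier($sample), "\n"; // abc
```

Without U (or the lazy ?), the greedy .* would run from the first <!-- to the last -->, and the "b" between the two comments would be removed too.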

Answered by Eugen Mihailescu

A better version would be:

(?=<!--)([\s\S]*?)-->

It matches html comments like these:

<!--
multi line html comment
-->

or

<!-- single line html comment -->

and, most importantly, it matches comments like this (not every regex in the other answers covers this case):

<!-- this is my blog: <mynixworld.inf> -->
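The difference is easiest to see by running two of the patterns on that last comment. A small sketch (the variable names and the surrounding "x"/"y" text are illustrative): the [^>]* pattern from the first answer cannot cross the > inside the comment, while [\s\S]*? can.

```php
<?php
$comment = 'x<!-- this is my blog: <mynixworld.inf> -->y';

// [^>]* stops at the first ">" inside the comment, so nothing is removed.
$negated = preg_replace('/<!--[^>]*-->/', '', $comment);

// [\s\S]*? matches any character, including ">" and newlines.
$anyChar = preg_replace('/(?=<!--)([\s\S]*?)-->/', '', $comment);

var_dump($negated === $comment); // bool(true)
var_dump($anyChar === 'xy');     // bool(true)
```

The lazy <!--(.*?)--> pattern also handles this input, as long as the comment stays on one line or the s modifier is set.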

Note

Although the line below is syntactically an HTML comment, a browser may treat it as a conditional comment, so it can carry special meaning. Stripping such strings might break your page.

<!--[if !(IE 8) ]><!-->

Answered by Pierre Wahlgren

Do not forget to consider conditional comments, as

<!--(.*?)-->

will remove them. Try this instead:

<!--[^\[](.*?)-->

This will also remove downlevel-revealed conditional comments, though.

EDIT:

This won't remove downlevel-revealed or downlevel-hidden comments.

<!--(?!<!)[^\[>].*?-->
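To see what the two revisions do, here is a small sketch that runs both on a downlevel-hidden comment, a downlevel-revealed comment, and an ordinary comment. The sample strings and variable names are made up for illustration.

```php
<?php
$hidden   = '<!--[if IE 8]><p>IE 8 only</p><![endif]-->';
$revealed = '<!--[if !IE]><!--><p>not IE</p><!--<![endif]-->';
$plain    = '<p>kept</p><!-- removed -->';

$first  = '/<!--[^\[](.*?)-->/';        // first suggestion
$second = '/<!--(?!<!)[^\[>].*?-->/';   // pattern from the EDIT

// The first pattern keeps the downlevel-hidden block ...
var_dump(preg_replace($first, '', $hidden) === $hidden);           // bool(true)
// ... but cuts the downlevel-revealed block down to its opening marker.
var_dump(preg_replace($first, '', $revealed) === '<!--[if !IE]>'); // bool(true)

// The second pattern leaves both conditional forms alone
// and still removes the ordinary comment.
var_dump(preg_replace($second, '', $hidden) === $hidden);          // bool(true)
var_dump(preg_replace($second, '', $revealed) === $revealed);      // bool(true)
var_dump(preg_replace($second, '', $plain) === '<p>kept</p>');     // bool(true)
```

Neither pattern uses the s modifier, so comments that span several lines are not matched by either one.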

Answered by James Brooks

Ah I've done it,

<!--(.*?)-->
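Replacing each match of that pattern with an empty string removes the comments, but the indentation and line break of every line that held only a comment stay behind, so the result does not quite match the OUT block in the question. A minimal two-pass sketch that also drops those lines, assuming such comments sit on a line of their own; the function name is hypothetical and the input is a shortened version of the question's sample.

```php
<?php
// Hypothetical helper: strips comments and the lines they leave empty.
function strip_comment_lines(string $html): string
{
    // Pass 1: a comment that fills a whole line is removed together with its
    // indentation and line break. (?:(?!-->).)* keeps the match inside a
    // single comment, so text between two comments is never swallowed.
    $html = preg_replace('/^[ \t]*<!--(?:(?!-->).)*-->[ \t]*\R/ms', '', $html) ?? $html;

    // Pass 2: any remaining inline comments.
    return preg_replace('/<!--.*?-->/s', '', $html) ?? $html;
}

$in = "text \n<!--g1-->\n<div>\n    <!--eg1-->\n"
    . "    <span>msgbox</span>\n</div>\n<!--eg2-->\nend";

echo strip_comment_lines($in);
```

Inline comments are still removed by the second pass, but the whitespace around them is left as it was.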

Answered by Hadrian

Try the following if your comments contain line breaks:

/<!--(.|\n)*?-->/g

Answered by Toshinou Kyouko

<!--([\s\S]*?)-->

This also works in JavaScript and VBScript, because "." does not match line breaks in every language.
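The same applies to PHP's preg functions: "." skips newlines unless the s modifier is set. A quick sketch comparing the spellings from this thread on a multi-line comment (the sample string and variable names are illustrative):

```php
<?php
$html = "a<!--\nmulti\nline\n-->b";

$dotOnly     = preg_replace('/<!--.*?-->/', '', $html);      // "." stops at "\n"
$dotAll      = preg_replace('/<!--.*?-->/s', '', $html);     // s lets "." match "\n"
$anyClass    = preg_replace('/<!--[\s\S]*?-->/', '', $html); // no modifier needed
$alternation = preg_replace('/<!--(.|\n)*?-->/', '', $html); // "." or "\n"

var_dump($dotOnly === $html);    // bool(true): nothing was removed
var_dump($dotAll === 'ab');      // bool(true)
var_dump($anyClass === 'ab');    // bool(true)
var_dump($alternation === 'ab'); // bool(true)
```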

Answered by Alexandr Kondrashov

Here is my attempt:

<!--(?!<!)[^\[>][\s\S]*?-->

This will also remove multi line comments and won't remove downlevel-revealed or downlevel-hidden comments.
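A minimal sketch of that pattern in use, on a document that mixes a multi-line conditional comment with a multi-line ordinary comment. The function name and the sample markup (including shim.js) are made up for illustration.

```php
<?php
// Hypothetical wrapper around the pattern above.
function strip_non_conditional_comments(string $html): string
{
    // Falls back to the input if PCRE reports an error.
    return preg_replace('/<!--(?!<!)[^\[>][\s\S]*?-->/', '', $html) ?? $html;
}

$html = "<!--[if lt IE 9]>\n<script src=\"shim.js\"></script>\n<![endif]-->\n"
      . "<p>text</p><!--\n multi-line\n note\n-->";

// Prints the conditional block unchanged, followed by <p>text</p>.
echo strip_non_conditional_comments($html);
```

One input this pattern does not handle cleanly is an empty comment: in '<!----><p>x</p><!-- y -->' the class [^\[>] has to consume one of the dashes, so the match runs on to the next --> and the paragraph in between is removed as well.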

Answered by davlem

Using the following pattern:

/( )*<!--((.*)|[^<]*|[^!]*|[^-]*|[^>]*)-->\n*/g

It can remove multi-line comments, as in this test string:

fkdshfks khh fdsfsk 
<!--g1-->
<div class='codetop'>CODE: AutoIt</div>
    <div class='geshimain'>
    <!--eg1-->
    <div class="autoit" style="font-family:monospace;">
        <span class="kw3">msgbox</span>
    </div>
    <!--gc2-->
    <!--bXNnYm94-->
    <!--egc2-->
    <!--g2-->
</div>
<!--eg2-->
fdsfdskh

<!-- --
> test
- -->

<!-- --
<- test <
>
- -->

<!--
test !<
- <!--
-->

<script type="text/javascript">//<![CDATA[
    var xxx = 'a';   
    //]]></script>

ok

Answered by TurkiM

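function remove_html_comments($html) {
   // Match every comment, including multi-line ones; rhc() decides what to keep.
   $expr = '/<!--[\s\S]*?-->/';
   return preg_replace_callback($expr, 'rhc', $html);
}

// Callback: keep conditional comments ([if ...] / [endif]), drop all others.
function rhc($search) {
   list($l) = $search;
   if (stripos($l, '[if') !== false || stripos($l, '[endif') !== false) {
      return $l;
   }
   return '';
}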