Notice: this page is based on a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/2101409/

Time: 2020-08-25 05:02:02  Source: igfitidea

How to replace text over multiple lines using preg_replace

php

Asked by Elitmiar

Hi, I have the following content within an HTML page that stretches over multiple lines:

<div class="c-fc c-bc" id="content">
                <span class="content-heading c-hc">Heading 1 </span><br />
                The Home Page must provide a introduction to the services provided.<br />
                <br />
                <span class="c-sc">Sub Heading</span><br />
                The Home Page must provide a introduction to the services provided.<br />
                <br />
                <span class="c-sc">Sub Heading</span><br /> 
                The Home Page must provide a introduction to the services provided.<br />
            </div>

I need to replace everything between <div class="c-fc c-bc" id="content"> and </div> with custom text.

I use the following code to accomplish this. It works if everything is on one line, but it fails when the content spans multiple lines:

$body = file_get_contents('../../templates/'.$val['url']);

$body = preg_replace('/<div class=\"c\-fc c\-bc\" id=\"content\">(.*)<\/div>/','<div class="c-fc c-bc" id="content">abc</div>',$body);

Am I missing something?

Answered by Mark Byers

If this weren't HTML, I'd tell you to use the DOTALL modifier (s) to change the meaning of . from 'match everything except a newline' to 'match everything':

$body = preg_replace('/<div class="c-fc c-bc" id="content">(.*)<\/div>/s', '<div class="c-fc c-bc" id="content">abc</div>', $body);
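
To make the effect of the s modifier concrete, here is a minimal sketch that runs the same pattern with and without it. The sample string and the variable names are made up for illustration; they are not from the question.

```php
<?php
// Hypothetical sample input: a div whose content spans several lines.
$body = "<div id=\"content\">\nline one\nline two\n</div>";

$replacement = '<div id="content">abc</div>';

// Without the s modifier, . does not match "\n", so the pattern cannot
// reach </div> and the input comes back unchanged.
$withoutS = preg_replace('/<div id="content">(.*)<\/div>/', $replacement, $body);

// With the s modifier, . also matches "\n", so the whole block is replaced.
$withS = preg_replace('/<div id="content">(.*)<\/div>/s', $replacement, $body);

var_dump($withoutS === $body); // bool(true): nothing was replaced
echo $withS, "\n";             // <div id="content">abc</div>
```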

But this is HTML, so use an HTML parser instead.
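
The answer does not name a particular parser. One option is PHP's bundled DOM extension, and the sketch below assumes that choice. The sample markup is a shortened, hypothetical stand-in for the template file from the question.

```php
<?php
// Hypothetical stand-in for the template file read in the question.
$body = '<html><body><div class="c-fc c-bc" id="content">
    <span class="content-heading c-hc">Heading 1</span><br />
    Old text.<br />
</div><p>Footer</p></body></html>';

$doc = new DOMDocument();
// Collect libxml parse warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$doc->loadHTML($body);
libxml_clear_errors();

// Locate the target element by its id attribute.
$xpath = new DOMXPath($doc);
$div = $xpath->query('//div[@id="content"]')->item(0);

if ($div !== null) {
    // Remove every existing child node, then add the custom text.
    while ($div->firstChild) {
        $div->removeChild($div->firstChild);
    }
    $div->appendChild($doc->createTextNode('abc'));
}

$result = $doc->saveHTML();
echo $result;
```

Because the parser works on the element tree, line breaks and nested div elements inside the target do not affect the result. Note that saveHTML() serializes the whole document, so the output may differ from the input in details such as an added doctype.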

Answered by Will Earp

It is the "s" flag; it allows . to match newlines.

Answered by Al.

It is possible to use a regex to strip out chunks of HTML, but you need to wrap the HTML in custom tags, which browsers ignore. For example:

<?php
$html='
<div>This will be shown</div>
<custom650 rel="nofollow">
  <p class="subformedit">
    <a href="#" class="mylink">Link</a>
    <div class="morestuff">
      ... more html in here ...
    </div>
  </p>
</custom650>
<div>This will also be shown</div>
';

To strip the elements that carry the rel="nofollow" attribute, you can use the following regex:

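// \1 refers back to the tag name captured by ([^\s]+), so the closing tag must match the opening one.
$newhtml = preg_replace('/<([^\s]+)[^>]*rel="nofollow"[^>]*>.*?<\/\1>/si', '', $html);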

From experience, start the custom tags on a new line. This is undoubtedly a hack, but it might help someone.

Answered by meistermuh

You can also use [\s\S] to match everything, instead of combining . with the DOTALL flag s, because [\s\S] means exactly the same thing: \s matches all whitespace characters (including newlines) and \S matches everything that is not a whitespace character (i.e. everything else). In some regular expression implementations, this works better than enabling DOTALL.
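
A small sketch of the [\s\S] approach follows, using a made-up two-line input. Note that the pattern carries no s modifier.

```php
<?php
// Hypothetical two-line input.
$text = "<b>first\nsecond</b>";

// No s modifier here: the class [\s\S] matches newlines on its own.
$result = preg_replace('/<b>[\s\S]*<\/b>/', '<b>abc</b>', $text);

echo $result, "\n"; // <b>abc</b>
```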

Caution: .* with the DOTALL flag and [\s\S]* are both greedy and will consume as much of the string as they can. If you want them to stop at a certain position (e.g. the first </div>), put the non-greedy operator ? after your quantifier, e.g. .*?
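
The difference is easiest to see with two blocks in one string. This is a minimal sketch on made-up input:

```php
<?php
// Hypothetical input containing two separate div blocks.
$html = "<div>\none\n</div>\n<div>\ntwo\n</div>";

// Greedy: .* runs to the last </div>, so both blocks collapse into one.
$greedy = preg_replace('/<div>.*<\/div>/s', '<div>abc</div>', $html);

// Lazy: .*? stops at the first </div>, so each block is replaced separately.
$lazy = preg_replace('/<div>.*?<\/div>/s', '<div>abc</div>', $html);

echo $greedy, "\n"; // one replaced block
echo $lazy, "\n";   // two replaced blocks, still separated by a newline
```

The lazy form still stops at the first </div> it meets, so it cuts a block short when div elements are nested inside it.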
