Original question: http://stackoverflow.com/questions/709669/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
How do I remove blank lines from text in PHP?
Asked by StoneHeart
I need to remove blank lines (with whitespace or absolutely blank) in PHP. I use this regular expression, but it does not work:
$str = ereg_replace('^[ \t]*$\r?\n', '', $str);
$str = preg_replace('^[ \t]*$\r?\n', '', $str);
I want text such as:
blahblah

blahblah

adsa

sad asdasd
to become:
blahblah
blahblah
adsa
sad asdasd
Answered by Michael Wales
// A newline is required to keep the non-blank lines separated
$string = preg_replace("/(^[\r\n]*|[\r\n]+)[\s\t]*[\r\n]+/", "\n", $string);
The above regular expression says:
/(^[\r\n]*|[\r\n]+)[\s\t]*[\r\n]+/
  1st capturing group (^[\r\n]*|[\r\n]+)
    1st alternative: ^[\r\n]*
      ^ asserts position at the start of the string
      [\r\n]* matches a single character present in the list below
        Quantifier: between zero and unlimited times, as many times as possible, giving back as needed [greedy]
        \r matches a carriage return (ASCII 13)
        \n matches a line-feed (newline) character (ASCII 10)
    2nd alternative: [\r\n]+
      [\r\n]+ matches a single character present in the list below
        Quantifier: between one and unlimited times, as many times as possible, giving back as needed [greedy]
        \r matches a carriage return (ASCII 13)
        \n matches a line-feed (newline) character (ASCII 10)
  [\s\t]* matches a single character present in the list below
    Quantifier: between zero and unlimited times, as many times as possible, giving back as needed [greedy]
    \s matches any whitespace character [\r\n\t\f ]
    \t matches a tab (ASCII 9)
  [\r\n]+ matches a single character present in the list below
    Quantifier: between one and unlimited times, as many times as possible, giving back as needed [greedy]
    \r matches a carriage return (ASCII 13)
    \n matches a line-feed (newline) character (ASCII 10)
Answered by Alan Moore
Your ereg_replace() solution is wrong because the ereg/eregi functions are deprecated (and were removed in PHP 7.0). Your preg_replace() won't even compile, but if you add delimiters and set multiline mode, it will work fine:
$str = preg_replace('/^[ \t]*[\r\n]+/m', '', $str);
The m modifier allows ^ to match the beginning of a logical line rather than just the beginning of the whole string. The start-of-line anchor is necessary because without it the regex would match the newline at the end of every line, not just the blank ones. You don't need the end-of-line anchor ($) because you're actively matching the newline characters, but it doesn't hurt.
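To make the role of the anchor concrete, here is a minimal sketch that runs the corrected pattern with and without ^ on a made-up sample string. The variable names and the sample text are assumptions for illustration only.

```php
<?php
// Hypothetical sample: one empty line and one whitespace-only line.
$sample = "blahblah\n\nblahblah\n \t\nadsa\n";

// With ^ and the m modifier, only blank lines are matched and removed.
$cleaned = preg_replace('/^[ \t]*[\r\n]+/m', '', $sample);

// Without the anchor, the newline ending every line matches too,
// so all lines are glued together.
$joined = preg_replace('/[ \t]*[\r\n]+/', '', $sample);

var_dump($cleaned); // "blahblah\nblahblah\nadsa\n"
var_dump($joined);  // "blahblahblahblahadsa"
```

The second call shows why the anchor matters: dropping it removes every line break, not just the blank lines.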
The accepted answer gets the job done, but it's more complicated than it needs to be. The regex has to match either the beginning of the string (^[\r\n]*, multiline mode not set) or at least one newline ([\r\n]+), followed by at least one newline ([\r\n]+). So, in the special case of a string that starts with one or more blank lines, they'll be replaced with one blank line. I'm pretty sure that's not the desired outcome.
But most of the time it replaces two or more consecutive newlines, along with any horizontal whitespace (spaces or tabs) that lies between them, with one linefeed. That's the intent, anyway. The author seems to expect \s to match just the space character (\x20), when in fact it matches any whitespace character. That's a very common mistake. The actual list varies from one regex flavor to the next, but at minimum you can expect \s to match whatever [ \t\f\r\n] matches.
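Both observations can be checked with a short sketch. The sample strings and variable names below are assumptions chosen for illustration; the pattern is the one from the accepted answer.

```php
<?php
// \s is not limited to the space character: it also matches line breaks and tabs.
$newlineIsSpace = preg_match('/\A\s\z/', "\n"); // 1
$tabIsSpace     = preg_match('/\A\s\z/', "\t"); // 1

// The accepted answer's pattern applied to text that starts with blank lines.
$pattern = "/(^[\r\n]*|[\r\n]+)[\s\t]*[\r\n]+/";
$result  = preg_replace($pattern, "\n", "\n\nfirst\n\n\nsecond");

var_dump($result); // "\nfirst\nsecond": one leading blank line is left behind
```

The interior blank lines collapse as intended, but the leading ones are replaced by a single newline rather than removed.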
Actually, in PHP you have a better option:
$str = preg_replace('/^\h*\v+/m', '', $str);
\h matches any horizontal whitespace character, and \v matches vertical whitespace.
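As a quick illustration, the sketch below applies the \h/\v pattern to a made-up string that mixes Windows (\r\n) and Unix (\n) line endings. The sample text and variable names are assumptions.

```php
<?php
// Hypothetical sample: a whitespace-only line ending in \r\n and an empty line ending in \n.
$sample = "one\r\n \t\r\ntwo\n\nthree\n";

// \h* consumes spaces and tabs, \v+ consumes the line break(s) that follow.
$cleaned = preg_replace('/^\h*\v+/m', '', $sample);

var_dump($cleaned); // "one\r\ntwo\nthree\n"
```

The line endings of the remaining lines are left untouched, so mixed input stays mixed.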
Answered by Ben
Just explode the lines of the text into an array, remove the empty lines using array_filter, and implode the array again.
$tmp = explode("\n", $str);
$tmp = array_filter($tmp);
$str = implode("\n", $tmp);
Or in one line:
$str = implode("\n", array_filter(explode("\n", $str)));
I haven't measured it, but this may be faster than preg_replace().
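One caveat worth checking before relying on this approach: array_filter() without a callback drops every falsy entry, so a line containing only "0" is removed, while a whitespace-only line is kept. The sketch below contrasts the default behavior with a trim()-based callback. The sample text and variable names are assumptions, and the callback is an illustrative variation, not part of the original answer.

```php
<?php
// Hypothetical sample: an empty line, a line holding "0", and a whitespace-only line.
$str = "blahblah\n\n0\n   \nadsa";

// Default filter: removes "" and "0", keeps the whitespace-only line.
$default = implode("\n", array_filter(explode("\n", $str)));

// Callback variant: a line is kept unless it is empty after trimming whitespace.
$strict = implode("\n", array_filter(
    explode("\n", $str),
    function ($line) {
        return trim($line) !== '';
    }
));

var_dump($default); // "blahblah\n   \nadsa"
var_dump($strict);  // "blahblah\n0\nadsa"
```

Because trim() also strips \r, the callback variant treats lines left over from \r\n input as blank as well.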
Answered by Dan Power
The comment from Bythos on Jamie's link above worked for me:
/^\n+|^[\t\s]*\n+/m
I didn't want to strip all of the new lines, just the empty/whitespace ones. This does the trick!
Answered by Nauman Tahir
Answered by Jamie
Use this:
// Delimiters and the m modifier are needed for the pattern to compile and to match line by line
$str = preg_replace('/^\s*\r?\n/m', '', $str);
Answered by Paul
There is no need to overcomplicate things. This can be achieved with a simple, short regular expression:
$text = preg_replace("/(\R){2,}/", "$1", $text);
The (\R) matches all newlines.
The {2,} matches two or more occurrences.
The $1 uses the first backreference (platform-specific EOL) as the replacement.
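A small sketch of how this behaves, using $1 as the replacement as described above. The sample strings and variable names are assumptions for illustration.

```php
<?php
// Hypothetical sample mixing \r\n and \n line endings.
$text = "one\r\n\r\n\r\ntwo\n\nthree";

// Each run of two or more line breaks collapses to the last break captured by (\R).
$collapsed = preg_replace('/(\R){2,}/', '$1', $text);

// A line that contains only a space separates the breaks, so nothing matches here.
$untouched = preg_replace('/(\R){2,}/', '$1', "a\n \nb");

var_dump($collapsed); // "one\r\ntwo\nthree"
var_dump($untouched); // "a\n \nb"
```

The second call suggests that this pattern handles strictly empty lines only; whitespace-only lines would need one of the \h-based or [ \t]-based patterns shown earlier.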
Answered by user12047752
<?php
// Return only the lines that contain at least one character above ASCII 32 (space).
function del_blanklines_in_array_q($ar){
    $strip = array();
    foreach($ar as $k => $v){
        $ll = strlen($v);
        // Scan the line from its end, looking for a non-blank character
        while($ll--){
            if(ord($v[$ll]) > 32){ // 0x20 (decimal 32) is the ASCII space
                $strip[] = $v; break;
            }
        }
    }
    return $strip;
}
// $in: input filename, $out: output filename
function del_blanklines_in_file_q($in, $out){
    $strip = del_blanklines_in_array_q(file($in));
    file_put_contents($out, $strip);
}
Answered by mamal
$file = "file_name.txt";
$file_data = file_get_contents($file);
$file_data_after_remove_blank_line = preg_replace("/(^[\r\n]*|[\r\n]+)[\s\t]*[\r\n]+/", "\n", $file_data );
file_put_contents($file,$file_data_after_remove_blank_line);
Answered by mpen
function trimblanklines($str) {
    return preg_replace('`\A[ \t]*\r?\n|\r?\n[ \t]*\Z`', '', $str);
}
This one only removes them from the beginning and end, not the middle (if anyone else was looking for this).
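A brief usage sketch, repeating the function so the example is self-contained. The sample string and variable names are assumptions.

```php
<?php
function trimblanklines($str) {
    return preg_replace('`\A[ \t]*\r?\n|\r?\n[ \t]*\Z`', '', $str);
}

// Hypothetical sample: a blank first line, a blank line in the middle,
// and a whitespace-only last line.
$sample  = "\n  hello\n\nworld\n \t";
$trimmed = trimblanklines($sample);

var_dump($trimmed); // "  hello\n\nworld"
```

The indentation of the first line and the blank line in the middle are both preserved; only the blank lines at the two ends are dropped.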

