Original source: http://stackoverflow.com/questions/15494780/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
Remove all whitespaces from String but keep ONE newline
Asked by friesoft
I have this input String (containing tabs, spaces, and line breaks):
That is a test.
seems to work pretty good? working.
Another test again.
[Edit]: I should have provided the String for better testing, as Stack Overflow removes all special characters (tabs, ...):
String testContent = "\n\t\n\t\t\t\n\t\t\tDas ist ein Test.\t\t\t \n\tsoweit scheint das \t\tganze zu? funktionieren.\n\n\n\n\t\t\n\t\t\n\t\t\t \n\t\t\t \n \t\t\t\n \tNoch ein Test.\n \t\n \t\n \t";
And I want to reach this state:
That is a test.
seems to work pretty good? working.
Another test again.
String expectedOutput = "Das ist ein Test.\nsoweit scheint das ganze zu? funktionieren.\nNoch ein Test.\n";
Any ideas? Can this be achieved using regexes?
replaceAll("\\s+", " ")
is NOT what I'm looking for. If this regex preserved exactly one of the existing newlines, it would be perfect.
I have tried this, but it seems suboptimal to me:
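// Requires java.io.BufferedReader and java.io.StringReader; readLine() can throw IOException.
BufferedReader bufReader = new BufferedReader(new StringReader(testContent));
String line = null;
StringBuilder newString = new StringBuilder();
while ((line = bufReader.readLine()) != null) {
    // Collapse every run of whitespace inside the line to a single space.
    String temp = line.replaceAll("\\s+", " ");
    // Skip lines that are empty once trimmed.
    if (!temp.trim().equals("")) {
        newString.append(temp.trim());
        newString.append("\n");
    }
}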
Answered by Marko Topolnik
In a single regex (plus a small patch for tabs):
input.replaceAll("^\\s+|\\s+$|\\s*(\n)\\s*|(\\s)\\s*", "$1$2")
     .replace("\t", " ");
The regex looks daunting, but in fact decomposes nicely into these parts that are OR-ed together:
^\s+ : match whitespace at the beginning;
\s+$ : match whitespace at the end;
\s*(\n)\s* : match whitespace containing a newline, and capture that newline;
(\s)\s* : match whitespace, capturing the first whitespace character.
The result will be a match with two capture groups, but only one of the groups may be non-empty at a time. This allows me to replace the match with "$1$2", which means "concatenate the two capture groups."
The only remaining problem is that I can't replace a tab with a space using this approach, so I fix that up with a simple non-regex character replacement.
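For anyone who wants to try this out, here is a minimal runnable sketch of the single-regex approach described above. The class name SingleRegexDemo, the method name normalize, and the sample input are my own illustrative assumptions, not part of the answer; the pattern is the same four alternatives, just split across lines so each part can carry a comment.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; its name, method name, and sample input are illustrative only.
final class SingleRegexDemo {
    // The four alternatives from the answer, OR-ed together.
    private static final Pattern WHITESPACE = Pattern.compile(
            "^\\s+"              // leading whitespace: both groups stay unset
            + "|\\s+$"           // trailing whitespace: both groups stay unset
            + "|\\s*(\\n)\\s*"   // run containing a newline: group 1 holds one newline
            + "|(\\s)\\s*");     // any other run: group 2 holds its first character

    static String normalize(String input) {
        // An unset group contributes nothing, so "$1$2" yields at most one character.
        return WHITESPACE.matcher(input).replaceAll("$1$2")
                .replace('\t', ' '); // group 2 may have captured a tab
    }

    public static void main(String[] args) {
        String sample = "  a \t b\n\n \t c  ";
        System.out.println(normalize(sample).equals("a b\nc")); // prints "true"
    }
}
```

Note that the trailing-whitespace alternative also consumes a final newline, so the result of this sketch does not end with "\n".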
Answered by MBO
In 4 steps:
text
    // 1. compress all non-newline whitespace to a single space
    .replaceAll("[\\s&&[^\n]]+", " ")
    // 2. remove spaces from the beginning or end of lines
    .replaceAll("(?m)^\\s|\\s$", "")
    // 3. compress multiple newlines to a single newline
    .replaceAll("\n+", "\n")
    // 4. remove newlines from the beginning or end of the string
    .replaceAll("^\n|\n$", "")
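The four steps can be wrapped in a method for quick testing. The sketch below is a variant, not the answer's exact code: step 2 matches a literal space instead of \s, on the assumption that only plain spaces and newlines remain after step 1, so that step 2 cannot consume a newline. The class name FourStepDemo, the method name normalize, and the sample input are hypothetical.

```java
// Hypothetical demo class; its name, method name, and sample input are illustrative only.
final class FourStepDemo {
    static String normalize(String text) {
        return text
                // 1. collapse every run of non-newline whitespace to one space
                .replaceAll("[\\s&&[^\\n]]+", " ")
                // 2. drop the single space left at the start or end of a line
                //    (assumption: a literal space is sufficient after step 1)
                .replaceAll("(?m)^ | $", "")
                // 3. collapse consecutive newlines to one
                .replaceAll("\\n+", "\n")
                // 4. drop a newline at the very start or end of the string
                .replaceAll("^\\n|\\n$", "");
    }

    public static void main(String[] args) {
        String sample = "  a \t b\n\n \t c  ";
        System.out.println(normalize(sample).equals("a b\nc")); // prints "true"
    }
}
```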
Answered by wds
If I understand correctly, you simply want to replace a succession of newlines with one newline. So replace \n\n* with \n (with appropriate flags). If there is a lot of whitespace in the lines, simply remove the whitespace (^\s\s*$ with multiline mode) first, then replace the newlines.
Edit: The only issue here is that some newlines might remain here and there, so you have to be careful to first collapse spaces, then fix the empty line problem. You can trim it down further into probably a single regex, but it's easier to read with these three:
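// Requires java.util.regex.Pattern
Pattern spaces = Pattern.compile("[\t ]+");
Pattern emptyLines = Pattern.compile("^\\s+$?", Pattern.MULTILINE);
Pattern newlines = Pattern.compile("\\s*\\n+");
System.out.print(
    newlines.matcher(
        emptyLines.matcher(
            spaces.matcher(input).replaceAll(" ")   // 1. collapse tabs and spaces
        ).replaceAll("")                            // 2. remove blank lines and leading whitespace
    ).replaceAll("\n"));                            // 3. collapse trailing whitespace plus newlines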
Answered by denis.solonenko
Why don't you do this?
// split, removeSpacesInEachLine, and join are placeholder helpers, not JDK methods
String[] lines = split(s, "\n");
String[] noExtraSpaces = removeSpacesInEachLine(lines);
String result = join(noExtraSpaces, "\n");
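One possible way to fill in those three helpers with the standard library is sketched below. This is an assumption about what the pseudocode intends, not the answer's own implementation: the class name SplitJoinDemo and the method name normalize are hypothetical, and dropping empty lines is an extra step added here so that consecutive newlines collapse to one.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Hypothetical demo class; its name, method name, and sample input are illustrative only.
final class SplitJoinDemo {
    static String normalize(String s) {
        return Arrays.stream(s.split("\n"))                        // split into lines
                .map(line -> line.trim().replaceAll("\\s+", " "))  // clean up each line
                .filter(line -> !line.isEmpty())                   // assumption: drop blank lines
                .collect(Collectors.joining("\n"));                // join with single newlines
    }

    public static void main(String[] args) {
        String sample = "  a \t b\n\n \t c  ";
        System.out.println(normalize(sample).equals("a b\nc")); // prints "true"
    }
}
```

Because the lines are joined rather than terminated, the result has no trailing newline; append one if the expected output requires it.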
Answered by Maroun
First replace all runs of newlines with one newline, then replace the spaces but not the newlines. Finally, remove all whitespace from the beginning of the string:
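String test = " This is a real\n\n\n\n\n\n\n\n\n test !!\n\n\n bye";
// Collapse consecutive newlines to one.
test = test.replaceAll("\n+", "\n");
// Collapse each whitespace run that does not start at a newline to a single space.
test = test.replaceAll("((?!\n+)\\s+)", " ");
// Remove whitespace from the beginning of the string, as described above.
test = test.replaceAll("^\\s+", "");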
Output:
This is a real
test !!
bye