Java: remove a pattern from a string using regex
Original question: http://stackoverflow.com/questions/31774415/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Asked by D.Shefer
I need to remove the following substrings from my string:
- \n
- \uXXXX (X being a digit or a character)
e.g. "OR\n\nThe Central Site Engineering\u2019s \u201cfrontend\u201d, where developers turn to"
-> "OR The Central Site Engineering frontend , where developers turn to"
I tried using the String method replaceAll, but I don't know how to handle the \uXXXX part, and it also didn't work for \n:
String s = "\n";
data=data.replaceAll(s," ");
What would this regex look like in Java?
Thanks for the help.
Answered by Roel Strolenberg
I guess it is best to do this in two parts:
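// Assumes the data contains the literal text backslash-n and backslash-u escapes, not real control characters.
String ex = "OR\\n\\nThe Central Site Engineering\\u2019s \\u201cfrontend\\u201d, where developers turn to";
String part1 = ex.replaceAll("\\\\n", " "); // regex \\n: an escaped backslash followed by the letter n
String part2 = part1.replaceAll("\\\\u[0-9a-fA-F]{4}", ""); // escaped backslash, u, then four hex digits
System.out.println(part2);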
Try it =)
Answered by Pshemo
The problem with string.replaceAll("\\n", " "); is that replaceAll expects a regular expression, and \ is a special character in regex, used for instance to create character classes like \d (which represents digits) or to escape regex special characters like +. The regex \n therefore matches a line-feed character, not a backslash followed by n.
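The point above can be checked with a small sketch. The class and method names here are hypothetical, chosen only for illustration; the sketch assumes the input may contain either a real line feed or the two-character text backslash-n.

```java
final class NewlineRegexDemo {
    // Hypothetical helper: applies the pattern discussed above, replaceAll("\\n", " ").
    static String replaceRegexNewline(String input) {
        // The Java literal "\\n" is the two-character regex \n, which matches a line feed.
        return input.replaceAll("\\n", " ");
    }

    public static void main(String[] args) {
        String realLineFeed = "OR\nThe";  // contains an actual line-feed character
        String literalText = "OR\\nThe";  // contains a backslash followed by the letter n
        System.out.println(replaceRegexNewline(realLineFeed)); // prints: OR The
        System.out.println(replaceRegexNewline(literalText));  // prints the input unchanged: OR\nThe
    }
}
```

If the second case is what the data looks like, the pattern needs an escaped backslash.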
So if you want to match a literal \ in Java's regex, you need to escape it twice:
- once in regex: \\
- and once in the String literal: "\\\\"
as in replaceAll("\\\\n"," ").
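As a minimal sketch of the two escaping levels (class and method names are hypothetical), the following shows that the Java literal "\\\\n" is only three characters long once compiled, and that as a regex it matches the text backslash-n:

```java
final class DoubleEscapeDemo {
    // Hypothetical helper: replaces each literal backslash-n sequence with a space.
    static String replaceLiteralBackslashN(String input) {
        // Java literal "\\\\n" -> regex \\n -> matches one backslash followed by n.
        return input.replaceAll("\\\\n", " ");
    }

    public static void main(String[] args) {
        String regex = "\\\\n";
        System.out.println(regex.length()); // prints: 3 (two backslashes and an n)

        String raw = "OR\\n\\nThe Central"; // the text OR, backslash-n twice, then The Central
        // Each backslash-n becomes one space, so two spaces separate OR and The.
        System.out.println(replaceLiteralBackslashN(raw));
    }
}
```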
You can also skip the regex escaping entirely and use the replace method, which treats its arguments as literal text rather than as a regular expression, like
replace("\\n"," ")
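A short comparison, with hypothetical class and method names, of three ways to replace the literal text backslash-n: String.replace, replaceAll with a hand-escaped pattern, and replaceAll with Pattern.quote doing the escaping. Under the assumption that the input holds literal backslash-n text, all three should give the same result.

```java
import java.util.regex.Pattern;

final class LiteralReplaceDemo {
    // Literal replacement: no regex involved, so only Java string escaping is needed.
    static String viaReplace(String input) {
        return input.replace("\\n", " ");
    }

    // Regex replacement with the backslash escaped by hand.
    static String viaReplaceAll(String input) {
        return input.replaceAll("\\\\n", " ");
    }

    // Regex replacement where Pattern.quote turns the literal text into a safe pattern.
    static String viaQuotedPattern(String input) {
        return input.replaceAll(Pattern.quote("\\n"), " ");
    }

    public static void main(String[] args) {
        String raw = "a\\nb"; // the text a, backslash, n, b
        System.out.println(viaReplace(raw));       // prints: a b
        System.out.println(viaReplaceAll(raw));    // prints: a b
        System.out.println(viaQuotedPattern(raw)); // prints: a b
    }
}
```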
Now to remove \uXXXX we can use
replaceAll("\\\\u[0-9a-fA-F]{4}","")
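To illustrate why a hexadecimal character class is used here rather than \d, this sketch (hypothetical names) compares the two patterns on text containing literal escapes such as backslash-u201c, where the last of the four characters is a letter.

```java
final class UnicodeEscapeDemo {
    // Digits only: misses escapes whose four characters include a hex letter.
    static String removeWithDigits(String input) {
        return input.replaceAll("\\\\u\\d{4}", "");
    }

    // Hex digits: covers every four-character hexadecimal code.
    static String removeWithHex(String input) {
        return input.replaceAll("\\\\u[0-9a-fA-F]{4}", "");
    }

    public static void main(String[] args) {
        // Literal text: Engineering, escape 2019, s, space, escape 201c, frontend, escape 201d
        String raw = "Engineering\\u2019s \\u201cfrontend\\u201d";
        System.out.println(removeWithDigits(raw)); // only the 2019 escape is removed
        System.out.println(removeWithHex(raw));    // prints: Engineerings frontend
    }
}
```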
Also remember that Strings are immutable, so a str.replace.. call doesn't change the value of str; it creates a new String. So if you want to store that new string in str, you need to use
str = str.replace(..)
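A minimal sketch of the immutability point, with hypothetical method names: the first method ignores the value returned by replace, the second assigns it back.

```java
final class ImmutableStringDemo {
    // The result of replace is thrown away, so the returned string is the original.
    static String discardResult(String str) {
        str.replace("\\n", " ");
        return str;
    }

    // The result is assigned back, so the returned string has the replacement applied.
    static String keepResult(String str) {
        str = str.replace("\\n", " ");
        return str;
    }

    public static void main(String[] args) {
        String raw = "a\\nb"; // the text a, backslash, n, b
        System.out.println(discardResult(raw)); // prints the input unchanged: a\nb
        System.out.println(keepResult(raw));    // prints: a b
    }
}
```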
So your solution can look like
String text = "\"OR\\n\\nThe Central Site Engineering\\u2019s \\u201cfrontend\\u201d, where developers turn to\"";
text = text.replaceAll("(\\\\n)+"," ")
           .replaceAll("\\\\u[0-9A-Fa-f]{4}", "");
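If the cleanup runs many times, the same two steps could be wrapped in a helper with precompiled patterns. This is only a sketch under the assumption that the input holds literal backslash escapes; EscapeStripper and strip are hypothetical names, not part of the answer.

```java
import java.util.regex.Pattern;

final class EscapeStripper {
    // One or more consecutive literal backslash-n sequences.
    private static final Pattern NEWLINE_RUN = Pattern.compile("(?:\\\\n)+");
    // A literal backslash, the letter u, and four hex digits.
    private static final Pattern UNICODE_ESCAPE = Pattern.compile("\\\\u[0-9a-fA-F]{4}");

    // Collapses each run of backslash-n into one space, then drops the unicode escapes.
    static String strip(String text) {
        String withoutNewlines = NEWLINE_RUN.matcher(text).replaceAll(" ");
        return UNICODE_ESCAPE.matcher(withoutNewlines).replaceAll("");
    }

    public static void main(String[] args) {
        String text = "OR\\n\\nThe Central Site Engineering\\u2019s \\u201cfrontend\\u201d, where developers turn to";
        // prints: OR The Central Site Engineerings frontend, where developers turn to
        System.out.println(strip(text));
    }
}
```

Note that removing the escape for the apostrophe leaves "Engineerings"; if the trailing s should go too, the pattern would need to be extended.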