Java: Constructing a regex pattern to match a sentence
Original URL: http://stackoverflow.com/questions/20320719/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
Stack Overflow
Constructing regex pattern to match sentence
Asked by user1923
I'm trying to write a regex pattern that will match any sentence that begins with one or more tabs and/or spaces. For example, I want my pattern to match " hello there I like regex!", but I'm scratching my head over how to match the words after "hello". So far I have this:
String REGEX = "(?s)(\\p{Blank}+)([a-z][ ])*";
Pattern PATTERN = Pattern.compile(REGEX);
Matcher m = PATTERN.matcher(" asdsada adf adfah.");
if (m.matches()) {
    System.out.println("hurray!");
}
Any help would be appreciated. Thanks.
Accepted answer by Steve P.
String regex = "^\\s+[A-Za-z,;'\"\\s]+[.?!]$";
^ means "begins with"
\\s means whitespace
+ means 1 or more
[A-Za-z,;'"\\s] means any letter, comma (,), semicolon (;), single quote ('), double quote ("), or whitespace character
$ means "ends with"
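As a rough illustration of how the pieces listed above combine, here is a minimal sketch that runs the pattern against a few inputs. The class name AcceptedAnswerDemo, the helper isSentence, and the sample strings are assumptions made for this example and are not part of the original answer.

```java
import java.util.regex.Pattern;

class AcceptedAnswerDemo {
    // Pattern from the answer, with backslashes doubled for a Java string literal.
    static final Pattern SENTENCE = Pattern.compile("^\\s+[A-Za-z,;'\"\\s]+[.?!]$");

    // Hypothetical helper: true when the whole input matches the pattern.
    static boolean isSentence(String input) {
        return SENTENCE.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isSentence("  hello there I like regex!")); // leading spaces, ends with '!'
        System.out.println(isSentence("\thello, world."));             // leading tab, comma allowed
        System.out.println(isSentence("hello there."));                // no leading whitespace
        System.out.println(isSentence("  version 2 is out."));         // digit is not in the character class
    }
}
```

The first two calls should print true and the last two false. Digits and other punctuation are rejected because the character class only lists letters, a few punctuation marks, and whitespace.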
Answer by Taylor Hx
An example regex to match sentences by the definition "A sentence is a series of characters, starting with at least one whitespace character, that ends in one of ., ! or ?" is as follows:
\s+[^.!?]*[.!?]
Note that newline characters will also be included in this match.
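To make the note about newlines concrete, the following sketch applies the pattern with Matcher.find() to a short multi-line string. The class name FindSentencesDemo, the helper findSentences, and the sample text are assumptions for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindSentencesDemo {
    // Pattern from the answer, escaped for a Java string literal.
    static final Pattern SENTENCE = Pattern.compile("\\s+[^.!?]*[.!?]");

    // Hypothetical helper: collects every match found in the text, in order.
    static List<String> findSentences(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = SENTENCE.matcher(text);
        while (m.find()) {
            found.add(m.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String text = " First one.\n Second\nline! Third?";
        for (String s : findSentences(text)) {
            // Brackets make leading whitespace and embedded newlines visible.
            System.out.println("[" + s + "]");
        }
    }
}
```

With this input the second match begins with the newline that follows "First one." and also contains the newline between "Second" and "line", because both \s and the negated class [^.!?] accept line terminators.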
Answer by hwnd
Based on what you asked for, the following will work.
String s = " hello there I like regex!";
Pattern p = Pattern.compile("^\\s+[a-zA-Z\\s]+[.?!]$");
Matcher m = p.matcher(s);
if (m.matches()) {
    System.out.println("hurray!");
}
Answer by Ashish
If you are looking to match all strings that start with whitespace, you can try the regular expression "^\\s+.*".
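A minimal sketch of a leading-whitespace check, assuming the intended pattern is ^\s+.* (one or more whitespace characters followed by anything). The class name LeadingWhitespaceDemo, the helper startsWithWhitespace, and the use of Pattern.DOTALL are assumptions added for this example, not something the answer specifies.

```java
import java.util.regex.Pattern;

class LeadingWhitespaceDemo {
    // Assumed pattern: leading whitespace, then any remaining characters.
    // DOTALL (an assumption here) lets '.' match line terminators as well,
    // so multi-line input can still satisfy matches().
    static final Pattern LEADING_WS = Pattern.compile("^\\s+.*", Pattern.DOTALL);

    // Hypothetical helper: true when the input begins with whitespace.
    static boolean startsWithWhitespace(String input) {
        return LEADING_WS.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(startsWithWhitespace(" hello"));            // leading space
        System.out.println(startsWithWhitespace("\t\thello\nworld"));  // leading tabs, embedded newline
        System.out.println(startsWithWhitespace("hello "));            // whitespace only at the end
        System.out.println(startsWithWhitespace(""));                  // empty string
    }
}
```

Note that this only checks the prefix. It says nothing about whether the rest of the string looks like a sentence.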
Answer by Eloi Montanaro
String regex = "(?<=^|(\\.|!|\\?) |\n|\t|\r|\r\n) *\\(?[A-Z][^.!?]*((\\.|!|\\?)(?! |\n|\r|\r\n)[^.!?]*)*(\\.|!|\\?)(?= |\n|\r|\r\n)";
This matches any sentence that fits the definition 'a sentence starts with a capital letter and ends with a dot' (the pattern also accepts ! and ? as terminators).
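The sketch below shows one way this pattern might be used with Matcher.find() to pull sentences out of running text. The class name SentenceExtractorDemo, the helper extract, and the sample strings are assumptions for illustration. One behavior to keep in mind: the trailing lookahead requires a space or line break after the terminator, so a final sentence that ends exactly at the end of the input is not matched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SentenceExtractorDemo {
    // Pattern from the answer; \\. \\? and \\( are doubled for the Java string literal,
    // while \n, \t and \r are ordinary Java escapes for the literal characters.
    static final Pattern SENTENCE = Pattern.compile(
        "(?<=^|(\\.|!|\\?) |\n|\t|\r|\r\n) *\\(?[A-Z][^.!?]*"
        + "((\\.|!|\\?)(?! |\n|\r|\r\n)[^.!?]*)*(\\.|!|\\?)(?= |\n|\r|\r\n)");

    // Hypothetical helper: returns every sentence the pattern finds, in order.
    static List<String> extract(String text) {
        List<String> sentences = new ArrayList<>();
        Matcher m = SENTENCE.matcher(text);
        while (m.find()) {
            sentences.add(m.group());
        }
        return sentences;
    }

    public static void main(String[] args) {
        // Trailing space so that the last sentence satisfies the lookahead.
        System.out.println(extract("Hello world. This is a test! Is it? "));
        // A dot not followed by whitespace (as in "1.2") stays inside the sentence.
        System.out.println(extract("Version 1.2 is out. No trailing terminator"));
    }
}
```

If sentences at the very end of the input also need to match, one option would be to extend the final lookahead with an end-of-input alternative, but that is a change to the answer's pattern rather than part of it.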