Java: what is the pattern for an empty string?
Notice: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/3342298/
What is the pattern for empty string?
Asked by Roman
I need to validate input: valid values are either a number or an empty string. What is the corresponding regular expression?
String pattern = "\\d+|<what should be here?>";
UPD: please don't suggest "\d*"; I'm just curious how to express "empty string" in a regexp.
Answered by polygenelubricants
In this particular case, ^\d*$ would work, but generally speaking, to match pattern or an empty string, you can use:
^$|pattern
Explanation
^ and $ are the beginning-of-string and end-of-string anchors, respectively. | denotes alternation, e.g. this|that.
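As a minimal sketch of the alternation above, assuming \d+ stands in for pattern (the class and method names are illustrative and not part of the answer), the check could look like this in Java. The second alternative is anchored explicitly here because the sketch uses find(), which would otherwise accept any input that merely contains a digit.

```java
import java.util.regex.Pattern;

// Hypothetical helper, named for illustration only.
class EmptyOrNumber {
    // "^$" accepts the empty input; "^\\d+$" accepts one or more digits.
    private static final Pattern EMPTY_OR_NUMBER = Pattern.compile("^$|^\\d+$");

    static boolean isValid(String input) {
        // find() looks for a match anywhere, so the anchors do the real work.
        return EMPTY_OR_NUMBER.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(isValid(""));    // true
        System.out.println(isValid("123")); // true
        System.out.println(isValid("12a")); // false
    }
}
```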
Note on multiline mode
In the so-called multiline mode (Pattern.MULTILINE / (?m) in Java), ^ and $ match the beginning and end of each line instead. The anchors for the beginning and end of the string are then \A and \Z respectively. (In Java, \Z also matches just before a final line terminator; \z matches only at the very end of the input.)
If you're in multiline mode, then the empty string is matched by \A\Z instead. ^$ would match an empty line within the string.
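The difference can be sketched as follows. This is an illustration under stated assumptions rather than code from the answer: the class and method names are made up, and it uses \z (the strict end of input) in place of \Z so that a lone trailing newline is not treated as empty.

```java
import java.util.regex.Pattern;

// Hypothetical helper, named for illustration only.
class MultilineEmptyCheck {
    // In multiline mode, ^$ can match an empty line inside the input.
    private static final Pattern EMPTY_LINE = Pattern.compile("(?m)^$");
    // \A and \z refer to the whole input, whether or not (?m) is set.
    private static final Pattern EMPTY_INPUT = Pattern.compile("(?m)\\A\\z");

    static boolean hasEmptyLine(String input) {
        return EMPTY_LINE.matcher(input).find();
    }

    static boolean isEmptyInput(String input) {
        return EMPTY_INPUT.matcher(input).find();
    }

    public static void main(String[] args) {
        String text = "abc\n\ndef";
        System.out.println(hasEmptyLine(text)); // true: the second line is empty
        System.out.println(isEmptyInput(text)); // false: the input itself is not empty
        System.out.println(isEmptyInput(""));   // true
    }
}
```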
Examples
Here are some examples to illustrate the above points:
String numbers = "012345";
System.out.println(numbers.replaceAll(".", "<$0>"));
// <0><1><2><3><4><5>
System.out.println(numbers.replaceAll("^.", "<$0>"));
// <0>12345
System.out.println(numbers.replaceAll(".$", "<$0>"));
// 01234<5>
numbers = "012\n345\n678";
System.out.println(numbers.replaceAll("^.", "<$0>"));
// <0>12
// 345
// 678
System.out.println(numbers.replaceAll("(?m)^.", "<$0>"));
// <0>12
// <3>45
// <6>78
System.out.println(numbers.replaceAll("(?m).\\Z", "<$0>"));
// 012
// 345
// 67<8>
Note on Java matches
In Java, matches attempts to match a pattern against the entire string.
This is true for String.matches, Pattern.matches and Matcher.matches.
This means that anchors can sometimes be omitted for Java matches when they would otherwise be necessary in other flavors and/or other Java regex methods.
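A short sketch of that difference, assuming the unanchored pattern \d* and illustrative class and method names that do not appear in the answer:

```java
import java.util.regex.Pattern;

// Hypothetical helper, named for illustration only.
class MatchesVsFind {
    // Deliberately unanchored.
    private static final Pattern DIGITS = Pattern.compile("\\d*");

    static boolean wholeInputIsDigits(String input) {
        // matches() requires the pattern to cover the entire input.
        return DIGITS.matcher(input).matches();
    }

    static boolean findsMatchSomewhere(String input) {
        // find() succeeds on any substring, including an empty one.
        return DIGITS.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(wholeInputIsDigits(""));     // true
        System.out.println(wholeInputIsDigits("123"));  // true
        System.out.println(wholeInputIsDigits("12a"));  // false
        System.out.println(findsMatchSomewhere("12a")); // true
    }
}
```

With find(), the same pattern would need ^ and $ (or \A and \z) to reject "12a".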
Answered by KaptajnKold
/^\d*$/
Matches 0 or more digits with nothing before or after.
Explanation:
The '^' means start of line. '$' means end of line. '*' matches 0 or more occurrences. So the pattern matches an entire line with 0 or more digits.
Answered by Tim Pietzcker
To explicitly match the empty string, use \A\Z.
You will also often see ^$, which works fine unless the option is set that allows the ^ and $ anchors to match not only at the start or end of the string but also at the start/end of each line. If your input can never contain newlines, then of course ^$ is perfectly OK.
Some regex flavors don't support the \A and \Z anchors (notably JavaScript).
If you want to allow "empty" as in "nothing or only whitespace", then go for \A\s*\Z or ^\s*$.
Answered by unbeli
Just as a funny solution, you can do:
\d+|\d{0}
A digit, zero times. Yes, it does work.
Answered by bruziuz
One way to view the set of regular languages is as the closure of the following:
- The special < EMPTY_STRING > is a regular language
- Any symbol from the alphabet is a valid regular language
- Any concatenation and union of two valid regexps is a regular language
- Any union of two valid regular languages is a regular language
- Any transitive closure of a regexp is a regular language
A concrete regular language is a concrete element of this closure.
I didn't find an empty symbol in the POSIX standard to express the regular-language idea from step (1).
But there is an extra construct there, the question mark, which by the POSIX definition is the following:
(regexp|< EMPTY_STRING >)
So you can do the following in bash, perl, and python:
echo 9023 | grep -E "(1|90)?23"
perl -e "print 'PASS' if (qq(23) =~ /(1|90)?23/)"
python -c "import re; print(bool(re.match('^(1|90)?23$', '23')))"
Answered by Wiktor Stribiżew
To make any pattern that matches an entire string optional, i.e. to allow the pattern to match an empty string, use an optional group:
^(pattern)?$
^^       ^^^
If the regex engine allows it (as in Java), prefer a non-capturing group, since its main purpose is only to group subpatterns, not to keep the submatches captured:
^(?:pattern)?$
The ^ will match the start of the string (\A can be used for this in many flavors), $ will match the end of the string (\z can be used to match the very end in many flavors, Java included), and (...)? will match 1 or 0 (due to the ? quantifier) occurrences of the subpattern inside the parentheses.
A Java usage note: when used in matches(), the initial ^ and trailing $ can be omitted and you can use
String pattern = "(?:\\d+)?";
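The optional-group idea can be sketched as a small helper. The names optional and isEmptyOrMatches are hypothetical and chosen only for this illustration. Wrapping the pattern in (?:...) before appending ? matters when it contains a top-level alternation, since ? would otherwise apply only to the last element.

```java
import java.util.regex.Pattern;

// Hypothetical helper, named for illustration only.
class OptionalPattern {
    // Wraps an arbitrary regex in an optional non-capturing group.
    static Pattern optional(String regex) {
        return Pattern.compile("(?:" + regex + ")?");
    }

    static boolean isEmptyOrMatches(String regex, String input) {
        // matches() covers the whole input, so no ^ or $ is needed.
        return optional(regex).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isEmptyOrMatches("\\d+", ""));   // true
        System.out.println(isEmptyOrMatches("\\d+", "42")); // true
        System.out.println(isEmptyOrMatches("\\d+", "4x")); // false
        System.out.println(isEmptyOrMatches("a|bc", "bc")); // true
    }
}
```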
Answered by umop
There shouldn't be anything wrong with just "\d+|"
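A quick way to try the trailing empty alternative in Java is sketched below. It assumes the pattern is used with matches(), which tests the whole input; the class and method names are illustrative only.

```java
import java.util.regex.Pattern;

// Hypothetical helper, named for illustration only.
class TrailingAlternative {
    // The branch after | is empty, so it matches the empty string.
    private static final Pattern NUMBER_OR_EMPTY = Pattern.compile("\\d+|");

    static boolean isValid(String input) {
        return NUMBER_OR_EMPTY.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid(""));   // true
        System.out.println(isValid("12")); // true
        System.out.println(isValid("x"));  // false
    }
}
```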

