Original source: http://stackoverflow.com/questions/18523822/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Using regex to match beginning and end of string [Java]
Asked by user2303741
I have a list of files in a folder:
maze1.in.txt
maze2.in.txt
maze3.in.txt
I've used substring to remove the .txt extensions. How do I use regex to match the front and the back of the file name? I need it to match "maze" at the front and ".in" at the back, and the middle must be a digit (can be single or double digit).
I've tried the following:
if (name.matches("name\\din")) {
    // do something
}
It doesn't match anything. What is the correct regex expression to use?
Answered by Hunter McMillen
It is always good to think through what you are trying to do in plain English before you write a regular expression.
You want to match a word (maze), followed by a digit, followed by a literal period (.), followed by another word.
word `\w` matches a word character
digit `\d` matches a single digit
period `\.` matches a literal period
word `\w` matches a word character
Putting it all together into a single string, you get the following (keep in mind the double backslash for the Java escape, and the pluses that repeat the previous match one or more times):
"\\w+\\d\\.\\w+"
The above is the generic case for any file name in the format xxx1.yyy. If you want to match maze and in specifically, you can just add those in as literal strings.
"maze\\d+\\.in"
example: http://ideone.com/rS7tw1
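As a rough sketch of how the generic and the specific patterns behave, the snippet below runs both against a few names. The class name `MazeNameMatch` and its helper methods are made up for illustration; only `String.matches` comes from the Java standard library. Note that `\w` also matches digits, so the generic pattern accepts two-digit names as well.

```java
public class MazeNameMatch {
    // Generic form: word characters, one digit, a literal dot, word characters.
    static boolean matchesGeneric(String name) {
        return name.matches("\\w+\\d\\.\\w+");
    }

    // Specific form: literal "maze", one or more digits, literal ".in".
    static boolean matchesMaze(String name) {
        return name.matches("maze\\d+\\.in");
    }

    // The pattern from the question: literal "name", one digit, literal "in".
    static boolean matchesOriginalAttempt(String name) {
        return name.matches("name\\din");
    }

    public static void main(String[] args) {
        String[] names = { "maze1.in", "maze12.in", "foo7.bar", "maze.in" };
        for (String name : names) {
            System.out.println(name
                    + " generic=" + matchesGeneric(name)
                    + " maze=" + matchesMaze(name)
                    + " original=" + matchesOriginalAttempt(name));
        }
    }
}
```

The original attempt fails for every name here because it looks for the literal text "name" and has no dot between the digit and "in".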
Answered by Alex W
You need regex anchors that tell the regex to start at the beginning (^) and signal the end of the string ($):
^maze[\d]{0,2}\.in$
or in Java:
name.matches("^maze[\\d]{0,2}\\.in$");
Also, your regex did not account for the dot (.), so it would not accept the examples you gave. You need to add \. to the regex to match a literal dot, because . on its own is a special character.
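One detail worth knowing, shown here as a small sketch rather than as part of the answer: in Java, `String.matches` and `Matcher.matches` already require the whole input to match, so ^ and $ are harmless but redundant there. The anchors change the result when you search with `Matcher.find`. The class name `AnchorDemo` and its helper methods are hypothetical; the sketch also shows what an unescaped dot does.

```java
import java.util.regex.Pattern;

public class AnchorDemo {
    private static final Pattern UNANCHORED = Pattern.compile("maze\\d{1,2}\\.in");
    private static final Pattern ANCHORED = Pattern.compile("^maze\\d{1,2}\\.in$");

    // find() looks for the pattern anywhere inside the input.
    static boolean findUnanchored(String s) {
        return UNANCHORED.matcher(s).find();
    }

    // With ^ and $, find() only succeeds when the pattern spans the whole input.
    static boolean findAnchored(String s) {
        return ANCHORED.matcher(s).find();
    }

    // String.matches() must consume the entire input even without anchors.
    static boolean matchesUnanchored(String s) {
        return s.matches("maze\\d{1,2}\\.in");
    }

    // An unescaped dot matches any single character, not just a period.
    static boolean matchesWithUnescapedDot(String s) {
        return s.matches("maze\\d.in");
    }

    public static void main(String[] args) {
        String embedded = "xmaze1.in.txt";
        System.out.println(findUnanchored(embedded));
        System.out.println(findAnchored(embedded));
        System.out.println(matchesUnanchored(embedded));
        System.out.println(matchesWithUnescapedDot("maze1xin"));
    }
}
```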
Answered by very9527
name.matches("^maze[0-9]+\\.in\\.txt$")
Answered by CodeHelp
Your original solution doesn't work because the string "name" is not in your text; it is "maze".
You can try this:
name.matches("maze\\d{1,2}\\.in")
\d{1,2} matches one or two digits.
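The answers on this page use several different quantifiers for the digit part ({1,2}, {0,2}, + and *), and they do not all accept the same names. The sketch below compares them; the class name `QuantifierDemo` and its method names are invented for this illustration.

```java
public class QuantifierDemo {
    // Exactly one or two digits.
    static boolean oneOrTwo(String name) {
        return name.matches("maze\\d{1,2}\\.in");
    }

    // Zero, one, or two digits: also accepts "maze.in".
    static boolean zeroToTwo(String name) {
        return name.matches("maze\\d{0,2}\\.in");
    }

    // One or more digits: accepts three-digit numbers and beyond.
    static boolean oneOrMore(String name) {
        return name.matches("maze\\d+\\.in");
    }

    // Zero or more digits: accepts both "maze.in" and long numbers.
    static boolean zeroOrMore(String name) {
        return name.matches("maze[0-9]*\\.in");
    }

    public static void main(String[] args) {
        String[] names = { "maze.in", "maze7.in", "maze42.in", "maze123.in" };
        for (String name : names) {
            System.out.println(name
                    + " {1,2}=" + oneOrTwo(name)
                    + " {0,2}=" + zeroToTwo(name)
                    + " +=" + oneOrMore(name)
                    + " *=" + zeroOrMore(name));
        }
    }
}
```

Only {1,2} enforces the question's requirement that the middle is a single or double digit; the other forms are looser in one direction or the other.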
Answered by progrenhard
I'm a little confused about what you are asking for in particular.
^(maze[0-9]*\.in)$
This will match maze(any number).in
^(maze[0-9]*\.in)\.txt$
This will match maze(any number).in.txt, and the capture group excludes the .txt, so there is no need to use substring.
The thing I would be wary about right now is the capture groups. I'm not particularly sure what you are doing with this regex. However, I believe explaining capture groups could benefit you.
A capture group is denoted by (). Whatever the parenthesized part of the pattern matches is stored in the array of matched groups, which gives you a way to parse out pieces of the input.
Example: maze1.in.txt
So if you want to capture the entire line minus .txt, I would use this: ^(maze[0-9]*\.in)\.txt$
However, if I wanted to capture things separately, I would do this: ^(maze)([0-9]*)(\.in)\.txt$
This will exclude .txt but store maze, the number, and .in at separate indexes of the group array.
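To make the capture-group idea concrete, here is a minimal sketch using `java.util.regex.Pattern` and `Matcher`. The class name `CaptureGroupDemo` and the helpers `stripTxt` and `splitParts` are hypothetical names chosen for this example; returning null for a non-matching name is likewise just one possible convention.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CaptureGroupDemo {
    // One group holding everything except the trailing ".txt".
    private static final Pattern WHOLE =
            Pattern.compile("^(maze[0-9]*\\.in)\\.txt$");

    // Three groups: the prefix, the number, and the ".in" suffix.
    private static final Pattern PARTS =
            Pattern.compile("^(maze)([0-9]*)(\\.in)\\.txt$");

    // Returns the name without ".txt", or null if the name does not match.
    static String stripTxt(String fileName) {
        Matcher m = WHOLE.matcher(fileName);
        return m.matches() ? m.group(1) : null;
    }

    // Returns {prefix, number, suffix}, or null if the name does not match.
    static String[] splitParts(String fileName) {
        Matcher m = PARTS.matcher(fileName);
        if (!m.matches()) {
            return null;
        }
        // Group 0 is the whole match; numbered groups start at 1.
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        System.out.println(stripTxt("maze1.in.txt"));                       // prints maze1.in
        System.out.println(String.join(" | ", splitParts("maze12.in.txt"))); // prints maze | 12 | .in
    }
}
```

Because [0-9]* allows zero digits, a name such as maze.in.txt would also match, with an empty string in the number group.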