Java: regex to strip all directory names from a path (leave the filename)
Source: http://stackoverflow.com/questions/4838730/
Note: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow and keep the same license.
Asked by tzippy
I want to remove all directory names from a path:
Payload/brownie.app/Info.plist
should become
Info.plist
What regex should I use, or can I use replace() from String in Java? Thanks!
Answered by Arnaud Le Blanc
Try this:
new File("Payload/brownie.app/Info.plist").getName()
This returns the filename without directories.
Example:
String filename = new File("Payload/brownie.app/Info.plist").getName();
System.out.println(filename);
Output:
Info.plist
Answered by dogbane
You don't need a regex. Just find the last slash and use substring:
int index = path.lastIndexOf(File.separatorChar);
String name = path.substring(index+1);
or use:
new File(path).getName();
Answered by mmm
This covers the whole spectrum: directories, and trailing or leading slashes.
None of the other answers here so far do...
public static String extractFilename(String path) {
    // In a Java string literal, \\\\ yields the regex \\, which matches one literal backslash.
    java.util.regex.Pattern p = java.util.regex.Pattern.compile("^[/\\\\]?(?:.+[/\\\\]+?)?(.+?)[/\\\\]?$");
    java.util.regex.Matcher matcher = p.matcher(path);
    if (matcher.find()) {
        return matcher.group(1);
    }
    return null;
}
Usage:
System.out.println(extractFilename("data\\path/to/file/RandomFile.pdf"));
System.out.println(extractFilename("RandomFile.pdf"));
System.out.println(extractFilename("RandomFile.pdf/"));
System.out.println(extractFilename("data\\path/to/file/RandomFile.pdf/"));
System.out.println(extractFilename("/data\\path/to/file/RandomFile.pdf/"));
System.out.println(extractFilename("/data\\path/to/file/RandomFile.pdf"));
System.out.println(extractFilename("/RandomFile.pdf"));
System.out.println(extractFilename("/RandomFile.pdf/"));
System.out.println(extractFilename("/"));
Prints:
RandomFile.pdf
RandomFile.pdf
RandomFile.pdf
RandomFile.pdf
RandomFile.pdf
RandomFile.pdf
RandomFile.pdf
RandomFile.pdf
/
EDIT
Explanation for Uday. It was actually a pretty complicated one, and I am not sure I can justify all of it today, but I will give it a try :)
^[/\\]?(?:.+[/\\]+?)?(.+?)[/\\]?$
0: Entire regex
^
1: Starts with
[/\\]?
2: A forward slash or a backslash (yes, four backslashes in the Java string literal for one, crazy!). Once or not at all, so not required.
(?:.+[/\\]+?)?
3: This step is the complicated one. It is intended to skip everything up to and including the last slash before the file name. It is a non-capturing group, (?:...), in which we look for any character one or more times, followed by a slash.
The slash class may repeat ([/\\]+?), but reluctantly, so it takes as few slashes as possible while still letting the part of the regex explained in 4 match.
This entire piece, though, is not required, because of the ? outside the parentheses. For instance, with "/RandomFile.pdf/" the group matches nothing, and matching continues with 4.
I did find it a bit weird at first that .+, which is greedy, still stops at a slash. That is ordinary backtracking, not a special property of groups or a bug in Java's pattern syntax: .+ first consumes as much as it can, then gives characters back until the rest of the pattern can match.
(.+?)[/\\]?$
4: Since the regex is anchored with ^ and $, it also has to match all the way to the end of the string. The capturing group (.+?) is reluctant, so it takes as few characters as it can and leaves a trailing slash, if there is one, to the optional [/\\]? before $. The name we want is what ends up inside the parentheses. I have chosen to return the root path as the file name if there is no file name but just a slash, since it is also a file name (a directory name).
5: The parentheses form a capturing group, which is what we return at the end.
I hope this clarifies things a bit.
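The backtracking behaviour described in step 3 can be seen in isolation with a much smaller pattern. This is only a sketch: the class name GreedyBacktrackDemo, the helper split, and the simplified pattern ^(.+)/(.+?)$ are made up for illustration and are not part of the answer's regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original answer.
public class GreedyBacktrackDemo {
    // Returns "group1 | group2" for the simplified pattern ^(.+)/(.+?)$,
    // or null when the input contains no forward slash to split on.
    static String split(String input) {
        Matcher m = Pattern.compile("^(.+)/(.+?)$").matcher(input);
        return m.find() ? m.group(1) + " | " + m.group(2) : null;
    }

    public static void main(String[] args) {
        // The greedy (.+) first swallows the whole string, then backs off
        // until the remaining "/(.+?)$" can match, so it ends at the LAST slash.
        System.out.println(split("path/to/file.txt")); // path/to | file.txt
        System.out.println(split("file.txt"));         // null
    }
}
```

Because the greedy quantifier only gives back as much as it must, the first group ends at the last slash rather than the first one, which is why the directory part is skipped as a whole.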
Answered by Johan Sjöberg
Use replaceAll with a regex, String name = directory.replaceAll(".*/", ""); simple as that.
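To make the behaviour of that one-liner concrete, here is a small sketch that runs it on a few inputs. The class name ReplaceAllDemo and the helper stripDirs are hypothetical; only the replaceAll(".*/", "") call comes from the answer.

```java
// Hypothetical demo class wrapping the one-liner from the answer.
public class ReplaceAllDemo {
    static String stripDirs(String path) {
        // The greedy .* runs up to the last '/', so everything through it is removed.
        return path.replaceAll(".*/", "");
    }

    public static void main(String[] args) {
        System.out.println(stripDirs("Payload/brownie.app/Info.plist")); // Info.plist
        System.out.println(stripDirs("Info.plist"));                     // Info.plist
        // Only forward slashes are handled; a backslash path is returned unchanged.
        System.out.println(stripDirs("C:\\dir\\file.txt"));              // C:\dir\file.txt
    }
}
```

Note that a path ending in a slash, such as "dir/", comes back as an empty string with this approach.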
Answered by Joe Fernandez
The previous answers are all simpler than using a full-blown regular expression. If you really want to use one, though, here's a regex pattern you could use: ".*/(.+)"
Pattern p = Pattern.compile(".*/(.+)");
Matcher matcher = p.matcher("Payload/brownie.app/Info.plist");
if ( matcher.find() ) {
System.out.println("result: "+matcher.group(1));
}
As you can see from the other answers, this is more code than is strictly needed, but if you are doing more sophisticated pattern matching and string extraction then regular expressions are a good way to go.
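One thing worth knowing about the pattern ".*/(.+)" is that it requires at least one slash, so find() returns false for a bare filename. A possible way to handle that, sketched here with a hypothetical class SlashPatternDemo and helper fileName, is to fall back to the input itself when nothing matches.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper around the pattern from the answer.
public class SlashPatternDemo {
    private static final Pattern LAST_SEGMENT = Pattern.compile(".*/(.+)");

    static String fileName(String path) {
        Matcher m = LAST_SEGMENT.matcher(path);
        // Assumption: when the path has no '/', treat the whole input as the file name.
        return m.find() ? m.group(1) : path;
    }

    public static void main(String[] args) {
        System.out.println(fileName("Payload/brownie.app/Info.plist")); // Info.plist
        System.out.println(fileName("Info.plist"));                     // Info.plist
    }
}
```

Compiling the pattern once into a constant avoids recompiling it on every call, which matters if the helper is used in a loop.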
Answered by codelahoma
If you're dealing with a file path that's been passed by a browser to a web server, you can't be sure whether it'll be a DOS-style path, a Unix-style path, or just a filename without a path. If you really want a regex, this should do it:
String path = "Payload/brownie.app/Info.plist";
// Replace the whole path with group 2, the part after the last / or \.
String filename = path.replaceFirst("(^.*[/\\\\])?([^/\\\\]*)$", "$2");
This will work whether there's a DOS-style path, a Unix-style path, or no path at all.
It'd be more legible, though, to use substrings as dogbane suggests, adding logic to check for both types of file separator (again, only if you're dealing with multi-platform input).
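The substring approach with a check for both separator types could look roughly like the following. This is a minimal sketch, not code from any of the answers: the class name BaseNameDemo and the helper baseName are hypothetical.

```java
// Hypothetical helper combining dogbane's substring idea with a check for both separators.
public class BaseNameDemo {
    static String baseName(String path) {
        // lastIndexOf returns -1 when a separator is absent, so Math.max picks
        // whichever separator occurs last, or -1 if the path has neither.
        int idx = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        // With idx == -1 this is substring(0), i.e. the whole input.
        return path.substring(idx + 1);
    }

    public static void main(String[] args) {
        System.out.println(baseName("Payload/brownie.app/Info.plist")); // Info.plist
        System.out.println(baseName("C:\\dir\\file.txt"));              // file.txt
        System.out.println(baseName("Info.plist"));                     // Info.plist
    }
}
```

Unlike File.separatorChar, which depends on the platform the JVM runs on, this checks both characters explicitly, so the result is the same on every operating system.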