Original question: http://stackoverflow.com/questions/1466959/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
String replaceAll() vs. Matcher replaceAll() (Performance differences)
Asked by Suvesh Pratapa
Pretty simple question, but this is coming from a C/C++ person getting into the intricacies of Java.
I understand I can fire up jUnit and a few performance tests of my own to get an answer; but I'm just wondering if this is out there.
Are there known difference(s) between String.replaceAll() and Matcher.replaceAll() (On a Matcher Object created from a Regex.Pattern) in terms of performance?
Also, what are the high-level API 'ish differences between the both? (Immutability, Handling NULLs, Handling empty strings, making coffee etc.)
Accepted answer by coobird
The documentation for String.replaceAll has the following to say about calling the method:
An invocation of this method of the form
str.replaceAll(regex, repl)
yields exactly the same result as the expression
Pattern.compile(regex).matcher(str).replaceAll(repl)
Therefore, invoking String.replaceAll and explicitly creating a Pattern and a Matcher can be expected to perform the same.
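To make the quoted equivalence concrete, here is a minimal sketch that runs both forms on the same input. The class name, method names, and sample strings are made up for illustration; only String.replaceAll, Pattern.compile, and Matcher.replaceAll come from the JDK.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative only.
class ReplaceAllEquivalence {
    // Convenience form: the regex is compiled inside the call.
    static String viaString(String input, String regex, String repl) {
        return input.replaceAll(regex, repl);
    }

    // Explicit form quoted in the String.replaceAll documentation.
    static String viaMatcher(String input, String regex, String repl) {
        return Pattern.compile(regex).matcher(input).replaceAll(repl);
    }

    public static void main(String[] args) {
        String input = "a1b22c333";
        // Both calls replace each run of digits with a single '#'.
        System.out.println(viaString(input, "\\d+", "#"));
        System.out.println(viaMatcher(input, "\\d+", "#"));
    }
}
```

Both lines print the same string, which is all the documentation promises: identical results, not any particular performance.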
Edit
As pointed out in the comments, there is no performance difference for a single call to replaceAll on either String or Matcher. However, if you need to call replaceAll many times, holding onto a compiled Pattern should help, because the relatively expensive regular expression compilation then does not have to be repeated on every call.
Answer by Jon Skeet
The implementation of String.replaceAll tells you everything you need to know:
return Pattern.compile(regex).matcher(this).replaceAll(replacement);
(And the docs say the same thing.)
While I haven't checked for caching, I'd certainly expect that compiling a pattern once and keeping a static reference to it would be more efficient than calling Pattern.compile with the same pattern each time. If there's a cache, it'll be a small efficiency saving; if there isn't, it could be a large one.
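A minimal sketch of the "compile once, keep a static reference" idea might look like the following. The class, the constant, and the whitespace regex are assumptions chosen for illustration, not anything from the answer itself.

```java
import java.util.regex.Pattern;

// Hypothetical utility class; names and the regex are illustrative only.
class WhitespaceNormalizer {
    // Compiled once when the class is initialized. Pattern instances are
    // immutable, so a shared static reference is safe.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    static String normalize(String input) {
        // A new Matcher is created per call; only the compilation is reused.
        return WHITESPACE.matcher(input).replaceAll(" ");
    }

    public static void main(String[] args) {
        System.out.println(normalize("a \t b\n\nc"));
    }
}
```

Creating a fresh Matcher on every call keeps the method safe to call from several threads, since the shared state is limited to the immutable Pattern.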
Answer by erickson
The main difference is that if you hold onto the Pattern used to produce the Matcher, you can avoid recompiling the regex every time you use it. Going through String, you don't get the ability to "cache" like this.
If you have a different regex every time, using the String class's replaceAll is fine. If you are applying the same regex to many strings, create one Pattern and reuse it.
Answer by Michael Borgwardt
Source code of String.replaceAll():
public String replaceAll(String regex, String replacement) {
    return Pattern.compile(regex).matcher(this).replaceAll(replacement);
}
It has to compile the pattern first - if you're going to run it many times with the same pattern on short strings, performance will be much better if you reuse one compiled Pattern.
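If you want a rough feel for the "many calls on short strings" case, a crude timing loop like the one below is one way to look at it. This is only a sketch: the class name, regex, input string, and iteration count are arbitrary assumptions, and a naive System.nanoTime() loop is affected by JIT warm-up, so a proper harness such as JMH would be needed for numbers you can trust.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical timing sketch; not a rigorous benchmark.
class ReplaceAllTiming {
    static final String REGEX = "[aeiou]";

    // Goes through String.replaceAll, which compiles REGEX on every call.
    static int recompileEachTime(String[] inputs) {
        int total = 0;
        for (String s : inputs) {
            total += s.replaceAll(REGEX, "*").length();
        }
        return total; // returned so the work cannot be optimized away
    }

    // Compiles REGEX once and reuses the Pattern for every input.
    static int reuseCompiled(String[] inputs) {
        Pattern p = Pattern.compile(REGEX);
        int total = 0;
        for (String s : inputs) {
            total += p.matcher(s).replaceAll("*").length();
        }
        return total;
    }

    public static void main(String[] args) {
        String[] inputs = new String[200_000];
        Arrays.fill(inputs, "a short string");

        long t0 = System.nanoTime();
        int a = recompileEachTime(inputs);
        long t1 = System.nanoTime();
        int b = reuseCompiled(inputs);
        long t2 = System.nanoTime();

        System.out.printf("String.replaceAll: %d ms (checksum %d)%n", (t1 - t0) / 1_000_000, a);
        System.out.printf("reused Pattern:    %d ms (checksum %d)%n", (t2 - t1) / 1_000_000, b);
    }
}
```

The two checksums should match, since both paths perform the same replacement; only the elapsed times are expected to differ, and by how much will depend on the JVM and the regex.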
Answer by Jason S
Immutability / thread safety: compiled Patterns are immutable, Matchers are not. (see Is Java Regex Thread Safe?)
Handling empty strings: replaceAll should handle empty strings gracefully (a pattern that needs at least one character simply won't match an empty input string)
Making coffee, etc.: last I heard, neither String nor Pattern nor Matcher had any API features for that.
edit: as for handling NULLs, the documentation for String and Pattern doesn't explicitly say so, but I suspect they'd throw a NullPointerException since they expect a String.
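The empty-string and null points above can be probed with a small sketch like this one. The class and method names are hypothetical, and the null behaviour shown is what I'd expect from the standard JDK implementation rather than something the documentation spells out for every case.

```java
import java.util.regex.Pattern;

// Hypothetical probe class; names are illustrative only.
class ReplaceAllEdgeCases {
    // Applies the regex to an empty input string.
    static String onEmptyInput(String regex, String repl) {
        return Pattern.compile(regex).matcher("").replaceAll(repl);
    }

    // Reports whether compiling a null regex throws NullPointerException.
    static boolean nullRegexThrows() {
        String regex = null;
        try {
            Pattern.compile(regex);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // "a+" needs at least one character, so nothing is replaced.
        System.out.println("[" + onEmptyInput("a+", "x") + "]");
        // "a*" can match the empty string, so one replacement happens.
        System.out.println("[" + onEmptyInput("a*", "x") + "]");
        System.out.println("null regex throws NPE: " + nullRegexThrows());
    }
}
```

Note that a pattern which can match zero characters, such as a*, does match an empty input once, so "gracefully" does not always mean "unchanged".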
Answer by Alan Moore
The difference is that String.replaceAll() compiles the regex each time it's called. There's no equivalent for .NET's static Regex.Replace() method, which automatically caches the compiled regex. Usually, replaceAll() is something you do only once, but if you're going to be calling it repeatedly with the same regex, especially in a loop, you should create a Pattern object and use the Matcher method.
You can create the Matcher ahead of time, too, and use its reset() method to retarget it for each use:
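// Compile once and create the Matcher against a placeholder input.
Matcher m = Pattern.compile(regex).matcher("");
for (String s : targets) {
    // reset(s) points the existing Matcher at the next target string.
    System.out.println(m.reset(s).replaceAll(repl));
}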
The performance benefit of reusing the Matcher, of course, is nowhere near as great as that of reusing the Pattern.
Answer by Indigenuity
The other answers sufficiently cover the performance part of the OP, but another difference between Matcher::replaceAll and String::replaceAll is also a reason to compile your own Pattern. When you compile a Pattern yourself, there are options like flags to modify how the regex is applied. For example:
Pattern myPattern = Pattern.compile(myRegex, Pattern.CASE_INSENSITIVE);
The Matcher will apply all the flags you set when you call Matcher::replaceAll.
There are other flags you can set as well. Mostly I just wanted to point out that the Pattern and Matcher APIs have lots of options, and that's the primary reason to go beyond the simple String::replaceAll.
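As a small illustration of the flags point, the sketch below compiles a pattern with Pattern.CASE_INSENSITIVE and compares it with a plain String::replaceAll call using the same regex. The class name, the regex, and the sample text are assumptions made up for this example.

```java
import java.util.regex.Pattern;

// Hypothetical example class; names and regex are illustrative only.
class FlaggedReplace {
    // The flag is fixed at compile time and applies to every Matcher
    // created from this Pattern.
    private static final Pattern COLOR =
            Pattern.compile("colou?r", Pattern.CASE_INSENSITIVE);

    static String withFlag(String input) {
        return COLOR.matcher(input).replaceAll("color");
    }

    // Same regex through String.replaceAll, which takes no flags argument.
    static String withoutFlag(String input) {
        return input.replaceAll("colou?r", "color");
    }

    public static void main(String[] args) {
        String text = "Colour, COLOR, colour";
        System.out.println(withFlag(text));
        System.out.println(withoutFlag(text));
    }
}
```

The flagged version rewrites all three words, while the unflagged one only touches the lowercase occurrence. Some flags can also be written inside the regex itself, for example (?i), which is one way to get case-insensitive matching through String::replaceAll when compiling your own Pattern is not an option.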