Commons Lang StringUtils.replace performance vs String.replace
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original question: http://stackoverflow.com/questions/16228992/
Asked by Evgeniy Dorofeev
When I compared the performance of Apache's StringUtils.replace() with String.replace(), I was surprised to find that the former is about 4 times faster. I used Google's Caliper framework to measure performance. Here's my test:
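import com.google.caliper.Runner;
import com.google.caliper.SimpleBenchmark;
// Commons Lang 2.x package; for Commons Lang 3 use org.apache.commons.lang3.StringUtils.
import org.apache.commons.lang.StringUtils;

public class Performance extends SimpleBenchmark {
    String s = "111222111222";

    // JDK String.replace; the lengths are summed and returned so the loop is not optimized away.
    public int timeM1(int n) {
        int res = 0;
        for (int x = 0; x < n; x++) {
            res += s.replace("111", "333").length();
        }
        return res;
    }

    // Commons Lang StringUtils.replace; max = -1 means "replace all occurrences".
    public int timeM2(int n) {
        int res = 0;
        for (int x = 0; x < n; x++) {
            res += StringUtils.replace(s, "111", "333", -1).length();
        }
        return res;
    }

    public static void main(String... args) {
        Runner.main(Performance.class, args);
    }
}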
Output:
 0% Scenario{vm=java, trial=0, benchmark=M1} 9820,93 ns; σ=1053,91 ns @ 10 trials
50% Scenario{vm=java, trial=0, benchmark=M2} 2594,67 ns; σ=58,12 ns @ 10 trials

benchmark   us linear runtime
       M1 9,82 ==============================
       M2 2,59 =======
Why is that? Both methods seem to do the same work, and StringUtils.replace() is even more flexible.
Accepted answer by nhahtdh
From the source code of java.lang.String (footnote 1):
public String replace(CharSequence target, CharSequence replacement) {
    return Pattern
            .compile(target.toString(), Pattern.LITERAL)
            .matcher(this)
            .replaceAll(
                    Matcher.quoteReplacement(replacement.toString()));
}
String.replace(CharSequence target, CharSequence replacement) is implemented with java.util.regex.Pattern. It is therefore not surprising that it is slower than StringUtils.replace(String text, String searchString, String replacement) (footnote 2), which is implemented with indexOf and StringBuffer.
public static String replace(String text, String searchString, String replacement) {
    return replace(text, searchString, replacement, -1);
}

public static String replace(String text, String searchString, String replacement, int max) {
    if (isEmpty(text) || isEmpty(searchString) || replacement == null || max == 0) {
        return text;
    }
    int start = 0;
    int end = text.indexOf(searchString, start);
    if (end == -1) {
        return text;
    }
    int replLength = searchString.length();
    int increase = replacement.length() - replLength;
    increase = (increase < 0 ? 0 : increase);
    increase *= (max < 0 ? 16 : (max > 64 ? 64 : max));
    StringBuffer buf = new StringBuffer(text.length() + increase);
    while (end != -1) {
        buf.append(text.substring(start, end)).append(replacement);
        start = end + replLength;
        if (--max == 0) {
            break;
        }
        end = text.indexOf(searchString, start);
    }
    buf.append(text.substring(start));
    return buf.toString();
}
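To make the regex-based path more concrete, here is a minimal sketch that repeats the body of the JDK 7 String.replace quoted above as a standalone helper. The class name LiteralReplaceDemo and the method name regexLiteralReplace are made up for illustration. The sketch shows what Pattern.LITERAL and Matcher.quoteReplacement buy: regex metacharacters in the target and "$" in the replacement are treated as plain text, at the cost of compiling a Pattern and creating a Matcher on every call.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LiteralReplaceDemo {
    // Hypothetical helper: same steps as the JDK 7 String.replace body quoted above.
    static String regexLiteralReplace(String text, CharSequence target, CharSequence replacement) {
        return Pattern.compile(target.toString(), Pattern.LITERAL) // compiled on every call
                .matcher(text)
                .replaceAll(Matcher.quoteReplacement(replacement.toString()));
    }

    public static void main(String[] args) {
        // "." is matched as a plain dot, not as "any character".
        System.out.println(regexLiteralReplace("a.b.c", ".", "-"));               // a-b-c
        // "$1" is inserted verbatim, not read as a group reference.
        System.out.println(regexLiteralReplace("price: X", "X", "$1"));           // price: $1
        // The input from the question.
        System.out.println(regexLiteralReplace("111222111222", "111", "333"));    // 333222333222
    }
}
```

None of this machinery is needed by an indexOf loop, which is one plausible reason for the gap measured in the question.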
Footnotes
1. The version that I linked to and copied the source code from is JDK 7.
2. The version that I linked to and copied the source code from is commons-lang-2.5.
Answer by Stephen C
Why is that? Both methods seem to do the same work.
You would need to look at the source-code and do some serious investigation with a profiler to get a good (technical) answer to that.
However, one possible explanation is that StringUtils.replace and String.replace have been tuned for different use cases. You are only looking at one case: a fairly small string, and a replacement string that is the same size as the substring being replaced.
Another possible explanation is that the Apache developers simply spent more time on tuning. (And let's not blame the Java developers for that. They have been working under severe staffing constraints for a long time. In the big scheme of things, there are many tasks more important than performance tuning String.replace.)
In fact, looking at the source code, it looks like the Java 7 version just uses the regular expression-based replace under the hood. By contrast, the Apache version goes to considerable lengths to avoid copying. Based on that evidence, I'd expect the performance difference between the two versions to be relatively smaller for large target strings. And I suspect the Java 7 version might even be better in some cases.
(Either non-technical explanation is plausible too, based on the evidence in the code.)
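One way to probe the use-case point in this answer is to vary the input length and the replacement length instead of testing a single 12-character string. The sketch below is only a rough probe under stated assumptions: ReplaceBySize, buildInput, regexReplace, indexOfReplace and nanosPerCall are hypothetical names, indexOfReplace is a simplified stand-in for an indexOf-based loop rather than the Commons Lang code, and a System.nanoTime loop is far less trustworthy than Caliper or JMH. Treat the printed numbers as a hint about the trend, nothing more.

```java
import java.util.function.Supplier;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplaceBySize {
    // Builds "111222" repeated the given number of times.
    static String buildInput(int repeats) {
        StringBuilder sb = new StringBuilder(repeats * 6);
        for (int i = 0; i < repeats; i++) {
            sb.append("111222");
        }
        return sb.toString();
    }

    // Regex-based strategy, as in the JDK 7 String.replace.
    static String regexReplace(String text, String target, String replacement) {
        return Pattern.compile(target, Pattern.LITERAL)
                .matcher(text)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    // Simplified indexOf-based strategy (a stand-in, not the library's code).
    static String indexOfReplace(String text, String target, String replacement) {
        if (text.isEmpty() || target.isEmpty()) {
            return text;
        }
        int start = 0;
        int end = text.indexOf(target, start);
        if (end == -1) {
            return text;
        }
        StringBuilder buf = new StringBuilder(text.length());
        while (end != -1) {
            buf.append(text, start, end).append(replacement);
            start = end + target.length();
            end = text.indexOf(target, start);
        }
        buf.append(text, start, text.length());
        return buf.toString();
    }

    // Crude average time per call; the summed lengths keep the results in use.
    static long nanosPerCall(Supplier<String> call, int iterations) {
        long sink = 0;
        long t0 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += call.get().length();
        }
        long elapsed = System.nanoTime() - t0;
        if (sink < 0) {
            System.out.println(sink); // never true; only prevents dead-code elimination
        }
        return elapsed / iterations;
    }

    public static void main(String[] args) {
        for (int repeats : new int[] {2, 200, 20_000}) {
            String input = buildInput(repeats);
            int iterations = 200_000 / repeats;
            for (String replacement : new String[] {"333", "33333"}) {
                // One untimed pass per strategy as a minimal warm-up.
                nanosPerCall(() -> regexReplace(input, "111", replacement), iterations);
                nanosPerCall(() -> indexOfReplace(input, "111", replacement), iterations);
                long regex = nanosPerCall(() -> regexReplace(input, "111", replacement), iterations);
                long loop = nanosPerCall(() -> indexOfReplace(input, "111", replacement), iterations);
                System.out.printf("len=%d repl=%s regex=%d ns indexOf=%d ns%n",
                        input.length(), replacement, regex, loop);
            }
        }
    }
}
```

If the ratio between the two columns shrinks as the input grows, that would be consistent with the expectation stated above; a proper harness is still needed before drawing conclusions.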
Answer by loukili
Try this one; you'll notice that it performs considerably better than Apache's:
public static String replace(String source, String os, String ns) {
    if (source == null) {
        return null;
    }
    int i = 0;
    // Only allocate anything if the old string occurs at least once.
    if ((i = source.indexOf(os, i)) >= 0) {
        char[] sourceArray = source.toCharArray();
        char[] nsArray = ns.toCharArray();
        int oLength = os.length();
        StringBuilder buf = new StringBuilder(sourceArray.length);
        // Copy everything before the first match, then the replacement.
        buf.append(sourceArray, 0, i).append(nsArray);
        i += oLength;
        int j = i; // start of the part of the source not yet copied
        // Replace all remaining instances of oldString with newString.
        while ((i = source.indexOf(os, i)) > 0) {
            buf.append(sourceArray, j, i - j).append(nsArray);
            i += oLength;
            j = i;
        }
        // Copy the tail after the last match.
        buf.append(sourceArray, j, sourceArray.length - j);
        source = buf.toString();
        buf.setLength(0);
    }
    return source;
}
Answer by qxo
In my test with JMH (https://github.com/qxo/Benchmark4StringReplace), the best is loukili's way:
java -jar target/benchmarks.jar StringReplaceBenchmark -wi 3 -i 6 -f 1 -tu ms

Benchmark                                      Mode  Cnt     Score     Error   Units
StringReplaceBenchmark.test4String            thrpt    6  1255.017 ± 230.012  ops/ms
StringReplaceBenchmark.test4StringUtils       thrpt    6  4068.229 ±  67.708  ops/ms
StringReplaceBenchmark.test4fast              thrpt    6  4821.035 ±  97.790  ops/ms
StringReplaceBenchmark.test4lang3StringUtils  thrpt    6  3186.007 ± 102.786  ops/ms
Answer by Tagir Valeev
In modern Java, this is not the case anymore. String.replace was improved in Java 9, moving from regular expressions to StringBuilder, and was improved even more in Java 13, moving to direct allocation of the target byte[] array with its exact size calculated in advance. Thanks to the internal JDK features it uses, such as the ability to allocate an uninitialized array, to access the String coder, and to use a private String constructor that avoids copying, it's unlikely that the current implementation can be beaten by a third-party implementation.
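The "exact size in advance" idea can be sketched with public APIs only. This is an illustration of the two-pass technique, not the JDK's code: the class name ExactSizeReplace and the method replaceExact are hypothetical, the sketch works on char[] rather than the internal byte[] plus coder, and the public String(char[]) constructor still copies the array, which is exactly the copy the JDK avoids with its private constructor. It also returns the input unchanged for an empty target, which differs from String.replace.

```java
class ExactSizeReplace {
    // Hypothetical two-pass replace: count matches, size the output exactly, then fill it.
    static String replaceExact(String text, String target, String replacement) {
        int targetLen = target.length();
        if (targetLen == 0) {
            return text; // simplification; String.replace handles empty targets differently
        }

        // Pass 1: count non-overlapping matches.
        int count = 0;
        for (int i = text.indexOf(target); i != -1; i = text.indexOf(target, i + targetLen)) {
            count++;
        }
        if (count == 0) {
            return text;
        }

        // Exact output length, with overflow checks.
        int replLen = replacement.length();
        int resultLen = Math.addExact(text.length(),
                Math.multiplyExact(count, replLen - targetLen));
        char[] out = new char[resultLen];

        // Pass 2: copy unmatched stretches and replacements straight into the array.
        int src = 0;
        int dst = 0;
        for (int i = text.indexOf(target); i != -1; i = text.indexOf(target, i + targetLen)) {
            text.getChars(src, i, out, dst);
            dst += i - src;
            replacement.getChars(0, replLen, out, dst);
            dst += replLen;
            src = i + targetLen;
        }
        text.getChars(src, text.length(), out, dst);

        return new String(out); // public constructor: one extra copy remains
    }

    public static void main(String[] args) {
        System.out.println(replaceExact("111222111222", "111", "333")); // 333222333222
    }
}
```

The design trade-off is scanning the input twice in exchange for a single allocation with no resizing, which tends to pay off when the output is not tiny.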
Here are my benchmarking results for your test using JDK 8, JDK 9 and JDK 13 (caliper:0.5-rc1; commons-lang3:3.9):
Java 8 (4x slower indeed):
0% Scenario{vm=java, trial=0, benchmark=M1} 291.42 ns; σ=6.56 ns @ 10 trials
50% Scenario{vm=java, trial=0, benchmark=M2} 70.34 ns; σ=0.15 ns @ 3 trials
benchmark ns linear runtime
M1 291.4 ==============================
M2 70.3 =======
Java 9 (almost equal performance):
0% Scenario{vm=java, trial=0, benchmark=M2} 99,15 ns; σ=8,34 ns @ 10 trials
50% Scenario{vm=java, trial=0, benchmark=M1} 103,43 ns; σ=9,01 ns @ 10 trials
benchmark ns linear runtime
M2 99,1 ============================
M1 103,4 ==============================
Java 13 (the standard method takes about 37% less time per call):
0% Scenario{vm=java, trial=0, benchmark=M2} 91,64 ns; σ=5,12 ns @ 10 trials
50% Scenario{vm=java, trial=0, benchmark=M1} 57,38 ns; σ=2,51 ns @ 10 trials
benchmark ns linear runtime
M2 91,6 ==============================
M1 57,4 ==================