Java Commons Lang StringUtils.replace performance vs String.replace

Question

Asked by Evgeniy Dorofeev

When I compared the performance of Apache's StringUtils.replace() vs String.replace(), I was surprised to find that the former is about 4 times faster. I used Google's Caliper framework to measure performance. Here's my test:

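import com.google.caliper.Runner;
import com.google.caliper.SimpleBenchmark;
// Commons Lang 2.x package; for Commons Lang 3 use org.apache.commons.lang3.StringUtils
import org.apache.commons.lang.StringUtils;

public class Performance extends SimpleBenchmark {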
    String s = "111222111222";

    public int timeM1(int n) {
        int res = 0;
        for (int x = 0; x < n; x++) {
            res += s.replace("111", "333").length();
        }
        return res;
    }

    public int timeM2(int n) {
        int res = 0;
        for (int x = 0; x < n; x++) {
            res += StringUtils.replace(s, "111", "333", -1).length();
        }
        return res;
    }

    public static void main(String... args) {
        Runner.main(Performance.class, args);
    }
}

output

 0% Scenario{vm=java, trial=0, benchmark=M1} 9820,93 ns; σ=1053,91 ns @ 10 trials
50% Scenario{vm=java, trial=0, benchmark=M2} 2594,67 ns; σ=58,12 ns @ 10 trials

benchmark   us linear runtime
       M1 9,82 ==============================
       M2 2,59 =======

Why is that? Both methods seem to do the same work, and StringUtils.replace() is even more flexible.

Answer 1

Accepted answer by nhahtdh

From the source code of java.lang.String¹:

public String replace(CharSequence target, CharSequence replacement) {
   return Pattern
            .compile(target.toString(), Pattern.LITERAL)
            .matcher(this)
            .replaceAll(
                    Matcher.quoteReplacement(replacement.toString()));
}

String.replace(CharSequence target, CharSequence replacement) is implemented with java.util.regex.Pattern, so it is not surprising that it is slower than StringUtils.replace(String text, String searchString, String replacement)², which is implemented with indexOf and StringBuffer.

public static String replace(String text, String searchString, String replacement) {
    return replace(text, searchString, replacement, -1);
}

public static String replace(String text, String searchString, String replacement, int max) {
    if (isEmpty(text) || isEmpty(searchString) || replacement == null || max == 0) {
        return text;
    }
    int start = 0;
    int end = text.indexOf(searchString, start);
    if (end == -1) {
        return text;
    }
    int replLength = searchString.length();
    int increase = replacement.length() - replLength;
    increase = (increase < 0 ? 0 : increase);
    increase *= (max < 0 ? 16 : (max > 64 ? 64 : max));
    StringBuffer buf = new StringBuffer(text.length() + increase);
    while (end != -1) {
        buf.append(text.substring(start, end)).append(replacement);
        start = end + replLength;
        if (--max == 0) {
            break;
        }
        end = text.indexOf(searchString, start);
    }
    buf.append(text.substring(start));
    return buf.toString();
}
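
To see that the two strategies produce the same result, here is a minimal sketch that places a literal-pattern regex replacement (the approach in the JDK 7 source above) next to a simplified indexOf loop (the idea behind the Commons Lang 2.5 code, without its max parameter or buffer-size heuristic). The class and method names are hypothetical, and StringBuilder stands in for StringBuffer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical names; a sketch for comparison, not either library's actual code.
final class ReplaceStrategies {

    // Regex route: compile the target as a literal pattern, then replace all matches.
    static String viaRegex(String text, String target, String replacement) {
        return Pattern.compile(target, Pattern.LITERAL)
                .matcher(text)
                .replaceAll(Matcher.quoteReplacement(replacement));
    }

    // indexOf route: scan for matches and copy the pieces between them.
    static String viaIndexOf(String text, String target, String replacement) {
        if (text.isEmpty() || target.isEmpty()) {
            return text;
        }
        int end = text.indexOf(target);
        if (end == -1) {
            return text; // nothing to replace, so nothing is allocated
        }
        StringBuilder buf = new StringBuilder(text.length());
        int start = 0;
        while (end != -1) {
            buf.append(text, start, end).append(replacement);
            start = end + target.length();
            end = text.indexOf(target, start);
        }
        buf.append(text, start, text.length());
        return buf.toString();
    }

    public static void main(String[] args) {
        String s = "111222111222";
        System.out.println(viaRegex(s, "111", "333"));   // 333222333222
        System.out.println(viaIndexOf(s, "111", "333")); // 333222333222
    }
}
```

Both calls return the same string. The difference lies in the work done to get there: the regex route compiles a Pattern and creates a Matcher on every call, while the indexOf route only searches and copies.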

Footnote

¹ The version that I linked to and copied the source code from is JDK 7.

² The version that I linked to and copied the source code from is commons-lang 2.5.

Answer 2

Answer by Stephen C

Why is that? Both methods seem to do the same work.

You would need to look at the source-code and do some serious investigation with a profiler to get a good (technical) answer to that.

However, one possible explanation is that StringUtils.replace and String.replace have been tuned for different use cases. You are only looking at one case: a pretty small string, and a replacement string that is the same size as the substring being replaced.

Another possible explanation is that the Apache developers simply spent more time on tuning. (And let's not blame the Java developers for that. They have been working under severe staffing constraints for a long time. In the big scheme of things, there are many tasks more important than performance-tuning String.replace.)

In fact, looking at the source code, it looks like the Java 7 version just uses the regular-expression-based replace under the hood. By contrast, the Apache version goes to considerable lengths to avoid copying. Based on that evidence, I'd expect the performance difference between the two versions to be relatively smaller for large target strings. And I suspect the Java 7 version might even be better in some cases.

(Either non-technical explanation is plausible too, based on the evidence in the code.)
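
One way to check the "different use cases" idea is to vary the input size and the replacement length rather than measuring a single short string. The following is only a rough sketch under stated assumptions: the class name, the input shape, and the chosen sizes are hypothetical, the timing uses System.nanoTime and is far cruder than Caliper or JMH, and only String.replace is timed so that the code runs without extra dependencies.

```java
// Hypothetical harness; crude timing, not a substitute for Caliper or JMH.
final class ReplaceSizeProbe {

    // Written once per measurement so the JIT cannot discard the replace calls.
    static volatile long sink;

    // Builds a string made of `repeats` copies of "111222".
    static String buildInput(int repeats) {
        StringBuilder sb = new StringBuilder(repeats * 6);
        for (int i = 0; i < repeats; i++) {
            sb.append("111222");
        }
        return sb.toString();
    }

    // Average nanoseconds per String.replace call over `iterations` calls.
    static double averageNanos(String input, String target, String replacement, int iterations) {
        long total = 0;
        long begin = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            total += input.replace(target, replacement).length();
        }
        long elapsed = System.nanoTime() - begin;
        sink = total;
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        int[] repeatCounts = {2, 200, 20_000};          // small, medium, large inputs
        String[] replacements = {"333", "3", "333333"}; // same size, shorter, longer
        for (int repeats : repeatCounts) {
            String input = buildInput(repeats);
            for (String replacement : replacements) {
                averageNanos(input, "111", replacement, 1_000); // warm-up pass, result ignored
                double avg = averageNanos(input, "111", replacement, 1_000);
                System.out.printf("length=%d replacement=%s avg=%.1f ns%n",
                        input.length(), replacement, avg);
            }
        }
    }
}
```

With Commons Lang on the classpath, a StringUtils.replace call can be timed the same way by adding a second method alongside averageNanos.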

Answer 3

Answer by loukili

Try this one; you'll notice that it performs much better than Apache's:

public static String replace (String source, String os, String ns) {
    if (source == null) {
        return null;
    }
    int i = 0;
    if ((i = source.indexOf(os, i)) >= 0) {
        char[] sourceArray = source.toCharArray();
        char[] nsArray = ns.toCharArray();
        int oLength = os.length();
        StringBuilder buf = new StringBuilder (sourceArray.length);
        buf.append (sourceArray, 0, i).append(nsArray);
        i += oLength;
        int j = i;
        // Replace all remaining instances of oldString with newString.
        while ((i = source.indexOf(os, i)) > 0) {
            buf.append (sourceArray, j, i - j).append(nsArray);
            i += oLength;
            j = i;
        }
        buf.append (sourceArray, j, sourceArray.length - j);
        source = buf.toString();
        buf.setLength (0);
    }
    return source;
}

Answer 4

Answer by qxo

In my test with JMH (https://github.com/qxo/Benchmark4StringReplace), the best is loukili's way:

java -jar target/benchmarks.jar StringReplaceBenchmark -wi 3 -i 6 -f 1 -tu ms

Benchmark                                      Mode  Cnt     Score     Error   Units
StringReplaceBenchmark.test4String            thrpt    6  1255.017 ± 230.012  ops/ms
StringReplaceBenchmark.test4StringUtils       thrpt    6  4068.229 ±  67.708  ops/ms
StringReplaceBenchmark.test4fast              thrpt    6  4821.035 ±  97.790  ops/ms
StringReplaceBenchmark.test4lang3StringUtils  thrpt    6  3186.007 ± 102.786  ops/ms

Answer 5

Answer by Tagir Valeev

In modern Java, this is not the case anymore. String.replace was improved in Java 9, moving from a regular expression to StringBuilder, and was improved even more in Java 13, moving to direct allocation of the target byte[] array with its exact size calculated in advance. Thanks to the internal JDK features used, such as the ability to allocate an uninitialized array, the ability to access the String coder, and the ability to use a private String constructor that avoids copying, it's unlikely that the current implementation can be beaten by a third-party implementation.
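
The "exact size in advance" idea can be illustrated with public API only. The sketch below is an assumption-laden approximation, not the JDK's code: it counts the matches first, allocates a char[] of exactly the final length, and then fills it. The class name is hypothetical, and none of the internal features mentioned above (uninitialized arrays, the String coder, the non-copying constructor) are available to it.

```java
// Hypothetical sketch; the JDK works on its internal byte[] representation,
// while this version uses char[] and public API only.
final class ExactSizeReplace {

    static String replace(String text, String target, String replacement) {
        if (target.isEmpty()) {
            return text; // simplification: String.replace treats an empty target differently
        }
        // Pass 1: count non-overlapping matches.
        int count = 0;
        for (int i = text.indexOf(target); i != -1; i = text.indexOf(target, i + target.length())) {
            count++;
        }
        if (count == 0) {
            return text;
        }
        // The result length is known exactly, so the array never has to grow.
        int resultLength = text.length() + count * (replacement.length() - target.length());
        char[] out = new char[resultLength];
        // Pass 2: copy the unchanged pieces and the replacements into place.
        int src = 0;
        int dst = 0;
        for (int i = text.indexOf(target); i != -1; i = text.indexOf(target, src)) {
            text.getChars(src, i, out, dst);
            dst += i - src;
            replacement.getChars(0, replacement.length(), out, dst);
            dst += replacement.length();
            src = i + target.length();
        }
        text.getChars(src, text.length(), out, dst);
        return new String(out); // this public constructor copies the array once more
    }

    public static void main(String[] args) {
        System.out.println(replace("111222111222", "111", "333")); // 333222333222
    }
}
```

The final new String(out) copy is the step that third-party code cannot avoid, which is part of why the answer expects the built-in method to stay ahead.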

Here are my benchmarking results for your test using JDK 8, JDK 9 and JDK 13 (caliper:0.5-rc1; commons-lang3:3.9)

Java 8 (4x slower indeed):

 0% Scenario{vm=java, trial=0, benchmark=M1} 291.42 ns; σ=6.56 ns @ 10 trials
50% Scenario{vm=java, trial=0, benchmark=M2} 70.34 ns; σ=0.15 ns @ 3 trials

benchmark    ns linear runtime
       M1 291.4 ==============================
       M2  70.3 =======

Java 9 (almost equal performance):

 0% Scenario{vm=java, trial=0, benchmark=M2} 99,15 ns; σ=8,34 ns @ 10 trials
50% Scenario{vm=java, trial=0, benchmark=M1} 103,43 ns; σ=9,01 ns @ 10 trials

benchmark    ns linear runtime
       M2  99,1 ============================
       M1 103,4 ==============================

Java 13 (standard method is 38% faster):

 0% Scenario{vm=java, trial=0, benchmark=M2} 91,64 ns; σ=5,12 ns @ 10 trials
50% Scenario{vm=java, trial=0, benchmark=M1} 57,38 ns; σ=2,51 ns @ 10 trials

benchmark   ns linear runtime
       M2 91,6 ==============================
       M1 57,4 ==================
