Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/513600/

Time: 2020-08-11 15:37:22  Source: igfitidea

Should I use Java's String.format() if performance is important?

Tags: java, string, performance, string-formatting, micro-optimization

Asked by Air

We have to build Strings all the time for log output and so on. Over the JDK versions we have learned when to use StringBuffer (many appends, thread-safe) and StringBuilder (many appends, non-thread-safe).

What's the advice on using String.format()? Is it efficient, or are we forced to stick with concatenation for one-liners where performance is important?

For example, the ugly old style:

String s = "What do you get if you multiply " + varSix + " by " + varNine + "?";

vs. the tidy new style (String.format, which is possibly slower):

String s = String.format("What do you get if you multiply %d by %d?", varSix, varNine);

Note: my specific use case is the hundreds of 'one-liner' log strings throughout my code. They don't involve a loop, so StringBuilder is too heavyweight. I'm interested in String.format() specifically.

Accepted answer by hhafez

I wrote a small class to test which of the two performs better, and + comes out ahead of format by a factor of 5 to 6. Try it yourself:

public class StringTest {

    public static void main(String[] args) {
        int i = 0;
        long prev_time = System.currentTimeMillis();
        long time;

        // First loop: 100,000 concatenations with the + operator.
        for (i = 0; i < 100000; i++) {
            String s = "Blah" + i + "Blah";
        }
        time = System.currentTimeMillis() - prev_time;

        System.out.println("Time after for loop " + time);

        // Second loop: 100,000 calls to String.format.
        prev_time = System.currentTimeMillis();
        for (i = 0; i < 100000; i++) {
            String s = String.format("Blah %d Blah", i);
        }
        time = System.currentTimeMillis() - prev_time;
        System.out.println("Time after for loop " + time);
    }
}

Running the above for different N shows that both behave linearly, but String.format is 5-30 times slower.

The reason is that in the current implementation String.format first parses the input with regular expressions and then fills in the parameters. Concatenation with plus, on the other hand, gets optimized by javac (not by the JIT) and uses StringBuilder.append directly.

[Figure: Runtime comparison]

Answer by Orion Adrian

Generally you should use String.format because it's relatively fast and it supports globalization (assuming you're actually trying to write something that is read by the user). It also makes it easier to globalize if you're trying to translate one string versus 3 or more per statement (especially for languages that have drastically different grammatical structures).

Now, if you never plan on translating anything, then either rely on Java's built-in conversion of + operators into StringBuilder, or use Java's StringBuilder explicitly.

Answer by Yes - that Jake.

The answer to this depends very much on how your specific Java compiler optimizes the bytecode it generates. Strings are immutable and, theoretically, each "+" operation can create a new one. But, your compiler almost certainly optimizes away interim steps in building long strings. It's entirely possible that both lines of code above generate the exact same bytecode.

The only real way to know is to test the code iteratively in your current environment. Write a quick-and-dirty (QD) app that concatenates strings both ways in a loop and see how their timings compare.
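
A minimal sketch of such a quick-and-dirty timing app, under a few assumptions of mine: the class name ConcatTiming, the iteration count, and the warm-up pass are illustrative choices, not something from the answer. It sums the result lengths into a field so the JIT cannot simply drop the work, and it is not a substitute for a proper benchmark harness.

```java
import java.util.function.IntFunction;

public class ConcatTiming {

    // Accumulates result lengths so the string building is not dead code.
    static long sink;

    // Runs the strategy `iterations` times and returns elapsed nanoseconds.
    static long time(IntFunction<String> strategy, int iterations) {
        long start = System.nanoTime();
        long total = 0;
        for (int i = 0; i < iterations; i++) {
            total += strategy.apply(i).length();
        }
        sink += total;
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 100_000; // arbitrary iteration count for illustration

        IntFunction<String> plus = i -> "Blah" + i + "Blah";
        IntFunction<String> format = i -> String.format("Blah%dBlah", i);
        IntFunction<String> builder =
                i -> new StringBuilder("Blah").append(i).append("Blah").toString();

        // Warm-up pass so the timed pass is less dominated by JIT compilation.
        time(plus, n);
        time(format, n);
        time(builder, n);

        System.out.println("+ operator (ms):    " + time(plus, n) / 1_000_000);
        System.out.println("String.format (ms): " + time(format, n) / 1_000_000);
        System.out.println("StringBuilder (ms): " + time(builder, n) / 1_000_000);
    }
}
```

The absolute numbers will vary with the JDK version, the JVM flags and the machine, so only the ratios observed in your own environment are meaningful.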

Answer by cletus

In your example, performance probably isn't too different, but there are other issues to consider, namely memory fragmentation. Every concatenate operation creates a new string, even if it's temporary (it takes time to GC it and it's more work). String.format() is just more readable and it involves less fragmentation.

Also, if you're using a particular format a lot, don't forget you can use the Formatter class directly (all String.format() does is instantiate a one-use Formatter instance).
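
As a rough illustration of using Formatter directly, the sketch below keeps one java.util.Formatter bound to a StringBuilder and resets the buffer between calls. The wrapper class ReusableFormatter is my own hypothetical name. Note the assumptions: this only avoids allocating a new Formatter per call (the format string is still parsed each time), and the wrapper is not thread-safe.

```java
import java.util.Formatter;
import java.util.Locale;

public class ReusableFormatter {

    // One buffer and one Formatter shared across calls (not thread-safe).
    private final StringBuilder buffer = new StringBuilder();
    private final Formatter formatter = new Formatter(buffer, Locale.ROOT);

    String format(String pattern, Object... args) {
        buffer.setLength(0);             // discard the previous result
        formatter.format(pattern, args); // appends into `buffer`
        return buffer.toString();
    }

    public static void main(String[] args) {
        ReusableFormatter f = new ReusableFormatter();
        System.out.println(f.format("What do you get if you multiply %d by %d?", 6, 9));
        System.out.println(f.format("Blah %d Blah", 42));
    }
}
```

Whether this is measurably faster than String.format() depends on the JDK, so it is worth timing before adopting it.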

Also, something else you should be aware of: be careful of using substring(). For example:

String getSmallString() {
  String largeString = loadLargeString(); // hypothetical helper: load from file; say 2M in size
  return largeString.substring(100, 300);
}

That large string is still in memory because that's just how Java substrings work (at least in JDKs before 7u6, where a substring shared its parent's backing array; later versions copy the characters). A better version is:

  return new String(largeString.substring(100, 300));

or

  return String.format("%s", largeString.substring(100, 300));

The second form is probably more useful if you're doing other stuff at the same time.

Answer by the.duckman

I just modified hhafez's test to include StringBuilder. StringBuilder is 33 times faster than String.format using jdk 1.6.0_10 client on XP. Using the -server switch lowers the factor to 20.

public class StringTest {

   public static void main( String[] args ) {
      test();
      test();
   }

   private static void test() {
      int i = 0;
      long prev_time = System.currentTimeMillis();
      long time;

      for ( i = 0; i < 1000000; i++ ) {
         String s = "Blah" + i + "Blah";
      }
      time = System.currentTimeMillis() - prev_time;

      System.out.println("Time after for loop " + time);

      prev_time = System.currentTimeMillis();
      for ( i = 0; i < 1000000; i++ ) {
         String s = String.format("Blah %d Blah", i);
      }
      time = System.currentTimeMillis() - prev_time;
      System.out.println("Time after for loop " + time);

      prev_time = System.currentTimeMillis();
      for ( i = 0; i < 1000000; i++ ) {
         new StringBuilder("Blah").append(i).append("Blah");
      }
      time = System.currentTimeMillis() - prev_time;
      System.out.println("Time after for loop " + time);
   }
}

While this might sound drastic, I consider it to be relevant only in rare cases, because the absolute numbers are pretty low: 4 s for 1 million simple String.format calls is sort of ok - as long as I use them for logging or the like.

Update: As pointed out by sjbotha in the comments, the StringBuilder test is invalid, since it is missing a final .toString().

The correct speed-up factor from String.format(.) to StringBuilder is 23 on my machine (16 with the -server switch).

Answer by dw.mackie

To expand/correct on the first answer above, it's not translation that String.format would help with, actually.
What String.format will help with is printing a date/time (or a numeric format, etc.), where there are localization (l10n) differences (i.e., some countries will print 04Feb2009 and others will print Feb042009).
With translation, you're just talking about moving any externalizable strings (like error messages and what-not) into a property bundle so that you can use the right bundle for the right language, using ResourceBundle and MessageFormat.
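
To make the distinction concrete, here is a small sketch; the class and method names are hypothetical, and the message pattern is hard-coded where a real application would load it from a ResourceBundle. String.format with an explicit Locale handles locale-sensitive number formatting, while MessageFormat fills placeholders in an externalized message pattern.

```java
import java.text.MessageFormat;
import java.util.Locale;

public class LocaleFormatting {

    // Locale-sensitive number formatting: the ',' flag inserts the
    // grouping separator used by the given locale.
    static String grouped(Locale locale, int value) {
        return String.format(locale, "%,d", value);
    }

    // Placeholder substitution in a message pattern. In a real application
    // the pattern would come from a ResourceBundle for the user's language.
    static String greeting(String pattern, String name) {
        return MessageFormat.format(pattern, name);
    }

    public static void main(String[] args) {
        System.out.println(grouped(Locale.US, 1234567));      // 1,234,567
        System.out.println(grouped(Locale.GERMANY, 1234567)); // German grouping separator
        System.out.println(greeting("Hello, {0}!", "world")); // Hello, world!
    }
}
```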

Looking at all the above, I'd say that performance-wise, String.format vs. plain concatenation comes down to what you prefer. If you prefer looking at calls to .format over concatenation, then by all means, go with that.
After all, code is read a lot more than it's written.

Answer by Itamar

I took hhafez's code and added a memory test:

private static void test() {
    Runtime runtime = Runtime.getRuntime();
    long memory;
    ...
    memory = runtime.freeMemory();
    // for loop code
    memory = memory-runtime.freeMemory();

I ran this separately for each approach (the '+' operator, String.format and StringBuilder with a call to toString()), so that the memory used would not be affected by the other approaches. I added more concatenations, making the string "Blah" + i + "Blah" + i + "Blah" + i + "Blah".

The results are as follows (average of 5 runs each):

Approach        Time (ms)   Memory allocated (long)
'+' operator    747         320,504
String.format   16484       373,312
StringBuilder   769         57,344

We can see that String '+' and StringBuilder are practically identical time-wise, but StringBuilder is much more efficient in memory use. This is very important when we have many log calls (or any other statements involving strings) in a time interval short enough that the garbage collector won't get to clean up the many string instances resulting from the '+' operator.

And a note, BTW: don't forget to check the logging level before constructing the message.
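
A minimal sketch of that check, assuming java.util.logging (the answer does not say which logging framework is in use; the class name and the buildCount counter are only there for illustration). The first method guards explicitly with isLoggable; the second uses the Supplier overload available since Java 8, which defers building the message until the logger knows it is needed.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class GuardedLogging {

    static final Logger LOG = Logger.getLogger(GuardedLogging.class.getName());

    // Counts how often the (comparatively expensive) message is actually built.
    static int buildCount = 0;

    static String expensiveMessage(int six, int nine) {
        buildCount++;
        return String.format("What do you get if you multiply %d by %d?", six, nine);
    }

    // Explicit guard: the message is only formatted if FINE is enabled.
    static void logGuarded(int six, int nine) {
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(expensiveMessage(six, nine));
        }
    }

    // Supplier overload: the logger calls the lambda only if FINE is enabled.
    static void logLazily(int six, int nine) {
        LOG.fine(() -> expensiveMessage(six, nine));
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO); // FINE is disabled at this level
        logGuarded(6, 9);
        logLazily(6, 9);
        System.out.println("Messages built: " + buildCount);
    }
}
```

With the guard in place, the cost of String.format versus concatenation only matters for messages that are really written.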

Conclusions:

  1. I'll keep on using StringBuilder.
  2. I have too much time or too little life.

Answer by Raphaël

Your old ugly style is automatically compiled by javac 1.6 as:

StringBuilder sb = new StringBuilder("What do you get if you multiply ");
sb.append(varSix);
sb.append(" by ");
sb.append(varNine);
sb.append("?");
String s =  sb.toString();

So there is absolutely no difference between this and using a StringBuilder.

String.format is a lot more heavyweight, since it creates a new Formatter, parses your input format string, creates a StringBuilder, appends everything to it and calls toString().

Answer by Dustin Getz

Java's String.format works like so:

  1. It parses the format string, exploding it into a list of format chunks.
  2. It iterates over the format chunks, rendering into a StringBuilder, which is basically an array that resizes itself as necessary by copying into a new array. This is necessary because we don't yet know how large the final String must be.
  3. StringBuilder.toString() copies its internal buffer into a new String.

If the final destination for this data is a stream (e.g. rendering a webpage or writing to a file), you can assemble the format chunks directly into your stream:

new PrintStream(outputStream, autoFlush, encoding).format("hello %s", "world");
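
A self-contained version of the same idea might look like the sketch below. The ByteArrayOutputStream stands in for whatever stream you are really writing to, and the class and method names are mine. Note that PrintStream.format takes printf-style specifiers such as %s, not MessageFormat-style {0} placeholders.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

public class StreamFormatting {

    // Formats straight into an output stream instead of building an
    // intermediate String with String.format first.
    static String render(String name) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream(); // stand-in for a file or socket stream
        try (PrintStream out = new PrintStream(sink, true, "UTF-8")) {
            out.format("hello %s", name);
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be supported by every JVM.
            throw new IllegalStateException(e);
        }
        return new String(sink.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(render("world")); // hello world
    }
}
```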

I speculate that the optimizer will optimize away the format string processing. If so, you're left with amortized performance equivalent to manually unrolling your String.format into a StringBuilder.

Answer by ANON

Here is a modified version of hhafez's entry. It includes a StringBuilder option.

public class BLA
{
public static final String BLAH = "Blah ";
public static final String BLAH2 = " Blah";
public static final String BLAH3 = "Blah %d Blah";


public static void main(String[] args) {
    int i = 0;
    long prev_time = System.currentTimeMillis();
    long time;
    int numLoops = 1000000;

    for( i = 0; i< numLoops; i++){
        String s = BLAH + i + BLAH2;
    }
    time = System.currentTimeMillis() - prev_time;

    System.out.println("Time after for loop " + time);

    prev_time = System.currentTimeMillis();
    for( i = 0; i<numLoops; i++){
        String s = String.format(BLAH3, i);
    }
    time = System.currentTimeMillis() - prev_time;
    System.out.println("Time after for loop " + time);

    prev_time = System.currentTimeMillis();
    for( i = 0; i<numLoops; i++){
        StringBuilder sb = new StringBuilder();
        sb.append(BLAH);
        sb.append(i);
        sb.append(BLAH2);
        String s = sb.toString();
    }
    time = System.currentTimeMillis() - prev_time;
    System.out.println("Time after for loop " + time);

}

}

Time after for loop 391
Time after for loop 4163
Time after for loop 227