Original source: http://stackoverflow.com/questions/242438/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Is it better to reuse a StringBuilder in a loop?
Asked by Pier Luigi
I have a performance-related question about the use of StringBuilder. In a very long loop I'm manipulating a StringBuilder and passing it to another method like this:
for (loop condition) {
    StringBuilder sb = new StringBuilder();
    sb.append("some string");
    . . .
    sb.append(anotherString);
    . . .
    passToMethod(sb.toString());
}
Is instantiating a StringBuilder at every loop cycle a good solution? Or is calling delete instead better, like the following?
StringBuilder sb = new StringBuilder();
for (loop condition) {
    sb.delete(0, sb.length());
    sb.append("some string");
    . . .
    sb.append(anotherString);
    . . .
    passToMethod(sb.toString());
}
Answer by Stu Thompson
The modern JVM is really smart about stuff like this. I would not second-guess it and do something hacky that is less maintainable/readable... unless you do proper benchmarks with production data that validate a non-trivial performance improvement (and document it ;)
Answer by Peter
In the philosophy of writing solid code, it's always better to put your StringBuilder inside your loop. This way it doesn't go outside the code it's intended for.
Secondly, the biggest improvement in StringBuilder comes from giving it an initial size, so that it doesn't have to grow while the loop runs:
for (loop condition) {
    StringBuilder sb = new StringBuilder(4096);
}
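To make the point about the initial size concrete, here is a minimal sketch (the class and method names are made up for illustration) that counts how often a builder's capacity changes while characters are appended. It only relies on StringBuilder.capacity(); the exact growth steps depend on the JDK implementation, so the code counts them rather than assuming them.

```java
final class CapacityDemo {
    // Appends `chars` characters one at a time and counts how often the
    // builder had to grow (observed as a change in capacity()).
    static int countResizes(StringBuilder sb, int chars) {
        int resizes = 0;
        int lastCapacity = sb.capacity();
        for (int i = 0; i < chars; i++) {
            sb.append('x');
            if (sb.capacity() != lastCapacity) {
                resizes++;
                lastCapacity = sb.capacity();
            }
        }
        return resizes;
    }

    public static void main(String[] args) {
        // Default constructor: documented initial capacity of 16 characters,
        // so several resizes are expected on the way to 4096.
        System.out.println("default:  " + countResizes(new StringBuilder(), 4096) + " resizes");
        // Presized builder: 4096 characters fit without any growth.
        System.out.println("presized: " + countResizes(new StringBuilder(4096), 4096) + " resizes");
    }
}
```

Each resize allocates a larger array and copies the existing characters into it, which is the cost that presizing avoids.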
Answer by cfeduke
Based on my experience with developing software on Windows, I would say clearing the StringBuilder out during your loop has better performance than instantiating a StringBuilder with each iteration. Clearing it frees that memory to be overwritten immediately with no additional allocation required. I'm not familiar enough with the Java garbage collector, but I would think that freeing with no reallocation (unless your next string grows the StringBuilder) is more beneficial than instantiation.
(My opinion is contrary to what everyone else is suggesting. Hmm. Time to benchmark it.)
Answer by Epaga
The second one is about 25% faster in my mini-benchmark.
public class ScratchPad {
    static String a;
    public static void main( String[] args ) throws Exception {
        // Variant 1: create a new StringBuilder on every iteration.
        long time = System.currentTimeMillis();
        for( int i = 0; i < 10000000; i++ ) {
            StringBuilder sb = new StringBuilder();
            sb.append( "someString" );
            sb.append( "someString2"+i );
            sb.append( "someStrin4g"+i );
            sb.append( "someStr5ing"+i );
            sb.append( "someSt7ring"+i );
            a = sb.toString();
        }
        System.out.println( System.currentTimeMillis()-time );
        // Variant 2: reuse one StringBuilder and clear it on every iteration.
        time = System.currentTimeMillis();
        StringBuilder sb = new StringBuilder();
        for( int i = 0; i < 10000000; i++ ) {
            sb.delete( 0, sb.length() );
            sb.append( "someString" );
            sb.append( "someString2"+i );
            sb.append( "someStrin4g"+i );
            sb.append( "someStr5ing"+i );
            sb.append( "someSt7ring"+i );
            a = sb.toString();
        }
        System.out.println( System.currentTimeMillis()-time );
    }
}
Results (in milliseconds; new builder per iteration first, reused builder second):
25265
17969
Note that this is with JRE 1.6.0_07.
Based on Jon Skeet's ideas in the edit, here's version 2, with the order of the two loops swapped and the "+i" concatenations removed. The outcome is the same: reusing the builder is still faster.
public class ScratchPad {
    static String a;
    public static void main( String[] args ) throws Exception {
        // Variant 1: reuse one StringBuilder and clear it on every iteration.
        long time = System.currentTimeMillis();
        StringBuilder sb = new StringBuilder();
        for( int i = 0; i < 10000000; i++ ) {
            sb.delete( 0, sb.length() );
            sb.append( "someString" );
            sb.append( "someString2" );
            sb.append( "someStrin4g" );
            sb.append( "someStr5ing" );
            sb.append( "someSt7ring" );
            a = sb.toString();
        }
        System.out.println( System.currentTimeMillis()-time );
        // Variant 2: create a new StringBuilder on every iteration.
        time = System.currentTimeMillis();
        for( int i = 0; i < 10000000; i++ ) {
            StringBuilder sb2 = new StringBuilder();
            sb2.append( "someString" );
            sb2.append( "someString2" );
            sb2.append( "someStrin4g" );
            sb2.append( "someStr5ing" );
            sb2.append( "someSt7ring" );
            a = sb2.toString();
        }
        System.out.println( System.currentTimeMillis()-time );
    }
}
Results (in milliseconds; reused builder first, new builder per iteration second):
5016
7516
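The two benchmarks above time each variant once, in a single pass, so JIT warm-up is included in the numbers. As a rough sketch of how that could be separated out (all names and the iteration count here are arbitrary choices for illustration, not part of the original answer), each variant can be moved into its own method, run once untimed, and then timed several times with System.nanoTime(). No expected output is given because the figures depend entirely on the JVM and machine.

```java
final class MiniBench {
    // Results are stored here so the work cannot be optimized away as unused.
    static String sink;

    static String freshBuilder(int i) {
        StringBuilder sb = new StringBuilder();
        return sb.append("someString").append(i).toString();
    }

    static String reusedBuilder(StringBuilder sb, int i) {
        sb.setLength(0);
        return sb.append("someString").append(i).toString();
    }

    static long timeFresh(int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink = freshBuilder(i);
        }
        return System.nanoTime() - start;
    }

    static long timeReused(int iterations) {
        StringBuilder sb = new StringBuilder();
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink = reusedBuilder(sb, i);
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        // Untimed warm-up pass so both variants are compiled before measuring.
        timeFresh(n);
        timeReused(n);
        for (int run = 0; run < 3; run++) {
            System.out.println("fresh:  " + timeFresh(n) / 1_000_000 + " ms");
            System.out.println("reused: " + timeReused(n) / 1_000_000 + " ms");
        }
    }
}
```

This is still only a mini-benchmark; for numbers you intend to rely on, a dedicated harness such as JMH is usually the safer choice.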
Answer by Jon Skeet
Okay, I now understand what's going on, and it does make sense.
I was under the impression that toString just passed the underlying char[] into a String constructor which didn't take a copy. A copy would then be made on the next "write" operation (e.g. delete). I believe this was the case with StringBuffer in some previous version. (It isn't now.) But no: toString just passes the array (and index and length) to the public String constructor, which takes a copy.
So in the "reuse the StringBuilder" case we genuinely create one copy of the data per string, using the same char array in the buffer the whole time. Obviously creating a new StringBuilder each time creates a new underlying buffer, and then that buffer is copied (somewhat pointlessly, in our particular case, but done for safety reasons) when creating a new string.
All this leads to the second version definitely being more efficient, but at the same time I'd still say it's uglier code.
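The copying behaviour described above can be observed from the outside. The following is a small sketch (class and method names are invented for the example) showing that a String obtained from toString() is unaffected when the builder is reset and reused, and that resetting the length leaves the builder's capacity as it was.

```java
final class ReuseDemo {
    // Builds two strings from the same builder and returns both.
    static String[] buildTwice(StringBuilder sb) {
        sb.setLength(0);
        sb.append("first");
        String first = sb.toString();   // toString() copies the characters out

        sb.setLength(0);                // resets the length; the capacity stays as it was
        sb.append("second");
        String second = sb.toString();

        return new String[] { first, second };
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder(64);
        int capacityBefore = sb.capacity();
        String[] result = buildTwice(sb);

        System.out.println(result[0]);  // prints "first": unaffected by the later reset
        System.out.println(result[1]);  // prints "second"
        System.out.println(sb.capacity() == capacityBefore); // prints "true"
    }
}
```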
Answer by Jon Skeet
Declare once, and assign each time. It is a more pragmatic and reusable concept than an optimization.
Answer by dongilmore
The first is better for humans. If the second is a bit faster on some versions of some JVMs, so what?
If performance is that critical, bypass StringBuilder and write your own. If you're a good programmer, and take into account how your app is using this function, you should be able to make it even faster. Worthwhile? Probably not.
Why is this question starred as a "favorite question"? Because performance optimization is so much fun, no matter whether it is practical or not.
Answer by Hyman Leow
I don't think it's been pointed out yet: because of optimizations built into the Sun Java compiler, which automatically creates StringBuilders (StringBuffers before J2SE 5.0) when it sees String concatenations, the first example in the question is equivalent to:
for (loop condition) {
    String s = "some string";
    . . .
    s += anotherString;
    . . .
    passToMethod(s);
}
This is more readable and, IMO, the better approach. Your attempts to optimize may result in gains on some platforms, but potentially losses on others.
But if you really are running into issues with performance, then sure, optimize away. I'd start with explicitly specifying the buffer size of the StringBuilder though, per Jon Skeet.
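As a sketch of the compiler behaviour mentioned above (the class and method names are made up, and the expansion shown is an approximation of what older javac versions emitted, not a guaranteed bytecode shape), the two methods below produce the same result. Newer JDKs may compile string concatenation through a different mechanism, so treat the second method as an illustration only.

```java
final class ConcatDemo {
    // What you write in source code.
    static String withPlus(String a, String b) {
        String s = a;
        s += b;
        return s;
    }

    // Roughly what `s += b` was turned into by older compilers:
    // a temporary StringBuilder per concatenation statement.
    static String expanded(String a, String b) {
        String s = a;
        s = new StringBuilder().append(s).append(b).toString();
        return s;
    }

    public static void main(String[] args) {
        String x = withPlus("some string", "another");
        String y = expanded("some string", "another");
        System.out.println(x.equals(y)); // prints "true"
    }
}
```

Note that under this expansion every separate += statement gets its own temporary builder, so a long series of them is not the same as appending to one explicitly shared StringBuilder.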
Answer by Dave Jarvis
Faster still:
public class ScratchPad {
    private static String a;
    public static void main( String[] args ) throws Exception {
        long time = System.currentTimeMillis();
        StringBuilder sb = new StringBuilder( 128 );
        for( int i = 0; i < 10000000; i++ ) {
            // Resetting the string is faster than creating a new object.
            // Since this is a critical loop, every instruction counts.
            //
            sb.setLength( 0 );
            sb.append( "someString" );
            sb.append( "someString2" );
            sb.append( "someStrin4g" );
            sb.append( "someStr5ing" );
            sb.append( "someSt7ring" );
            setA( sb.toString() );
        }
        System.out.println( System.currentTimeMillis()-time );
    }
    private static void setA( String aString ) {
        a = aString;
    }
}
In the philosophy of writing solid code, the inner workings of the method should be hidden from the objects that use the method. Thus it makes no difference from the system's perspective whether you redeclare the StringBuilder within the loop or outside of the loop. Since declaring it outside of the loop is faster, and it does not make the code more complicated to read, reuse the object rather than reinstantiating it.
Even if reuse did make the code more complicated, and you knew for certain that object instantiation was the bottleneck, you could still reuse the object and comment it.
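One way to keep the reuse hidden from callers, as argued above, is to hold the builder as a private field. This is only a sketch under assumptions: the class name, the key=value format, and the capacity of 128 are invented for the example, and it assumes an instance is never shared between threads, since StringBuilder is not thread-safe.

```java
final class LineFormatter {
    // Reused across calls; callers never see it.
    // Assumption: one instance is used by one thread at a time.
    private final StringBuilder sb = new StringBuilder(128);

    String format(String key, int value) {
        // Resetting the length is cheaper than allocating a new builder;
        // this is the one line that deserves a comment in real code.
        sb.setLength(0);
        return sb.append(key).append('=').append(value).toString();
    }

    public static void main(String[] args) {
        LineFormatter formatter = new LineFormatter();
        for (int i = 0; i < 3; i++) {
            System.out.println(formatter.format("item", i)); // item=0, item=1, item=2
        }
    }
}
```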
Three runs with this answer (times in milliseconds):
$ java ScratchPad
1567
$ java ScratchPad
1569
$ java ScratchPad
1570
Three runs with the other answer (times in milliseconds):
$ java ScratchPad2
1663
2231
$ java ScratchPad2
1656
2233
$ java ScratchPad2
1658
2242
Although not significant, setting the StringBuilder's initial buffer size will give a small gain.
Answer by brianegge
The reason why doing a 'setLength' or 'delete' improves the performance is mostly that the code 'learns' the right size of the buffer; it has less to do with the memory allocation. Generally, I recommend letting the compiler do the string optimizations. However, if the performance is critical, I'll often pre-calculate the expected size of the buffer. The default StringBuilder size is 16 characters. If you grow beyond that, then it has to resize. Resizing is where the performance is getting lost. Here's another mini-benchmark which illustrates this:
private void clear() throws Exception {
    long time = System.currentTimeMillis();
    int maxLength = 0;
    StringBuilder sb = new StringBuilder();
    for( int i = 0; i < 10000000; i++ ) {
        // Resetting the string is faster than creating a new object.
        // Since this is a critical loop, every instruction counts.
        //
        sb.setLength( 0 );
        sb.append( "someString" );
        sb.append( "someString2" ).append( i );
        sb.append( "someStrin4g" ).append( i );
        sb.append( "someStr5ing" ).append( i );
        sb.append( "someSt7ring" ).append( i );
        maxLength = Math.max(maxLength, sb.toString().length());
    }
    System.out.println(maxLength);
    System.out.println("Clear buffer: " + (System.currentTimeMillis()-time) );
}
private void preAllocate() throws Exception {
    long time = System.currentTimeMillis();
    int maxLength = 0;
    for( int i = 0; i < 10000000; i++ ) {
        StringBuilder sb = new StringBuilder(82);
        sb.append( "someString" );
        sb.append( "someString2" ).append( i );
        sb.append( "someStrin4g" ).append( i );
        sb.append( "someStr5ing" ).append( i );
        sb.append( "someSt7ring" ).append( i );
        maxLength = Math.max(maxLength, sb.toString().length());
    }
    System.out.println(maxLength);
    System.out.println("Pre allocate: " + (System.currentTimeMillis()-time) );
}
public void testBoth() throws Exception {
    for(int i = 0; i < 5; i++) {
        clear();
        preAllocate();
    }
}
The results show reusing the object is about 10% faster than creating a buffer of the expected size.
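The capacity of 82 used in preAllocate can be derived from the appended pieces: "someString" is 10 characters, each of the other four literals is 11 characters, and the largest loop index (9999999) has 7 digits, giving 10 + 4 * (11 + 7) = 82. Here is a minimal sketch of that pre-calculation (the class and method names are made up for illustration):

```java
final class PresizeDemo {
    // Length of the longest string the benchmark loop can produce
    // when the largest index appended is maxI.
    static int expectedLength(int maxI) {
        int digits = Integer.toString(maxI).length();
        return "someString".length()
                + ("someString2".length() + digits)
                + ("someStrin4g".length() + digits)
                + ("someStr5ing".length() + digits)
                + ("someSt7ring".length() + digits);
    }

    // Same sequence of appends as in the benchmark.
    static StringBuilder fill(StringBuilder sb, int i) {
        sb.append("someString");
        sb.append("someString2").append(i);
        sb.append("someStrin4g").append(i);
        sb.append("someStr5ing").append(i);
        sb.append("someSt7ring").append(i);
        return sb;
    }

    public static void main(String[] args) {
        int maxI = 9_999_999; // largest i reached by `i < 10000000`
        int size = expectedLength(maxI);
        StringBuilder sb = fill(new StringBuilder(size), maxI);
        System.out.println(size);                  // prints 82
        System.out.println(sb.capacity() == size); // prints true: the builder never had to grow
    }
}
```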