java 这段简单的代码有多复杂?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/7156122/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
What is the complexity of this simple piece of code?
提问by Someone
I'm pasting this text from an ebook I have. It says the complexity if O(n2) and also gives an explanation for it, but I fail to see how.
我正在从我拥有的电子书中粘贴这段文字。它说明了 O(n 2)的复杂性,并对其进行了解释,但我看不出如何。
Question: What is the running time of this code?
问题:这段代码的运行时间是多少?
public String makeSentence(String[] words) {
StringBuffer sentence = new StringBuffer();
for (String w : words) sentence.append(w);
return sentence.toString();
}
The answer the book gave:
书中给出的答案:
O(n2), where n is the number of letters in sentence. Here's why: each time you append a string to sentence, you create a copy of sentence and run through all the letters in sentence to copy them over If you have to iterate through up to n characters each time in the loop, and you're looping at least n times, that gives you an O(n2) run time. Ouch!
O(n 2),其中 n 是句子中的字母数。原因如下:每次将字符串附加到句子时,您都会创建一个句子的副本并遍历句子中的所有字母以将它们复制过去如果您必须在循环中每次最多迭代 n 个字符,并且您循环至少 n 次,这给你一个 O(n 2) 运行时间。哎哟!
Can someone please explain this answer more clearly?
有人可以更清楚地解释这个答案吗?
回答by Han Zhou
This seems to be a question of mislead, because I happened to read that book just now. This part of text in the book is a typo! Here is the context:
这似乎是一个误导的问题,因为我刚好刚刚看过那本书。书中这部分文字是错别字!这是上下文:
===================================================================
================================================== ==================
Question: What is the running time of this code?
问题:这段代码的运行时间是多少?
1 public String makeSentence(String[] words) {
2 StringBuffer sentence = new StringBuffer();
3 for (String w : words) sentence.append(w);
4 return sentence.toString();
5 }
Answer: O(n2), where n is the number of letters in sentence. Here's why: each time you append a string to sentence, you create a copy of sentence and run through all the letters in sentence to copy them over. If you have to iterate through up to n characters each time in the loop, and you're looping at least n times, that gives you an O(n2) run time. Ouch! With StringBuffer (or StringBuilder) can help you avoid this problem.
答案:O(n 2),其中 n 是句子中的字母数。原因如下:每次将字符串附加到句子时,您都会创建一个句子的副本并遍历句子中的所有字母以将它们复制过来。如果您必须在循环中每次最多迭代 n 个字符,并且您至少循环了 n 次,那么您的运行时间为 O(n 2)。哎哟! 使用StringBuffer(或StringBuilder)可以帮助你避免这个问题。
1 public String makeSentence(String[] words) {
2 StringBuffer sentence = new StringBuffer();
3 for (String w : words) sentence.append(w);
4 return sentence.toString();
5 }
=====================================================================
================================================== ====================
Have you noticed that the author messed it up? The O(n2) solution she mentioned (the first one) was exactly the same as the 'optimized' one (the latter). So, my conclusion is that the author was trying to render something else, such as always copying the old sentence to a new buffer when appending every next string, as the example of an O(n2) algorithm. StringBuffer should not be so silly, as the author also mentioned 'With StringBuffer (or StringBuilder) can help you avoid this problem'.
你有没有注意到作者把它搞砸了?她提到的 O(n 2) 解决方案(第一个)与“优化”的解决方案(后者)完全相同。所以,我的结论是作者试图呈现其他东西,例如在附加每个下一个字符串时总是将旧句子复制到新缓冲区,作为 O(n 2) 算法的示例。StringBuffer 不应该这么傻,因为作者还提到'With StringBuffer(或StringBuilder)可以帮助你避免这个问题'。
回答by Brian L
It's a bit difficult to answer a question about the complexity of this code when it is written at a high level which abstracts away the details of the implementation. The Java documentationdoesn't seem to give any guarantees in terms of the complexity of the append
function. As others have pointed out, the StringBuffer
class can (and should) be written so that the complexity of appending strings does not depend on the current length of the string held in StringBuffer
.
当在抽象了实现细节的高层编写时,回答有关此代码复杂性的问题有点困难。在Java文档似乎并没有给予任何保证在中的复杂性方面append
的功能。正如其他人指出的那样,StringBuffer
可以(并且应该)编写该类,以便附加字符串的复杂性不依赖于StringBuffer
.
However, I suspect it is not that helpful to the person asking this question to simply say "your book is wrong!" - instead, let us see what assumptions are being made and make clear what the author was trying to say.
然而,我怀疑问这个问题的人简单地说“你的书是错的!”并没有多大帮助。- 相反,让我们看看正在做出哪些假设并弄清楚作者想说什么。
You can make the following assumptions:
您可以做出以下假设:
- Creating a
new StringBuffer
is O(1) - Getting the next string
w
inwords
is O(1) - Returning
sentence.toString
is at most O(n).
- 创建一个
new StringBuffer
是 O(1) - 获取下一个字符串
w
在words
为O(1) - 返回
sentence.toString
最多为 O(n)。
The question is really what is the order of sentence.append(w)
, and that depends on how it happens inside the StringBuffer
. The naive way is to do it like Shlemiel the Painter.
问题实际上是 的顺序是什么sentence.append(w)
,这取决于它在StringBuffer
. 天真的方法是像画家什莱米尔那样做。
The silly way
愚蠢的方式
Suppose you use a C-style null-terminated string for the contents of StringBuffer
. The way you find the end of such a string is by reading each character, one by one, until you find the null character - then to append a new string S, you can start copying characters from S to the StringBuffer
string (finishing with another null character). If you write append
this way, it is O(a+ b), where ais the number of characters currently in the StringBuffer
, and bis the number of characters in the new word. If you loop over an array of words, and each time you have to read all the characters you just appended before appending the new word, then the complexity of the loop is O(n^2), where n is the total number of characters in all the words (also the number of characters in the final sentence).
假设您对 的内容使用 C 风格的以空字符结尾的字符串StringBuffer
。找到这样一个字符串结尾的方法是通过一个一个地读取每个字符,直到找到空字符 - 然后附加一个新字符串 S,您可以开始将字符从 S 复制到StringBuffer
字符串(以另一个空字符结束)特点)。如果append
这样写,就是 O( a+ b),其中a是当前在 中的字符数StringBuffer
,而b是新单词中的字符数。如果循环遍历一个单词数组,并且每次必须在追加新单词之前读取刚刚追加的所有字符,那么循环的复杂度为 O(n^2),其中 n 是字符总数在所有单词中(还有最后一句话中的字符数)。
A better way
更好的方法
On the other hand, suppose that the contents of StringBuffer
is still an array of characters, but we also store an integer size
which tells us how long the string is (number of characters). Now we no longer have to read every character in the StringBuffer
in order to find the end of the string; we can just look up index size
in the array, which is O(1) instead of O(a). Then the append
function now only depends on the number of characters being appended, O(b). In this case the complexity of the loop is O(n), where n is the total number of characters in all the words.
另一方面,假设 的内容StringBuffer
仍然是一个字符数组,但我们还存储了一个整数size
,它告诉我们字符串的长度(字符数)。现在我们不再需要读取 中的每个字符StringBuffer
才能找到字符串的结尾;我们可以size
在数组中查找索引,它是 O(1) 而不是 O( a)。那么append
函数现在只取决于被附加的字符数,O( b)。在这种情况下,循环的复杂度为 O(n),其中 n 是所有单词中的字符总数。
...We're not done yet!
......我们还没有完成!
Finally, there's one more aspect of the implementation that hasn't been covered yet, and that is the one actually brought up by the answer in the textbook - memory allocation. Each time you want to write more characters to your StringBuffer
, you're not guaranteed to have enough space in your character array to actually fit the new word in. If there isn't enough space, your computer needs to first allocate some more room in a clean section of memory, and then copy all the information in the old StringBuffer
array across, and then it can continue as before. Copying data like this will take O(a) time (where ais the number of characters to be copied).
最后,该实现还有一个方面尚未涉及,这就是教科书答案中实际提出的一个方面 - 内存分配。每次您想向StringBuffer
.一块干净的内存,然后将旧StringBuffer
数组中的所有信息复制过来,然后就可以像以前一样继续了。像这样复制数据将花费 O( a) 时间(其中a是要复制的字符数)。
In the worst case, you have to allocate more memory everytime you add a new word. This basically takes us back to square one where the loop has O(n^2) complexity, and is what the book seems to suggest. If you assume that nothing crazy is happening (the words aren't getting longer at an exponential rate!), then you can probably reduce the number of memory allocations to something more like O(log(n)) by having the allocated memory grow exponentially. If that's the number of memory allocations, and memory allocations in general are O(a), then the total complexity attributed just to memory management in the loop is O(n log(n)). Since the appending work is O(n) and less than the complexity of the memory management, the total complexity of the function is O(n log(n)).
在最坏的情况下,每次添加新单词时都必须分配更多内存。这基本上将我们带回到循环具有 O(n^2) 复杂性的地方,这就是这本书的建议。如果您假设没有发生任何疯狂的事情(单词不会以指数速度变长!),那么您可以通过让分配的内存增长来将内存分配的数量减少到更像 O(log(n)) 的数量呈指数级增长。如果这是内存分配的数量,一般的内存分配是 O( a),那么仅归因于循环中内存管理的总复杂度为 O(n log(n))。由于追加工作为 O(n) 且小于内存管理的复杂度,因此函数的总复杂度为 O(n log(n))。
Again, the Java documentation doesn't help us in terms of how the capacity of the StringBuffer
grows, it just says "If the internal buffer overflows, it is automatically made larger". Depending on how it happens, you could end up with either O(n^2) or O(n log(n)) overall.
同样,Java 文档在容量如何StringBuffer
增长方面没有帮助我们,它只是说“如果内部缓冲区溢出,它会自动变大”。根据它的发生方式,您最终可能会得到 O(n^2) 或 O(n log(n)) 的总体结果。
As an exercise left to the reader: Find an easy way to modify the function so that the overall complexity is O(n), by removing memory reallocation issues.
作为留给读者的练习:找到一种简单的方法来修改函数,通过消除内存重新分配问题,使整体复杂度为 O(n)。
回答by porges
The accepted answer is just wrong. StringBuffer
has amortized O(1) append, so nappends will be O(n).
接受的答案是错误的。StringBuffer
已摊销 O(1) 追加,因此n追加将是 O( n)。
If it wasn't O(1) append, StringBuffer
would have no reason to exist, since writing that loop with plain String
concatenation would be O(n^2) as well!
如果不是 O(1) 追加,StringBuffer
则没有理由存在,因为用普通String
串联编写该循环也是 O( n^2) !
回答by Tyler.Zhang
I think these text in the book must be a typo ,I think the right content is below,I fix it:
我认为书中的这些文字一定是错字,我认为正确的内容在下面,我改正了:
===================================================================
================================================== ==================
Question: What is the running time of this code?
问题:这段代码的运行时间是多少?
public String makeSentence(String[] words) {
String sentence = new String("");
for (String w : words) sentence+=W;
return sentence;
}
Answer: O(n2), where n is the number of letters in sentence. Here's why: each time you append a string to sentence, you create a copy of sentence and run through all the letters in sentence to copy them over. If you have to iterate through up to n characters each time in the loop, and you're looping at least n times, that gives you an O(n2) run time. Ouch! With StringBuffer (or StringBuilder) can help you avoid this problem.
答案:O(n 2),其中 n 是句子中的字母数。原因如下:每次将字符串附加到句子时,您都会创建一个句子的副本并遍历句子中的所有字母以将它们复制过来。如果您必须在循环中每次最多迭代 n 个字符,并且您至少循环了 n 次,那么您的运行时间为 O(n 2)。哎哟! 使用StringBuffer(或StringBuilder)可以帮助你避免这个问题。
public String makeSentence(String[] words) {
StringBuffer sentence = new StringBuffer();
for (String w : words) sentence.append(w);
return sentence.toString();
}
=====================================================================
================================================== ====================
Am i right?
我对吗?
回答by Mikita Belahlazau
I tried to check it using this program
我试着用这个程序检查它
public class Test {
private static String[] create(int n) {
String[] res = new String[n];
for (int i = 0; i < n; i++) {
res[i] = "abcdefghijklmnopqrst";
}
return res;
}
private static String makeSentence(String[] words) {
StringBuffer sentence = new StringBuffer();
for (String w : words) sentence.append(w);
return sentence.toString();
}
public static void main(String[] args) {
String[] ar = create(Integer.parseInt(args[0]));
long begin = System.currentTimeMillis();
String res = makeSentence(ar);
System.out.println(System.currentTimeMillis() - begin);
}
}
And result was, as expected, O(n):
结果是,正如预期的那样,O(n):
java Test 200000 - 128 ms
java 测试 200000 - 128 毫秒
java Test 500000 - 370 ms
java 测试 500000 - 370 毫秒
java Test 1000000 - 698 ms
java 测试 1000000 - 698 毫秒
Version 1.6.0.21
版本 1.6.0.21
回答by Dimos
There's a typo in this book.
这本书有错别字。
1st case:
第一种情况:
public String makeSentence(String[] words) {
String sentence = new String();
for (String w : words) sentence += w;
return sentence;
}
Complexity : O(n^2) -> (n words) x (n characters copied at each iteration, for copying the current sentence into a StringBuffer)
复杂度:O(n^2) -> (n words) x(每次迭代复制n个字符,用于将当前句子复制到StringBuffer中)
2nd case:
第二种情况:
public String makeSentence(String[] words) {
StringBuffer sentence = new StringBuffer();
for (String w : words) sentence.append(w);
return sentence.toString();
}
Complexity : O(n) -> (n words) x O(1) (amortized complexity for StringBuffer concatenation)
复杂度:O(n) -> (n words) x O(1)(StringBuffer 连接的分摊复杂度)
回答by Stefan Kendall
That really depends on the implementation of StringBuffer
. Supposing .append()
was constant time, it's clear that you have an O(n)
algorithm in time where n = length of the words array
. If .append
isn'tconstant time, you'll need to multiple your O(n) by the time complexity of the method. If indeed the current implementation of StringBuffer
copies strings character-by-character, then the algorithm above is
这真的取决于StringBuffer
. 假设.append()
是常数时间,很明显你有一个O(n)
算法,其中n = length of the words array
。如果.append
不是恒定时间,则需要将 O(n) 乘以该方法的时间复杂度。如果当前的实现确实是StringBuffer
逐个字符地复制字符串,那么上面的算法是
Θ(n*m)
, or O(n*m)
, where n
is the number of words and m
is average word length, and your book is wrong. I assume you're looking for a strict bound.
Θ(n*m)
, 或O(n*m)
, 哪里n
是字数,m
是平均字长,你的书是错的。我假设您正在寻找严格的界限。
Simple example that the book's answer is incorrect:
String[] words = ['alphabet']
By the book's definition, n=8
, so the algorithm will be bounded by 64 steps. Is this the case? Clearly not strictly. I see 1 assignment and 1 copy operation with n characters, so you get about 9 steps. This sort of behavior is predicted by the bounds of O(n*m)
, as I illustrated above.
书中的答案不正确的简单例子:
String[] words = ['alphabet']
根据书中的定义,n=8
,因此该算法将受到 64 步的限制。是这种情况吗?显然不严格。我看到有 n 个字符的 1 个赋值和 1 个复制操作,因此您大约需要 9 个步骤。这种行为是由 的边界预测的O(n*m)
,正如我上面所说明的。
I did some digging, and this clearly isn't a simple character copy. It looks like memory is being copied in bulk, which puts us back at O(n)
, your first guess at the solution.
我做了一些挖掘,这显然不是一个简单的角色副本。看起来内存正在被批量复制,这让我们回到 ,这是O(n)
您对解决方案的第一个猜测。
/* StringBuffer is just a proxy */
public AbstractStringBuilder append(String str)
{
if (str == null) str = "null";
int len = str.length();
ensureCapacityInternal(count + len);
str.getChars(0, len, value, count);
count += len;
return this;
}
/* java.lang.String */
void getChars(char dst[], int dstBegin) {
System.arraycopy(value, offset, dst, dstBegin, count);
}
Your book is either old, terrible, or both. I'm not determined enough to dig through JDK versions to find a less optimal implementation of StringBuffer, but perhaps one exists.
你的书要么很旧,要么很糟糕,或者两者兼而有之。我没有足够的决心去挖掘 JDK 版本来找到一个不太理想的 StringBuffer 实现,但也许存在一个。
回答by tryit
As the explanation given in the book, for ever word in the string array a new object of sentence gets created and that sentence object first copies the previous sentence and then traverses till the end of the array and then appends the new word, hence the complexity of n^2
.
正如书中给出的解释,对于字符串数组中的任何单词,都会创建一个新的句子对象,该句子对象首先复制前一个句子,然后遍历到数组末尾,然后附加新单词,因此复杂性的n^2
。
- First 'n' to copy the previous sentence into a new object
- Second 'n' to traverse that array and then append it
- 第一个 'n' 将上一句复制到新对象中
- 第二个 'n' 遍历该数组,然后附加它
Hence n*n
will be n^2
.
因此n*n
将是n^2
。
回答by aroth
Looks like O(n) to me (with n
being the total number of letters in all the words). You're basically iterating over every character in words
to append it into the StringBuffer
.
对我来说看起来像 O(n)(n
所有单词中的字母总数)。您基本上是在迭代每个字符words
以将其附加到StringBuffer
.
The only way I could see this as being O(n^2) is if append()
iterates all the contents in the buffer before appending any new characters. And it may actually do this occasionally if the number of characters exceeds the currently allocated buffer length (it has to allocate a new buffer, and then copy everything from the current buffer into the new buffer). But it won't happen on every iteration, so you still won't have O(n^2).
我认为这是 O(n^2) 的唯一方法是append()
在追加任何新字符之前迭代缓冲区中的所有内容。如果字符数超过当前分配的缓冲区长度,它实际上可能偶尔会这样做(它必须分配一个新缓冲区,然后将所有内容从当前缓冲区复制到新缓冲区中)。但它不会在每次迭代中发生,所以你仍然不会有 O(n^2)。
At most you'd have O(m * n), where m
is the number of times the buffer length is increased. And because the StringBuffer
will double its buffer sizeevery time it allocates a larger buffer we can determine that m
is roughly equal to log2(n)
(actually log2(n) - log2(16)
, since the default initial buffer size is 16 instead of 1).
最多你有 O(m * n),其中m
是缓冲区长度增加的次数。并且因为每次分配更大的缓冲区时StringBuffer
都会将其缓冲区大小加倍,我们可以确定它m
大致等于log2(n)
(实际上log2(n) - log2(16)
,因为默认的初始缓冲区大小是 16 而不是 1)。
So the real answer is that the book's example is O(n log n), and that you can get it down to O(n) by preallocating a StringBuffer
with a capacity large enough to hold all of your letters.
所以真正的答案是这本书的例子是 O(n log n),你可以通过预先分配一个StringBuffer
足够大的容量来容纳你所有的字母来把它降低到 O(n) 。
Note that in Java appending to a string using +=
does exhibit the inefficient behavior described in the book's explanation, as it has to allocate a new string and copy all the data from both strings into it. So if you do this, it is O(n^2):
请注意,在 Java 中附加到字符串 using+=
确实表现出本书解释中描述的低效行为,因为它必须分配一个新字符串并将两个字符串中的所有数据复制到其中。所以如果你这样做,它是 O(n^2):
String sentence = "";
for (String w : words) {
sentence += w;
}
But using StringBuffer
should not generate the same behavior as in the above example. That's kind of one of the major reasons for StringBuffer
to exist in the first place.
但是 usingStringBuffer
不应产生与上述示例相同的行为。这是StringBuffer
首先存在的主要原因之一。
回答by GDubz
Here's my calculation for how they got O(n^2)
这是我对他们如何得到 O(n^2) 的计算
We'll ignore the CPU time for declaring the StringBuffer, as it doesn't vary with the size of the final string.
我们将忽略声明 StringBuffer 的 CPU 时间,因为它不随最终字符串的大小而变化。
When calculating the O complexity we are concerned with the worst case, this will occur when there are 1 letter Strings. I shall explain after this example:
在计算 O 复杂度时,我们关注最坏的情况,当有 1 个字母字符串时会发生这种情况。我将在这个例子之后解释:
Let's say we have 4 one-letter strings: 'A', 'B', 'C', 'D'.
假设我们有 4 个单字母字符串:'A'、'B'、'C'、'D'。
Read in A: CPU-time to find end of StringBuffer: 0 CPU-time to append 'A': 1
读入 A: CPU-time to find End of StringBuffer: 0 CPU-time to append 'A': 1
Read in B: CPU-time to find end of StringBuffer: 1 CPU-time to append 'B': 1
读入 B: CPU-time to find End of StringBuffer: 1 CPU-time to append 'B': 1
Read in C: CPU-time to find end of StringBuffer: 2 CPU-time to append 'C': 1
读入 C: CPU-time to find End of StringBuffer: 2 CPU-time to append 'C': 1
Read in D: CPU-time to find end of StringBuffer: 3 CPU-time to append 'D': 1
读入 D: CPU-time to find end StringBuffer: 3 CPU-time to append 'D': 1
CPU-time to copy StringBuffer to String at the end: 4
最后将 StringBuffer 复制到 String 的 CPU 时间:4
Total CPU-time = 1 + 2 + 3 + 4 + 4
总 CPU 时间 = 1 + 2 + 3 + 4 + 4
If we generalise this to n 1-letter words:
如果我们将其概括为 n 个 1 个字母的单词:
1 + 2 + 3 + ...... + n + n = 0.5n(n+1) + n
1 + 2 + 3 + ...... + n + n = 0.5n(n+1) + n
I did this by using the formula for the sum of an arithmetic sequence.
我通过使用等差数列之和的公式来做到这一点。
O(0.5n^2 + 1.5n) = O(n^2).
O(0.5n^2 + 1.5n) = O(n^2)。
If we use multi-letter words, we are going to have to find the end of the StringBuffer less frequently, leading to a lower CPU-time and a 'better' case.
如果我们使用多字母单词,我们将不得不不那么频繁地找到 StringBuffer 的结尾,从而导致更低的 CPU 时间和“更好”的情况。