How the substring() function of the Java String class works
Original question: http://stackoverflow.com/questions/704319/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Asked by harshit
Please see the following code.
String s = "Monday";
if (s.substring(0, 3).equals("Mon")) {}

String s2 = new String(s.substring(0, 3));
String s3 = s.substring(0, 3);
I know that line 2 will still point to "Monday" and have a new String object with the offset and count set to 0 and 3.
Line 4 will create a new String "Mon" in the string pool and point to it.
But I am not sure whether line 5 will behave like line 2 or like line 4.
If I am wrong about line 2 or line 4, please correct me as well.
Answered by Jon Skeet
As pointed out by Pete Kirkham, this is implementation specific. My answer is only correct for the Sun JRE, and only prior to Java 7 update 6.
You're right about a normal substring call just creating a new string referring to the same character array as the original string. That's what happens on line 5 too. The fact that the new string object reference happens to be assigned to a variable doesn't change the behaviour of the method.
Just to be clear, you say that in line 2 the new string will still point to "Monday" - the char array reference inside the string will be to the same char array as one used for "Monday". But "Monday" is a string in itself, not a char array. In other words, by the time line 2 has finished (and ignoring GC) there are two string objects, both referring to the same char array. One has a count of 6 and the other has a count of 3; both have an offset of 0.
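The sharing described above can be modeled with a small sketch. The SharedString class below is hypothetical (it is not the JDK's String); it only imitates the value, offset and count fields that the Sun implementation before Java 7 update 6 is described as using, so that a substring is a window over the parent's array.

```java
// Hypothetical model of the layout described above; NOT the real java.lang.String.
final class SharedString {
    private final char[] value; // backing array, possibly shared with other instances
    private final int offset;   // index of the first character used by this instance
    private final int count;    // number of characters used by this instance

    private SharedString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    static SharedString of(String s) {
        char[] chars = s.toCharArray();
        return new SharedString(chars, 0, chars.length);
    }

    // O(1): a new object that reuses the same array with a different window.
    SharedString substring(int beginIndex, int endIndex) {
        if (beginIndex < 0 || endIndex > count || beginIndex > endIndex) {
            throw new StringIndexOutOfBoundsException();
        }
        return new SharedString(value, offset + beginIndex, endIndex - beginIndex);
    }

    boolean sharesArrayWith(SharedString other) {
        return this.value == other.value;
    }

    int offset() { return offset; }

    int count() { return count; }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        SharedString monday = SharedString.of("Monday");
        SharedString mon = monday.substring(0, 3);
        System.out.println(mon);                                 // Mon
        System.out.println(mon.sharesArrayWith(monday));         // true
        System.out.println(monday.count() + " " + mon.count());  // 6 3
    }
}
```

In this model both objects have an offset of 0 and differ only in count, which matches the two-objects-one-array picture in the answer.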
You're wrong about line 4 using a "string pool" though - there's no pooling going on there. However, it is different from the other lines. When you call the String(String) constructor, the new string takes a copy of the character data of the original, so it's completely separate. This can be very useful if you only need a string which contains a small part of a very large original string; it allows the original large char array to be garbage collected (assuming nothing else needs it) while you hold onto the copy of the small portion. A good example of this in my own experience is reading lines from a file. By default, BufferedReader will read lines using an 80-character buffer, so every string returned will use a char array of at least 80 characters. If you're reading lots of very short lines (single words) the difference in terms of memory consumption just through the use of the odd-looking
line = new String(line);
can be very significant.
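As a minimal sketch of that idiom in context, the hypothetical helper below reads lines with the JDK's BufferedReader and stores a copy of each one. The class and method names are made up for illustration, and whether the copy actually saves memory depends on the JVM in use; on implementations where readLine() already returns compact strings it changes nothing.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only.
class CompactLines {
    // Reads every line and keeps a copy made with the String(String) constructor.
    static List<String> readCompact(BufferedReader reader) throws IOException {
        List<String> lines = new ArrayList<>();
        String line;
        while ((line = reader.readLine()) != null) {
            // The copy idiom from the answer; the memory benefit is implementation specific.
            lines.add(new String(line));
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        BufferedReader reader = new BufferedReader(new StringReader("a\nbb\nccc\n"));
        System.out.println(readCompact(reader)); // [a, bb, ccc]
    }
}
```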
Does that help?
Answered by Pete Kirkham
I know that line 2 will still point to "Monday" and have a new String object with the offset and count set to 0,3.
That is currently true of the Sun JRE implementation. I seem to recall that it was not true of the Sun implementation in the past, and it is not true of other implementations of the JVM. Do not rely on behaviour which is not specified. GNU Classpath might copy the array (I can't remember offhand what ratio it uses to decide when to copy, but it does copy if the copy is a small enough fraction of the original, which turned one nice O(N) algorithm into O(N^2)).
The line 4 will create a new String "Mon" in string pool and point to it.
No, it creates a new string object in the heap, subject to the same garbage collection rules as any other object. Whether or not it shares the same underlying character array is implementation dependent. Do not rely on behaviour which is not specified.
The String(String) constructor says:
Initializes a newly created String object so that it represents the same sequence of characters as the argument; in other words, the newly created string is a copy of the argument string.
The String(char[]) constructor says:
Allocates a new String so that it represents the sequence of characters currently contained in the character array argument. The contents of the character array are copied; subsequent modification of the character array does not affect the newly created string.
Following good OO principles, no method of String actually requires that it is implemented using a character array, so no part of the specification of String requires operations on a character array. Those operations which take an array as input specify that the contents of the array are copied to whatever internal storage is used in the String. A string could use UTF-8 or LZ compression internally and still conform to the API.
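The documented copy semantics of String(char[]) can be checked directly. This is a minimal sketch; the class and method names are made up for illustration.

```java
// Illustrative class; only String(char[]) itself is JDK API.
class CharArrayCopyDemo {
    // Builds a string from the array, then mutates the array afterwards.
    static String buildThenMutate(char[] chars) {
        String s = new String(chars); // contents are copied, per the documented contract
        chars[0] = 'S';               // later modification of the caller's array
        return s;
    }

    public static void main(String[] args) {
        char[] chars = {'M', 'o', 'n'};
        String s = buildThenMutate(chars);
        System.out.println(s);                 // Mon
        System.out.println(new String(chars)); // Son
    }
}
```

The string keeps its original characters because the specification requires a copy, regardless of how the implementation stores them internally.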
However, if your JVM doesn't make the small-ratio substring optimisation, then there's a chance that it does copy only the relevant portion when you use new String(String), so it's a case of trying it and seeing if it improves the memory use. Not everything which affects Java runtimes is defined by Java.
To obtain a string in the string pool which is equal to a given string, use the intern() method. This will either retrieve a string from the pool if one with that value has already been interned, or create a new string and put it in the pool. Note that pooled strings have different (again implementation dependent) garbage collection behaviour.
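A short sketch of intern() in use. The class and method names are illustrative; the only assumptions are the language guarantees that string literals are interned and that new always creates a distinct object.

```java
// Illustrative class; only String.intern() itself is JDK API.
class InternDemo {
    // Returns the pooled string equal to s.
    static String pooled(String s) {
        return s.intern();
    }

    public static void main(String[] args) {
        String literal = "Mon";            // string literals are interned
        String copy = new String(literal); // a distinct object on the heap
        System.out.println(copy.equals(literal));    // true: same characters
        System.out.println(copy == literal);         // false: different objects
        System.out.println(pooled(copy) == literal); // true: same pooled object
    }
}
```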
Answered by Boann
Note: As of Java 7 update 6 in Sun/Oracle's Java, it is no longer true that a String created by String.substring shares the parent's char array. It was decided that this optimization was rarely beneficial, and did not justify the cost and complexity of the offset and count fields.
Some links:
- Rationale: http://mail.openjdk.java.net/pipermail/core-libs-dev/2012-May/010257.html
- Formal bug report: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6924259
- String.java diff: http://hg.openjdk.java.net/jdk7u/jdk7u/jdk/diff/e1c679a00712/src/share/classes/java/lang/String.java
Answered by Warrior
At line 5, s3 is "Mon".
Answered by amitkumar12788
“Substring creates a new object out of source string by taking a portion of original string”.
Until Java 1.7, substring holds a reference to the original character array, which means that even a substring only 5 characters long can prevent a 1GB character array from being garbage collected, because it holds a strong reference to it.
This issue is fixed in Java 1.7, where the original character array is no longer referenced, but that change also made creating a substring a bit more costly in terms of time. Earlier it was O(1); on Java 7 it can be O(n) in the worst case.
Answered by Chei
In Sun's implementation String objects have a private final char value[] field. When you create a new String by calling substring(), no new char array is created; the new instance uses the value of the original object. This is the case in lines 2 and 5: the new String objects will use the char array of s.
The constructor String(String) creates a new char array in case of the string length being less than the total length of the char array value. So the String created in line 4 will use a new char array.
You should have a look at the source code of the constructor public String(String original), it's really simple.
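The rule described in this answer can be sketched as a standalone helper. This is a hypothetical model written from the description above, not the JDK source; the parameter names value, offset and count mirror the fields mentioned in this thread, and the class and method names are made up.

```java
import java.util.Arrays;

// Hypothetical model of the copy rule described above; NOT the real constructor.
class CopyConstructorSketch {
    // Returns the array that a copy of the string (value, offset, count) would use.
    static char[] arrayForCopy(char[] value, int offset, int count) {
        if (value.length > count) {
            // The string uses only part of the array: copy just that window.
            return Arrays.copyOfRange(value, offset, offset + count);
        }
        // The string already uses the whole array: reusing it wastes nothing.
        return value;
    }

    public static void main(String[] args) {
        char[] monday = "Monday".toCharArray();
        char[] trimmed = arrayForCopy(monday, 0, 3);
        System.out.println(new String(trimmed));                    // Mon
        System.out.println(trimmed.length);                         // 3
        System.out.println(arrayForCopy(monday, 0, 6) == monday);   // true
    }
}
```

Under this rule a copy of a short substring of a long string holds only the characters it needs, which is what lets the large parent array be collected.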
Answered by Tobias
Read this: http://java.sun.com/j2se/1.4.2/docs/api/java/lang/String.html
"Returns a new string..."


