Is a Java string really immutable?
Note: this page is a translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original address and authors, and attribute it to the original authors (not me): StackOverFlow
Original question: http://stackoverflow.com/questions/20945049/
Asked by Darshan Patel
We all know that String is immutable in Java, but check the following code:
String s1 = "Hello World";
String s2 = "Hello World";
String s3 = s1.substring(6);
System.out.println(s1); // Hello World
System.out.println(s2); // Hello World
System.out.println(s3); // World
Field field = String.class.getDeclaredField("value");
field.setAccessible(true);
char[] value = (char[])field.get(s1);
value[6] = 'J';
value[7] = 'a';
value[8] = 'v';
value[9] = 'a';
value[10] = '!';
System.out.println(s1); // Hello Java!
System.out.println(s2); // Hello Java!
System.out.println(s3); // World
Why does this program operate like this? And why is the value of s1 and s2 changed, but not s3?
Accepted answer by haraldK
String is immutable* but this only means you cannot change it using its public API.
What you are doing here is circumventing the normal API, using reflection. The same way, you can change the values of enums, change the lookup table used in Integer autoboxing etc.
Now, the reason s1 and s2 change value is that they both refer to the same interned string. The compiler does this (as mentioned by other answers).
The reason s3 does not was actually a bit surprising to me, as I thought it would share the value array (it did in earlier versions of Java, before Java 7u6). However, looking at the source code of String, we can see that the value character array for a substring is actually copied (using Arrays.copyOfRange(..)). This is why it goes unchanged.
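To make the copying behaviour concrete without touching String internals, here is a minimal sketch using only Arrays.copyOfRange on a plain char array. The class name CopyOfRangeDemo and the helper slice are made up for illustration; this is not the actual String source, just the same copying idea.

```java
import java.util.Arrays;

final class CopyOfRangeDemo {
    // Hypothetical helper: copies chars [from, to) into a fresh array,
    // the way a copying substring implementation would.
    static char[] slice(char[] source, int from, int to) {
        return Arrays.copyOfRange(source, from, to);
    }

    public static void main(String[] args) {
        char[] original = "Hello World".toCharArray();
        char[] tail = slice(original, 6, original.length);

        // Overwrite the original array in place.
        original[6] = 'J';

        System.out.println(new String(original)); // Hello Jorld
        System.out.println(new String(tail));     // World
    }
}
```

Because the tail lives in its own array, later writes to the original array cannot reach it, which matches what happens to s3 in the question.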
You can install a SecurityManager to prevent malicious code from doing such things. But keep in mind that some libraries depend on using these kinds of reflection tricks (typically ORM tools, AOP libraries etc).
*) I initially wrote that Strings aren't really immutable, just "effective immutable". This might be misleading in the current implementation of String, where the value array is indeed marked private final. It's still worth noting, though, that there is no way to declare an array in Java as immutable, so care must be taken not to expose it outside its class, even with the proper access modifiers.
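The point about arrays can be tried on a class of your own, which avoids poking at java.lang. The sketch below assumes a made-up class Label that keeps a private final char[] the same way String does; the names Label and overwriteFirstChar are hypothetical. The final modifier only fixes which array the field refers to, not the array's contents.

```java
import java.lang.reflect.Field;

final class ReflectionArrayDemo {
    // Hypothetical class, not part of the JDK: it guards its array
    // with private final, much like String does.
    static final class Label {
        private final char[] value;

        Label(String text) {
            this.value = text.toCharArray();
        }

        @Override
        public String toString() {
            return new String(value);
        }
    }

    // Reaches the private array through reflection and overwrites one element.
    static void overwriteFirstChar(Label label, char replacement) {
        try {
            Field field = Label.class.getDeclaredField("value");
            field.setAccessible(true);
            char[] chars = (char[]) field.get(label);
            chars[0] = replacement; // the reference is final, the contents are not
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Label label = new Label("cat");
        overwriteFirstChar(label, 'b');
        System.out.println(label); // bat
    }
}
```

The public API of Label offers no way to change it, yet the object's visible value changes once the array leaks out, whether through reflection or through a getter that returns the array without copying it.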
As this topic seems overwhelmingly popular, here's some suggested further reading: Heinz Kabutz's Reflection Madness talk from JavaZone 2009, which covers a lot of the issues in the OP, along with other reflection... well... madness.
It covers why this is sometimes useful. And why, most of the time, you should avoid it. :-)
Answer by Ankur
You are using reflection to access the "implementation details" of the String object. Immutability is a feature of the public interface of an object.
Answer by Bohemian
You are using reflection to circumvent the immutability of String - it's a form of "attack".
There are lots of examples you can create like this (e.g. you can even instantiate a Void object too), but it doesn't mean that String is not "immutable".
There are use cases where this type of code may be used to your advantage and be "good coding", such as clearing passwords from memory at the earliest possible moment (before GC).
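For the password case, a commonly used alternative that needs no reflection is to keep the secret in a char[] rather than a String and overwrite it after use. This is a different technique from the one described above, sketched here under the assumption that the caller owns the array; PasswordWipeDemo and authenticate are placeholder names, and the check itself is a stand-in for a real verification.

```java
import java.util.Arrays;

final class PasswordWipeDemo {
    // Placeholder check for illustration only; a real system would verify a hash.
    static boolean authenticate(char[] password) {
        return password.length > 0;
    }

    // Uses the password, then overwrites every element of the caller's array.
    static boolean authenticateAndWipe(char[] password) {
        try {
            return authenticate(password);
        } finally {
            Arrays.fill(password, '\0');
        }
    }

    public static void main(String[] args) {
        char[] password = {'s', '3', 'c', 'r', 'e', 't'};
        boolean ok = authenticateAndWipe(password);

        System.out.println(ok);                                   // true
        System.out.println(Arrays.equals(password, new char[6])); // true
    }
}
```

Since the array is mutable through ordinary code, the wipe works under any security settings, whereas the reflective approach on String depends on the JVM allowing access to the private field.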
Depending on the security manager, you may not be able to execute your code.
Answer by Krease
String immutability is from the interface perspective. You are using reflection to bypass the interface and directly modify the internals of the String instances.
s1 and s2 are both changed because they are both assigned to the same "intern" String instance. You can find out a bit more about that part from this article about string equality and interning. You might be surprised to find out that in your sample code, s1 == s2 returns true!
Answer by Hauke Ingmar Schmidt
Visibility modifiers and final (i.e. immutability) are not a measure against malicious code in Java; they are merely tools to protect against mistakes and to make the code more maintainable (one of the big selling points of the system). That is why you can access internal implementation details like the backing char array for Strings via reflection.
The second effect you see is that all Strings change while it looks like you only change s1. It is a certain property of Java String literals that they are automatically interned, i.e. cached. Two String literals with the same value will actually be the same object. When you create a String with new it will not be interned automatically and you will not see this effect.
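The interning behaviour described above can be checked with plain reference comparisons, no reflection needed. A minimal sketch; the class name InternDemo and the helper isSameObject are made up for the example.

```java
final class InternDemo {
    // Hypothetical helper: true only when both references point to the very same object.
    static boolean isSameObject(String x, String y) {
        return x == y;
    }

    public static void main(String[] args) {
        String a = "Hello World";             // literal, interned
        String b = "Hello World";             // same literal, same pooled object
        String c = new String("Hello World"); // explicit new, a distinct object

        System.out.println(isSameObject(a, b));          // true
        System.out.println(isSameObject(a, c));          // false
        System.out.println(a.equals(c));                 // true
        System.out.println(isSameObject(a, c.intern())); // true
    }
}
```

This is why changing the array behind s1 also changes what s2 prints: there is only one object, reached through two variables.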
#substring until recently (Java 7u6) worked in a similar way, which would have explained the behaviour in the original version of your question. It didn't create a new backing char array but reused the one from the original String; it just created a new String object that used an offset and a length to present only a part of that array. This generally worked as Strings are immutable - unless you circumvent that. This property of #substring also meant that the whole original String couldn't be garbage collected when a shorter substring created from it still existed.
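The older offset-and-length design can be imitated with a small view class. This is only a sketch of the idea, not the historical String source; SharedSlice and its field names are invented for illustration.

```java
final class SubstringSharingDemo {
    // Hypothetical view type: keeps a reference to the caller's array
    // plus an offset and a count, instead of copying the characters.
    static final class SharedSlice {
        private final char[] backing;
        private final int offset;
        private final int count;

        SharedSlice(char[] backing, int offset, int count) {
            this.backing = backing;
            this.offset = offset;
            this.count = count;
        }

        @Override
        public String toString() {
            return new String(backing, offset, count);
        }
    }

    public static void main(String[] args) {
        char[] text = "Hello World".toCharArray();
        SharedSlice tail = new SharedSlice(text, 6, 5);
        System.out.println(tail); // World

        // A write to the shared array shows through the slice.
        text[6] = 'J';
        System.out.println(tail); // Jorld
    }
}
```

The slice also keeps the whole backing array reachable for as long as the slice itself is reachable, which is the garbage collection cost mentioned above.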
As of current Java and your current version of the question, there is no strange behaviour of #substring.
Answer by Zaheer Ahmed
In Java, if two String variables are initialized to the same literal, the same reference is assigned to both variables:
String test1 = "Hello World";
String test2 = "Hello World";
System.out.println(test1 == test2); // true
That is the reason the comparison returns true. The third string is created using substring(), which makes a new string instead of pointing to the same one.
When you access a string using reflection, you get the actual pointer:
Field field = String.class.getDeclaredField("value");
field.setAccessible(true);
So a change made through this pointer changes every string that holds it, but as s3 is created as a new string by substring(), it does not change.
Answer by manikanta
Which version of Java are you using? From Java 1.7.0_06, Oracle has changed the internal representation of String, especially the substring.
Quoting from Oracle Tunes Java's Internal String Representation:
In the new paradigm, the String offset and count fields have been removed, so substrings no longer share the underlying char [] value.
With this change, it may happen without reflection (???).
Answer by AbhijeetMishra
According to the concept of string pooling, String literals with the same value refer to the same object in memory. Therefore s1 and s2, both initialized from the literal “Hello World”, will point towards the same memory location (say M1).
On the other hand, s3 contains “World”, hence it will point to a different memory allocation (say M2).
So now what's happening is that the value of s1 is being changed (by using the char[] value). So the value at the memory location M1, pointed to by both s1 and s2, has been changed.
Hence as a result, memory location M1 has been modified which causes change in the value of s1 and s2.
But the value of location M2 remains unaltered, hence s3 contains the same original value.
Answer by SpacePrez
String is immutable, but through reflection you're allowed to change the String class. You've just redefined the String class as mutable in real-time. You could redefine methods to be public or private or static if you wanted.
Answer by Scott Wisniewski
There are really two questions here:
1. Are strings really immutable?
2. Why is s3 not changed?
To point 1: Except for ROM there is no immutable memory in your computer. Nowadays even ROM is sometimes writable. There is always some code somewhere (whether it's the kernel or native code sidestepping your managed environment) that can write to your memory address. So, in "reality", no, they are not absolutely immutable.
To point 2: This is because substring is probably allocating a new string instance, which is likely copying the array. It is possible to implement substring in such a way that it won't do a copy, but that doesn't mean it does. There are tradeoffs involved.
For example, should holding a reference to reallyLargeString.substring(reallyLargeString.length() - 2) cause a large amount of memory to be held alive, or only a few bytes?
That depends on how substring is implemented. A deep copy will keep less memory alive, but it will run slightly slower. A shallow copy will keep more memory alive, but it will be faster. Using a deep copy can also reduce heap fragmentation, as the string object and its buffer can be allocated in one block, as opposed to 2 separate heap allocations.
In any case, it looks like your JVM chose to use deep copies for substring calls.