Notice: this page reproduces a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/2909848/

Time: 2020-10-29 23:22:53  Source: igfitidea

How does Java implement the flyweight pattern for strings under the hood?

java, design-patterns, flyweight-pattern

Asked by Dan

If you have two instances of a String, and they are equal, in Java they will share the same memory. How is this implemented under the hood?

EDIT: My application uses a large number of String objects, many of which are identical. What is the best way to make use of the Java String constant pool, so as to avoid creating a custom flyweight implementation?

Accepted answer by meriton

Look at the source code of java.lang.String (the source for the entire Java API is part of the JDK).

To summarize: A String wraps a subsequence of a char[]. That backing char[] is never modified. This is accomplished by neither leaking nor capturing this char[] outside the String class. However, several Strings can share the same char[] (see the implementation of String.substring).

There is also the mechanism of interning, as explained in the other answers.
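
As a rough illustration of the "neither leaking nor capturing" point, the sketch below checks that mutating an array passed to the constructor, or an array returned by toCharArray(), leaves the String unchanged. The class and method names are made up for this example, and it only exercises public API behavior; it says nothing about how a particular JDK version lays out the backing storage.

```java
public class DefensiveCopyDemo {

    // The constructor copies the caller's array instead of capturing it.
    static String afterMutatingSource() {
        char[] source = {'a', 'b', 'c'};
        String s = new String(source);
        source[0] = 'X'; // changes only the caller's array
        return s;
    }

    // toCharArray() hands out a copy instead of leaking the internal storage.
    static String afterMutatingCopy() {
        String s = "abc";
        char[] copy = s.toCharArray();
        copy[0] = 'X'; // changes only the returned copy
        return s;
    }

    public static void main(String[] args) {
        System.out.println(afterMutatingSource()); // abc
        System.out.println(afterMutatingCopy());   // abc
    }
}
```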

Answer by matt b

"If you have two instances of a String, and they are equal, in Java they will share the same memory"

This is actually not 100% true.

This blog post is a decent explanation of why this is so, and what the String constant pool is.

Answer by Bill the Lizard

String literals are interned in Java, so there's really only one String object with multiple references (when they are equal, which is not always the case). See the java.net article "All about intern()" for more details.

There's also a good example/explanation in section 3.10.5, "String Literals", of the JLS that talks about when Strings are interned and when they'll be distinct.
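
A minimal sketch in the spirit of that JLS section (the class and method names here are hypothetical): compile-time constant expressions end up as the same interned object, while a concatenation evaluated at run time produces a distinct String unless it is interned explicitly.

```java
public class LiteralInterningDemo {

    // Two occurrences of the same literal refer to one interned object.
    static boolean sameLiteral() {
        String hello = "Hello";
        return hello == "Hello";
    }

    // "Hel" + "lo" is a constant expression, folded at compile time.
    static boolean constantConcat() {
        return "Hello" == ("Hel" + "lo");
    }

    // lo is not a constant variable, so the concatenation happens at run time.
    static boolean runtimeConcat() {
        String lo = "lo";
        return "Hello" == ("Hel" + lo);
    }

    // Interning the run-time result yields the pooled instance again.
    static boolean runtimeConcatInterned() {
        String lo = "lo";
        return "Hello" == ("Hel" + lo).intern();
    }

    public static void main(String[] args) {
        System.out.println(sameLiteral());           // true
        System.out.println(constantConcat());        // true
        System.out.println(runtimeConcat());         // false
        System.out.println(runtimeConcatInterned()); // true
    }
}
```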

Answer by cletus

That's not necessarily true. Example:

String s1 = "hello";
String s2 = "hello";
System.out.println(s1 == s2); // true

but:

String s1 = new String("hello");
String s2 = new String("hello");
System.out.println(s1 == s2); // false

Now the second form is discouraged. Some (including me) think that String shouldn't even have a public constructor. A better version of the above would be:

String s1 = new String("hello").intern();
String s2 = new String("hello").intern();
System.out.println(s1 == s2); // true

Obviously you don't need to do this for a constant String. It's illustrative.

The important point about this is that if you're passed a String or get one from a function, you can't rely on the String being canonical. A canonical Object satisfies this equality:

a.equals(b) == b.equals(a) == (a == b)

for non-null instances a, b of a given Class.
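
The equality above is shorthand rather than a literal Java expression. One way to read it, as a hedged sketch with a hypothetical helper name, is "equals is symmetric and agrees with reference identity":

```java
public class CanonicalCheckDemo {

    // Hypothetical helper: true when equals() is symmetric for the pair
    // and gives the same answer as the identity comparison a == b.
    static boolean behavesCanonically(Object a, Object b) {
        boolean symmetric = a.equals(b) == b.equals(a);
        boolean agreesWithIdentity = a.equals(b) == (a == b);
        return symmetric && agreesWithIdentity;
    }

    public static void main(String[] args) {
        String s1 = new String("hello");
        String s2 = new String("hello");

        // Equal contents, different objects: not canonical.
        System.out.println(behavesCanonically(s1, s2)); // false

        // After interning, equal contents imply the same object.
        System.out.println(behavesCanonically(s1.intern(), s2.intern())); // true

        // Unequal contents and different objects are consistent as well.
        System.out.println(behavesCanonically("hello", "world")); // true
    }
}
```

Code that receives Strings from callers cannot assume the second case holds, so equals() remains the safe comparison.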

Answer by Yishai

To answer your edited question, Sun JVMs have a -XX:+StringCache option, which in my observation can significantly reduce the memory footprint of a String-heavy application.

Otherwise, you have the option of interning your Strings, but I would be careful about that. Strings that are very large and no longer referenced will still use memory for the life of the JVM.

Edit (in response to comment): I first found out about the StringCache option from here:

-XX:+StringCache Enables caching of commonly allocated strings.

Tom Hawtin describes some type of caching to improve some benchmarks. My observation when I enabled it for IDEA was that the memory footprint (after a full garbage collection) went way down compared with not having it. It is not a documented parameter, and may indeed just be about optimizing for some benchmarks. My observation is that it helped, but I wouldn't build an important system based on it.

Answer by Juha Syrjälä

Two things to be careful about:

  1. Do not use the new String("abc") constructor; just use the literal "abc".
  2. Learn to use the intern() method of the String class, especially when concatenating strings or when converting a char array, byte array, etc. to a String.

intern() always returns a string from the pool.
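
To make the effect concrete, here is a small sketch (the class, method names, and sample values are all hypothetical) that builds many equal Strings from char arrays and counts how many distinct objects result, with and without intern(). It counts object identities only; it does not measure actual memory use.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

public class InternDedupDemo {

    // Hypothetical sample data standing in for values read at run time.
    static final String[] VALUES = {"red", "green", "blue"};

    // Builds count strings from char arrays, optionally interning each one.
    static List<String> load(int count, boolean intern) {
        List<String> result = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            // new String(char[]) allocates a fresh object every time.
            String s = new String(VALUES[i % VALUES.length].toCharArray());
            result.add(intern ? s.intern() : s);
        }
        return result;
    }

    // Counts distinct objects by reference identity, not by equals().
    static int distinctInstances(List<String> strings) {
        Set<String> identities = Collections.newSetFromMap(new IdentityHashMap<>());
        identities.addAll(strings);
        return identities.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctInstances(load(1000, false))); // 1000
        System.out.println(distinctInstances(load(1000, true)));  // 3
    }
}
```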

Answer by Jason

If your identical Strings come from a fixed set of possible values, then a Type-Safe Enumeration is what you want here. Not only will it reduce your String count, but it will make for a more solid application. Your whole app will know this String has semantics attached to it, maybe even some convenience methods.

My favorite optimizations are always the ones that can be defended as making the code better, not just faster. And 9 times out of 10, replacing a String with a concrete type leads to more correct and self-documenting code.
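
A minimal sketch of the type-safe enumeration idea, assuming a made-up set of order statuses (OrderStatus, parse, and isFinal are illustrative names, not anything from the question): each raw String is converted once at the boundary, after which every equal value is the same enum constant.

```java
import java.util.Locale;

public class EnumInsteadOfStringDemo {

    // Hypothetical fixed set of values that would otherwise be repeated Strings.
    enum OrderStatus {
        NEW, SHIPPED, DELIVERED;

        // A convenience method that a plain String could not carry.
        boolean isFinal() {
            return this == DELIVERED;
        }

        // Converts raw input to the shared constant; throws
        // IllegalArgumentException for values outside the fixed set.
        static OrderStatus parse(String raw) {
            return valueOf(raw.trim().toUpperCase(Locale.ROOT));
        }
    }

    public static void main(String[] args) {
        OrderStatus a = OrderStatus.parse("shipped");
        OrderStatus b = OrderStatus.parse(" SHIPPED ");
        System.out.println(a == b);      // true
        System.out.println(a.isFinal()); // false
    }
}
```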
