java - How can the garbage collector quickly know which objects no longer have references to them?
Original question: http://stackoverflow.com/questions/10587868/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverFlow.
Asked by Hymanson Tale
I understand that in Java, if an object no longer has any references to it, the garbage collector will reclaim it some time later.
But how does the garbage collector know whether an object still has references to it?
Is the garbage collector using some kind of hashmap or table?
Edit:
Please note that I am not asking how GC works in general. Really, I am not asking that.
I am asking specifically how the GC knows, efficiently, which objects are live and which are dead.
That's why I ask in my question: does the GC maintain some kind of hashmap or set, and constantly update the number of references an object has?
Accepted answer by NPE
A typical modern JVM uses several different types of garbage collectors.
One type that's often used for objects that have been around for a while is called Mark-and-Sweep. It basically involves starting from known "live" objects (the so-called garbage collection roots), following all chains of object references, and marking every reachable object as "live".
Once this is done, the sweep stage can reclaim those objects that haven't been marked as "live".
For this process to work, the JVM has to know the location in memory of every object reference. This is a necessary condition for a garbage collector to be precise (which Java's is).
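The mark and sweep stages described above can be illustrated with a toy model. This is only a minimal sketch, not how a real JVM is implemented: the Obj class, the explicit heap list and the collect method are hypothetical names made up for the illustration. It also shows why no per-object reference count is needed: the two objects c and d reference each other, yet both are reclaimed because neither is reachable from a root.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Toy model only: Obj, the heap list and collect() are illustrative names,
// not part of any real JVM API.
final class MarkSweepSketch {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();   // outgoing references
        boolean marked;                              // the "live" mark
        Obj(String name) { this.name = name; }
    }

    // Mark stage: follow every chain of references starting from obj.
    static void mark(Obj obj) {
        if (obj.marked) return;          // already visited (also handles cycles)
        obj.marked = true;
        for (Obj child : obj.refs) mark(child);
    }

    // Runs mark then sweep; returns the names of the reclaimed objects.
    static List<String> collect(List<Obj> heap, List<Obj> roots) {
        for (Obj root : roots) mark(root);

        List<String> reclaimed = new ArrayList<>();
        Iterator<Obj> it = heap.iterator();
        while (it.hasNext()) {           // sweep stage: walk the whole heap
            Obj o = it.next();
            if (o.marked) {
                o.marked = false;        // reset the mark for the next cycle
            } else {
                reclaimed.add(o.name);
                it.remove();
            }
        }
        return reclaimed;
    }

    static List<String> demo() {
        Obj a = new Obj("a"), b = new Obj("b"), c = new Obj("c"), d = new Obj("d");
        a.refs.add(b);                   // root a -> b
        c.refs.add(d);                   // c <-> d reference each other,
        d.refs.add(c);                   // but nothing live points to them
        List<Obj> heap = new ArrayList<>(Arrays.asList(a, b, c, d));
        return collect(heap, Arrays.asList(a));
    }

    public static void main(String[] args) {
        System.out.println(demo());      // prints [c, d]
    }
}
```

Note that the sketch already knows where every reference lives (the refs list), which is the toy equivalent of the precision requirement mentioned in the answer.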
Answer by Mark Booth
Java has a variety of different garbage collection strategies, but they all basically work by keeping track of which objects are reachable from known active objects.
A great summary can be found in the article How Garbage Collection works in Java, but for the real low-down, you should look at Tuning Garbage Collection with the 5.0 Java[tm] Virtual Machine:
An object is considered garbage when it can no longer be reached from any pointer in the running program. The most straightforward garbage collection algorithms simply iterate over every reachable object. Any objects left over are then considered garbage. The time this approach takes is proportional to the number of live objects, which is prohibitive for large applications maintaining lots of live data.
Beginning with the J2SE Platform version 1.2, the virtual machine incorporated a number of different garbage collection algorithms that are combined using generational collection. While naive garbage collection examines every live object in the heap, generational collection exploits several empirically observed properties of most applications to avoid extra work.
The most important of these observed properties is infant mortality. ...
I.e. many objects, like iterators, only live for a very short time, so younger objects are more likely to be eligible for garbage collection than much older objects.
For more up to date tuning guides, take a look at:
- Java SE 6 HotSpot[tm] Virtual Machine Garbage Collection Tuning
- Java Platform, Standard Edition HotSpot Virtual Machine Garbage Collection Tuning Guide (Java SE 8)
Incidentally, be careful about trying to second-guess your garbage collection strategy. I've known many a program's performance to be trashed by overzealous use of System.gc() or inappropriate -XX options.
Answer by AlexR
The GC will find out that an object can be removed as soon as it is able to. You are not expected to manage this process.
But you can ask the GC very politely to run using System.gc(). It is just a hint to the system. The GC does not have to run at that moment, it does not have to remove your specific object, etc. Because the GC is the BIG boss and we (Java programmers) are just its slaves... :(
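To see that System.gc() is only a request, here is a small sketch using java.lang.ref.WeakReference to observe an object without keeping it alive. The class and method names are made up for the illustration. The only thing the sketch can rely on is that an object that is still strongly referenced (here via a static field) is never reclaimed; whether the unreferenced object is gone after the call depends on the JVM, so the second line of output may differ between runs and collectors.

```java
import java.lang.ref.WeakReference;

// Illustrative names only; the observable behaviour of the second check
// is JVM-dependent because System.gc() is just a hint.
final class GcHintSketch {
    static Object held;   // a static field keeps its object strongly reachable

    // An object that is still referenced must survive any collection.
    static boolean survivesWhileReferenced() {
        held = new Object();
        WeakReference<Object> weak = new WeakReference<>(held);
        System.gc();                       // a request, not a command
        return weak.get() == held;         // always true: still reachable
    }

    public static void main(String[] args) {
        System.out.println("referenced object survived: " + survivesWhileReferenced());

        // Nothing else refers to this object once the constructor returns.
        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc();
        // Often true on HotSpot, but not guaranteed by the specification.
        System.out.println("unreferenced object cleared: " + (weak.get() == null));
    }
}
```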
Answer by Eugene
There is no efficient way - it will still require traversal of the heap, but there is a hacky way: divide the heap into smaller pieces (so there is no need to scan the entire heap). This is the reason we have generational garbage collectors, so that the scanning takes less time.
This is relatively "easy" to answer when your entire application is stopped and you can analyze the graph of objects. It all starts from GC roots (I'll let you find the documentation for what these are), but basically these are "roots" that are not collected by the GC.
From here a scan starts that analyzes the "live" objects: objects that have a direct (or transitive) connection to these roots and are thus not reclaimable. In graph theory this is known as "coloring/traversing" your graph using 3 colors: black, grey and white. White means an object is not (yet known to be) connected to the roots, grey means its sub-graph is not yet traversed, black means traversed and connected to the roots. So basically, to know what exactly is dead/alive right now, you start with the whole heap colored white and color everything reachable from the roots black. Everything that is still white at the end is garbage. It is interesting that "garbage" is really identified by a GC by knowing what is actually alive. There are some drawings to visualize this here, for example.
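The three-color traversal can be written down in a few lines. The following is a rough sketch under simplifying assumptions: the heap is modelled as a map from an object name to the names it references, the application is stopped while marking runs, and TriColorSketch, mark and sampleHeap are hypothetical names. A real collector works on raw memory, not on maps.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model: the heap is a map "object name -> names it references".
final class TriColorSketch {
    enum Color { WHITE, GREY, BLACK }

    static Map<String, Color> mark(Map<String, List<String>> heap, List<String> roots) {
        Map<String, Color> color = new TreeMap<>();
        for (String obj : heap.keySet()) color.put(obj, Color.WHITE);  // all white at first

        Deque<String> grey = new ArrayDeque<>();   // work list of grey objects
        for (String root : roots) {
            if (color.get(root) == Color.WHITE) {
                color.put(root, Color.GREY);
                grey.push(root);
            }
        }
        while (!grey.isEmpty()) {
            String obj = grey.pop();
            for (String child : heap.get(obj)) {
                if (color.get(child) == Color.WHITE) {   // first time we reach it
                    color.put(child, Color.GREY);
                    grey.push(child);
                }
            }
            color.put(obj, Color.BLACK);   // all of its references have been looked at
        }
        return color;                      // whatever is still WHITE is garbage
    }

    static Map<String, List<String>> sampleHeap() {
        Map<String, List<String>> heap = new HashMap<>();
        heap.put("root", Arrays.asList("a", "b"));
        heap.put("a", Arrays.asList("b"));
        heap.put("b", Collections.<String>emptyList());
        heap.put("x", Arrays.asList("y"));   // x and y only reference each other
        heap.put("y", Arrays.asList("x"));
        return heap;
    }

    public static void main(String[] args) {
        // prints {a=BLACK, b=BLACK, root=BLACK, x=WHITE, y=WHITE}
        System.out.println(mark(sampleHeap(), Arrays.asList("root")));
    }
}
```

The work done is proportional to the number of reachable objects and references: the white objects x and y are never visited at all, which is the sense in which garbage is found by knowing what is alive.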
But this is the simple scenario: when your application is entirely stopped (for seconds at times) and you can scan the heap. This is called an STW (stop-the-world) event, and people usually hate these. This is what parallel collectors do: stop everything, do whatever the GC has to do (including finding garbage), and let the application threads start again after that.
What happens when your app is running while you are scanning the heap, i.e. concurrently? G1/CMS do this. Think about it: how can you reason about a leaf of the graph being alive or not when your app can change that leaf via a different thread?
Shenandoah, for example, solves this by "intercepting" changes to the graph. While running concurrently with your application, it will catch all the changes and insert them into thread-local special queues, called SATB queues (snapshot-at-the-beginning queues), instead of altering the heap directly. When that is finished, a very short STW event will occur and these queues will be drained. Still under the STW, what that drain has "caused" is computed, i.e. extra coloring of the graph. This is far simplified, just FYI. G1 and CMS do it differently AFAIK.
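The problem with concurrent changes, and the queue-based fix, can be simulated in a single thread. This is a heavily simplified sketch and not Shenandoah's actual code: it assumes the intercepting barrier records the previous target of every overwritten reference, it interleaves the "collector" and "application" steps by hand instead of using real threads, and all names (SatbSketch, write, drain, run) are invented for the illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;
import java.util.TreeSet;

// Single-threaded simulation; names and structure are illustrative only.
final class SatbSketch {
    static final class Node {
        final String name;
        Node ref;          // one outgoing reference keeps the sketch small
        boolean marked;
        Node(String name) { this.name = name; }
    }

    private final Deque<Node> satbQueue = new ArrayDeque<>();
    private final boolean barrierEnabled;

    SatbSketch(boolean barrierEnabled) { this.barrierEnabled = barrierEnabled; }

    // Application write during marking: remember the OLD target before overwriting it.
    void write(Node holder, Node newTarget) {
        if (barrierEnabled && holder.ref != null) {
            satbQueue.add(holder.ref);
        }
        holder.ref = newTarget;
    }

    // Collector: mark everything reachable from start.
    static void markFrom(Node start) {
        for (Node n = start; n != null && !n.marked; n = n.ref) {
            n.marked = true;
        }
    }

    // Final short pause: drain the queue and do the extra marking it causes.
    void drain() {
        while (!satbQueue.isEmpty()) markFrom(satbQueue.poll());
    }

    // Returns the names of the objects considered live at the end of marking.
    static Set<String> run(boolean barrierEnabled) {
        SatbSketch gc = new SatbSketch(barrierEnabled);
        Node a = new Node("A"), b = new Node("B"), c = new Node("C");
        b.ref = c;                 // at the start: roots A and B, and B -> C

        markFrom(a);               // collector has finished with A
        gc.write(a, c);            // application: A now points to C ...
        gc.write(b, null);         // ... and B drops its reference to C
        markFrom(b);               // collector continues with B, never sees C
        gc.drain();                // short pause at the end of marking

        Set<String> live = new TreeSet<>();
        for (Node n : new Node[] {a, b, c}) if (n.marked) live.add(n.name);
        return live;
    }

    public static void main(String[] args) {
        System.out.println("with barrier:    " + run(true));    // [A, B, C]
        System.out.println("without barrier: " + run(false));   // [A, B]
    }
}
```

Without the barrier, C is still referenced by A at the end but was never marked, so a sweep would free a live object. With the barrier, the overwritten reference lands in the queue and the drain marks C.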
So in theory, the process is not really that complicated, but implementing it concurrently is the most challenging part.