C#: Understanding garbage collection in .NET

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverFlow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverFlow original URL: http://stackoverflow.com/questions/17130382/

Time: 2020-08-10 08:40:41  Source: igfitidea

Understanding garbage collection in .NET

Tags: c#, .net, garbage-collection

Asked by Victor Mukherjee

Consider the below code:

using System;

public class Class1
{
    // Counts how many Class1 instances have been finalized.
    public static int c;

    ~Class1()
    {
        c++;
    }
}

public class Class2
{
    public static void Main()
    {
        {
            var c1 = new Class1();
            // c1 = null; // If this line is not commented out, the Console.WriteLine call below prints 1.
        }
        GC.Collect();
        GC.WaitForPendingFinalizers();
        Console.WriteLine(Class1.c); // prints 0
        Console.Read();
    }
}

Now, even though the variable c1 in the Main method is out of scope and not referenced by any other object when GC.Collect() is called, why is it not finalized there?

Accepted answer by Hans Passant

You are being tripped up here and drawing very wrong conclusions because you are using a debugger. You'll need to run your code the way it runs on your user's machine. Switch to the Release build first with Build + Configuration manager, change the "Active solution configuration" combo in the upper left corner to "Release". Next, go into Tools + Options, Debugging, General and untick the "Suppress JIT optimization" option.

Now run your program again and tinker with the source code. Note how the extra braces have no effect at all. And note how setting the variable to null makes no difference at all. It will always print "1". It now works the way you hoped and expected it would work.

Which does leave the task of explaining why it works so differently when you run the Debug build. That requires explaining how the garbage collector discovers local variables and how that's affected by having a debugger present.

First off, the jitter performs two important duties when it compiles the IL for a method into machine code. The first one is very visible in the debugger: you can see the machine code with the Debug + Windows + Disassembly window. The second duty is, however, completely invisible. It also generates a table that describes how the local variables inside the method body are used. That table has an entry for each method argument and local variable, with two addresses: the address where the variable will first store an object reference, and the address of the machine code instruction where that variable is no longer used. It also records whether that variable is stored on the stack frame or in a CPU register.

This table is essential to the garbage collector, it needs to know where to look for object references when it performs a collection. Pretty easy to do when the reference is part of an object on the GC heap. Definitely not easy to do when the object reference is stored in a CPU register. The table says where to look.

The "no longer used" address in the table is very important. It makes the garbage collector very efficient. It can collect an object reference, even if it is used inside a method and that method hasn't finished executing yet. Which is very common, your Main() method for example will only ever stop executing just before your program terminates. Clearly you would not want any object references used inside that Main() method to live for the duration of the program, that would amount to a leak. The jitter can use the table to discover that such a local variable is no longer useful, depending on how far the program has progressed inside that Main() method before it made a call.

An almost magic method that is related to that table is GC.KeepAlive(). It is a very special method; it doesn't generate any code at all. Its only duty is to modify that table. It extends the lifetime of the local variable, preventing the reference it stores from getting garbage collected. The only time you need to use it is to stop the GC from being too eager to collect a reference, which can happen in interop scenarios where a reference is passed to unmanaged code. The garbage collector cannot see such references being used by such code, since it wasn't compiled by the jitter and so doesn't have the table that says where to look for the reference. Passing a delegate object to an unmanaged function like EnumWindows() is the boilerplate example of when you need to use GC.KeepAlive().
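The effect of GC.KeepAlive() on a local variable's lifetime can be observed without any unmanaged code. The following is a minimal sketch, not the EnumWindows() scenario itself: the class name KeepAliveSketch, the method name, and the use of WeakReference as a probe are assumptions made for illustration. Because the local is passed to GC.KeepAlive() after the forced collection, it has to be treated as in use across that collection, regardless of build configuration.

```csharp
using System;

public static class KeepAliveSketch
{
    // Hypothetical helper: reports whether the object was still alive
    // right after a forced collection.
    public static bool SurvivesForcedCollection()
    {
        var payload = new object();
        // A weak reference observes the object without keeping it alive.
        var weak = new WeakReference(payload);

        GC.Collect();
        GC.WaitForPendingFinalizers();
        bool aliveAfterCollect = weak.IsAlive;

        // payload counts as "in use" up to this call, so the collection
        // above was not allowed to reclaim it.
        GC.KeepAlive(payload);
        return aliveAfterCollect;
    }

    public static void Main()
    {
        Console.WriteLine(SurvivesForcedCollection()); // prints True
    }
}
```

Without the GC.KeepAlive() call, the result would depend on the build configuration and on whether a debugger is attached, which is exactly the effect this answer describes.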

So, as you can tell from your sample snippet after running it in the Release build, local variables can get collected early, before the method has finished executing. Even more powerfully, an object can get collected while one of its methods runs if that method no longer refers to this. There is a problem with that: it is very awkward to debug such a method, since you may well put the variable in the Watch window or inspect it, and it would disappear while you are debugging if a GC occurs. That would be very unpleasant, so the jitter is aware of there being a debugger attached. It then modifies the table and alters the "last used" address, changing it from its normal value to the address of the last instruction in the method. That keeps the variable alive as long as the method hasn't returned, which allows you to keep watching it until the method returns.

This now also explains what you saw earlier and why you asked the question. It prints "0" because the GC.Collect call cannot collect the reference. The table says that the variable is in use past the GC.Collect() call, all the way up to the end of the method. It is forced to say so by having the debugger attached and by running the Debug build.
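Since the extended lifetime only lasts until the end of the method that owns the local, one way to see the boundary is to move the allocation into a helper method. This is a rough sketch under stated assumptions: Counted, ScopeSketch, Allocate and Run are hypothetical names, and NoInlining is used so the helper keeps its own stack frame. Once the helper has returned, its locals no longer exist, so the forced collection is expected to finalize the object even in a Debug build, though the runtime makes no hard promise about finalization timing.

```csharp
using System;
using System.Runtime.CompilerServices;

public class Counted
{
    // Number of Counted instances finalized so far.
    public static int Finalized;

    ~Counted() { Finalized++; }
}

public static class ScopeSketch
{
    // NoInlining keeps the local in the helper's own frame,
    // which is gone as soon as the helper returns.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Allocate()
    {
        var c1 = new Counted();
    }

    public static int Run()
    {
        Allocate();
        GC.Collect();
        GC.WaitForPendingFinalizers();
        return Counted.Finalized;
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```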

Setting the variable to null does have an effect now because the GC will inspect the variable and will no longer see a reference. But make sure you don't fall into the trap that many C# programmers have fallen into: actually writing that code was pointless. It makes no difference whatsoever whether or not that statement is present when you run the code in the Release build. In fact, the jitter optimizer will remove that statement since it has no effect whatsoever. So be sure not to write code like that, even though it seemed to have an effect.

One final note about this topic: this is what gets programmers in trouble when they write small programs to do something with an Office app. The debugger usually gets them on the Wrong Path; they want the Office program to exit on demand. The appropriate way to do that is by calling GC.Collect(). But they'll discover that it doesn't work when they debug their app, leading them into never-never land by calling Marshal.ReleaseComObject(). That is manual memory management, and it rarely works properly because they'll easily overlook an invisible interface reference. GC.Collect() actually works, just not when you debug the app.

Answer by R.C

[ Just wanted to add further on the Internals of Finalization process ]

So, you create an object, and when the object is collected, the object's Finalize method should be called. But there is more to finalization than this very simple assumption.

SHORT CONCEPTS::

  1. For objects that do NOT implement a Finalize method, memory is reclaimed immediately, provided of course that they are no longer reachable by application code.

  2. For objects that implement a Finalize method, the concepts of application roots, the finalization queue, and the freachable queue come into play before they can be reclaimed.

  3. Any object is considered garbage if it is NOT reachable by application code.

Assume:: Classes/Objects A, B, D, G, H do NOT implement a Finalize method, and C, E, F, I, J implement a Finalize method.

When an application creates a new object, the new operator allocates the memory from the heap. If the object's type contains a Finalize method, then a pointer to the object is placed on the finalization queue.

Therefore, pointers to objects C, E, F, I, and J get added to the finalization queue.

The finalization queue is an internal data structure controlled by the garbage collector. Each entry in the queue points to an object that should have its Finalize method called before the object's memory can be reclaimed. The figure below shows a heap containing several objects. Some of these objects are reachable from the application's roots, and some are not. When objects C, E, F, I, and J are created, the .NET Framework detects that these objects have Finalize methods, and pointers to these objects are added to the finalization queue.

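[Figure: the managed heap with objects A to J, the application's roots, and the finalization queue holding pointers to C, E, F, I, and J]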

When a GC occurs (1st collection), objects B, E, G, H, I, and J are determined to be garbage, because only A, C, D, and F are still reachable by application code, as depicted by the arrows from the yellow box above.

The garbage collector scans the finalization queue looking for pointers to these objects. When a pointer is found, the pointer is removed from the finalization queue and appended to the freachable queue ("F-reachable").

The freachable queue is another internal data structure controlled by the garbage collector. Each pointer in the freachable queue identifies an object that is ready to have its Finalize method called.

After the collection (1st collection), the managed heap looks something like the figure below. The explanation is given below:
1.) The memory occupied by objects B, G, and H has been reclaimed immediately because these objects did not have a Finalize method that needed to be called.

2.) However, the memory occupied by objects E, I, and J could not be reclaimed because their Finalize methods have not been called yet. Their Finalize methods are called by way of the freachable queue.

3.) A, C, D, and F are still reachable by application code, as depicted by the arrows from the yellow box above, so they will NOT be collected in any case.

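[Figure: the managed heap after the 1st collection, with B, G, and H reclaimed and pointers to E, I, and J moved to the freachable queue]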

There is a special runtime thread dedicated to calling Finalize methods. When the freachable queue is empty (which is usually the case), this thread sleeps. But when entries appear, this thread wakes, removes each entry from the queue, and calls each object's Finalize method. The garbage collector compacts the reclaimable memory, and the special runtime thread empties the freachable queue, executing each object's Finalize method. So this, finally, is when your Finalize method gets executed.
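That finalizers do not run on the thread that triggered the collection can be checked with a small probe. This is a sketch under assumptions: ThreadProbe, FinalizerThreadSketch and their members are hypothetical names, and the allocation sits in a non-inlined helper so that no local keeps the object alive. The finalizer records the managed ID of the thread it runs on, and the caller compares that against its own ID.

```csharp
using System;
using System.Runtime.CompilerServices;

public class ThreadProbe
{
    // Managed ID of the thread that ran the finalizer; stays 0 if it never ran.
    public static int FinalizerThreadId;

    ~ThreadProbe()
    {
        FinalizerThreadId = Environment.CurrentManagedThreadId;
    }
}

public static class FinalizerThreadSketch
{
    // Allocate in a separate, non-inlined method so no local roots the object.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Allocate()
    {
        new ThreadProbe();
    }

    public static bool RanOnCallingThread()
    {
        Allocate();
        GC.Collect();
        GC.WaitForPendingFinalizers(); // blocks until the freachable queue is drained
        return ThreadProbe.FinalizerThreadId == Environment.CurrentManagedThreadId;
    }

    public static void Main()
    {
        Console.WriteLine(RanOnCallingThread()); // prints False
    }
}
```

GC.WaitForPendingFinalizers() is what lets the calling thread wait for the dedicated thread to finish its work before reading the recorded ID.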

The next time the garbage collector is invoked (2nd collection), it sees that the finalized objects are truly garbage, since the application's roots don't point to them and the freachable queue no longer points to them (it is empty as well). Therefore, the memory for the objects E, I, and J is simply reclaimed from the heap. See the figure below and compare it with the figure just above.

[Figure: the managed heap after the 2nd collection, containing only A, C, D, and F]

The important thing to understand here is that two GCs are required to reclaim memory used by objects that require finalization. In reality, more than two collections may even be required, since these objects may get promoted to an older generation.
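The two-collection walkthrough above can be replayed with a toy model. To be clear, this is only a bookkeeping simulation with ordinary collections, not how the CLR is implemented; TwoPassModel, Collect and Run are hypothetical names, and generations and compaction are ignored. It uses the same objects A to J, with A, C, D, F reachable and C, E, F, I, J finalizable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TwoPassModel
{
    // One simulated collection over the toy heap.
    private static void Collect(SortedSet<char> heap, HashSet<char> roots,
                                SortedSet<char> finalizationQueue, Queue<char> freachable)
    {
        foreach (char obj in heap.ToList())
        {
            // The freachable queue acts as a root, just like the application roots.
            if (roots.Contains(obj) || freachable.Contains(obj)) continue;

            if (finalizationQueue.Remove(obj))
                freachable.Enqueue(obj); // needs finalizing first, so it stays on the heap
            else
                heap.Remove(obj);        // nothing left to run, memory is reclaimed
        }
    }

    public static string[] Run()
    {
        var heap = new SortedSet<char>("ABCDEFGHIJ");
        var roots = new HashSet<char>("ACDF");                // reachable by application code
        var finalizationQueue = new SortedSet<char>("CEFIJ"); // types with a Finalize method
        var freachable = new Queue<char>();

        Collect(heap, roots, finalizationQueue, freachable);
        string afterFirst = new string(heap.ToArray());

        // Simulated finalizer thread: drain the freachable queue.
        var finalized = new List<char>();
        while (freachable.Count > 0) finalized.Add(freachable.Dequeue());

        Collect(heap, roots, finalizationQueue, freachable);
        string afterSecond = new string(heap.ToArray());

        return new[] { afterFirst, new string(finalized.ToArray()), afterSecond };
    }

    public static void Main()
    {
        string[] r = Run();
        Console.WriteLine("Heap after 1st collection: " + r[0]); // ACDEFIJ
        Console.WriteLine("Finalized:                 " + r[1]); // EIJ
        Console.WriteLine("Heap after 2nd collection: " + r[2]); // ACDF
    }
}
```

In the model, E, I, and J survive the first pass only because they still have to be finalized, and they disappear in the second pass once nothing refers to them.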

NOTE:: The freachable queue is considered to be a root, just like global and static variables are roots. Therefore, if an object is on the freachable queue, then the object is reachable and is not garbage.

As a last note, remember that debugging an application is one thing and garbage collection is another, and it works differently. You can't get a feel for garbage collection just by debugging applications. If you wish to investigate memory further, get started here.
