Note: this page reproduces a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/18364175/

Date: 2020-10-27 11:38:20  Source: igfitidea

Best practices for reducing Garbage Collector activity in Javascript

Tags: javascript, garbage-collection

Asked by UpTheCreek

I have a fairly complex Javascript app, which has a main loop that is called 60 times per second. There seems to be a lot of garbage collection going on (based on the 'sawtooth' output from the Memory timeline in the Chrome dev tools) - and this often impacts the performance of the application.

So, I'm trying to research best practices for reducing the amount of work that the garbage collector has to do. (Most of the information I've been able to find on the web regards avoiding memory leaks, which is a slightly different question - my memory is getting freed up, it's just that there's too much garbage collection going on.) I'm assuming that this mostly comes down to reusing objects as much as possible, but of course the devil is in the details.

The app is structured in 'classes' along the lines of John Resig's Simple JavaScript Inheritance.

I think one issue is that some functions can be called thousands of times per second (as they are used hundreds of times during each iteration of the main loop), and perhaps the local working variables in these functions (strings, arrays, etc.) might be the issue.

I'm aware of object pooling for larger/heavier objects (and we use this to a degree), but I'm looking for techniques that can be applied across the board, especially relating to functions that are called very many times in tight loops.

What techniques can I use to reduce the amount of work that the garbage collector must do?

And, perhaps also - what techniques can be employed to identify which objects are being garbage collected the most? (It's a fairly large codebase, so comparing snapshots of the heap has not been very fruitful.)

Answered by Mike Samuel

A lot of the things you need to do to minimize GC churn go against what is considered idiomatic JS in most other scenarios, so please keep in mind the context when judging the advice I give.

Allocation happens in modern interpreters in several places:

  1. When you create an object via new or via literal syntax [...] or {}.
  2. When you concatenate strings.
  3. When you enter a scope that contains function declarations.
  4. When you perform an action that triggers an exception.
  5. When you evaluate a function expression: (function (...) { ... }).
  6. When you perform an operation that coerces to Object, like Object(myNumber) or Number.prototype.toString.call(42).
  7. When you call a builtin that does any of these under the hood, like Array.prototype.slice.
  8. When you use arguments to reflect over the parameter list.
  9. When you split a string or match with a regular expression.

Avoid doing those, and pool and reuse objects where possible.
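
As a minimal sketch of pooling, assume the hot loop needs many small records with x and y fields. The names createPool, acquire, release, and available are hypothetical and not part of any library; this is one possible shape for a free-list pool, not a definitive implementation.

```javascript
// Hypothetical free-list pool: objects are created up front and recycled
// instead of being abandoned to the garbage collector.
function createPool(size) {
  const free = [];
  for (let i = 0; i < size; i++) {
    free.push({ x: 0, y: 0 });
  }
  return {
    // Take an object from the free list; allocate only if the pool is empty.
    acquire(x, y) {
      const v = free.length > 0 ? free.pop() : { x: 0, y: 0 };
      v.x = x;
      v.y = y;
      return v;
    },
    // Hand the object back so a later acquire() can reuse it.
    release(v) {
      free.push(v);
    },
    // Number of objects currently waiting in the free list.
    available() {
      return free.length;
    },
  };
}

const pool = createPool(2);
const a = pool.acquire(1, 2);
pool.release(a);
const b = pool.acquire(3, 4); // reuses the object that was just released
```

The trade-off is that callers must not touch an object after releasing it, since the next acquire() may overwrite its fields.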

Specifically, look out for opportunities to:

  1. Pull inner functions that have no or few dependencies on closed-over state out into a higher, longer-lived scope. (Some code minifiers like Closure Compiler can inline inner functions and might improve your GC performance.)
  2. Avoid using strings to represent structured data or for dynamic addressing. Especially avoid repeatedly parsing using split or regular expression matches, since each requires multiple object allocations. This frequently happens with keys into lookup tables and dynamic DOM node IDs. For example, lookupTable['foo-' + x] and document.getElementById('foo-' + x) both involve an allocation since there is a string concatenation. Often you can attach keys to long-lived objects instead of re-concatenating. Depending on the browsers you need to support, you might be able to use Map, which accepts objects as keys directly.
  3. Avoid catching exceptions on normal code-paths. Instead of try { op(x) } catch (e) { ... }, do if (!opCouldFailOn(x)) { op(x); } else { ... }.
  4. When you can't avoid creating strings, e.g. to pass a message to a server, use a builtin like JSON.stringify, which uses an internal native buffer to accumulate content instead of allocating multiple objects.
  5. Avoid using callbacks for high-frequency events, and where you can, pass as a callback a long-lived function (see 1) that recreates state from the message content.
  6. Avoid using arguments, since functions that use it have to create an array-like object when called.
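
Items 1 and 2 can be sketched roughly as follows. The names addSquare, sumSquares, labels, setLabel, getLabel, and player are all hypothetical, chosen only for illustration.

```javascript
// Item 1: the callback is defined once in a long-lived scope instead of
// being re-created as a closure on every call to sumSquares().
function addSquare(total, n) {
  return total + n * n;
}

function sumSquares(numbers) {
  return numbers.reduce(addSquare, 0);
}

// Item 2: key a lookup table by the object itself rather than by a
// freshly concatenated string such as 'foo-' + id.
const labels = new Map();

function setLabel(entity, label) {
  labels.set(entity, label);
}

function getLabel(entity) {
  return labels.get(entity); // no string is built for the lookup
}

const player = { id: 7 }; // hypothetical long-lived object
setLabel(player, 'Player');
```

A Map keeps its keys alive for as long as the entry exists; if the table should not keep entities alive, a WeakMap offers the same get and set calls.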

I suggested using JSON.stringify to create outgoing network messages. Parsing input messages using JSON.parse obviously involves allocation, and lots of it for large messages. If you can represent your incoming messages as arrays of primitives, then you can save a lot of allocations. The only other builtin around which you can build a parser that does not allocate is String.prototype.charCodeAt. A parser for a complex format that only uses that is going to be hellish to read though.
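
To give a feel for a charCodeAt-based parser, here is a minimal sketch for a deliberately simple format: comma-separated non-negative integers. The function name parseInts, the message format, and the buffer size are assumptions made for this example.

```javascript
// Parses text such as "12,7,305" into a caller-supplied buffer using only
// charCodeAt, so no substrings or intermediate arrays are created.
// Returns the number of values written. Assumes well-formed input made of
// ASCII digits and commas (an assumption of this sketch).
function parseInts(text, out) {
  let count = 0;
  let value = 0;
  let inNumber = false;
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    if (code >= 48 && code <= 57) {        // '0'..'9'
      value = value * 10 + (code - 48);
      inNumber = true;
    } else if (code === 44) {              // ','
      if (inNumber && count < out.length) out[count++] = value;
      value = 0;
      inNumber = false;
    }
  }
  // Flush the last number, which has no trailing comma.
  if (inNumber && count < out.length) out[count++] = value;
  return count;
}

const buffer = new Int32Array(8); // allocated once, reused for every message
const n = parseInts('12,7,305', buffer);
```

Even for this trivial format the state handling is noticeably more awkward than text.split(',').map(Number), which illustrates the readability cost mentioned above.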

Answered by Gene

The Chrome developer tools have a very nice feature for tracing memory allocation. It's called the Memory Timeline. This article describes some details. I suppose this is what you're talking about regarding the "sawtooth"? This is normal behavior for most GC'ed runtimes. Allocation proceeds until a usage threshold is reached, triggering a collection. Normally there are different kinds of collections at different thresholds.

[Image: Memory Timeline in Chrome]

Garbage collections are included in the event list associated with the trace along with their duration. On my rather old notebook, ephemeral collections are occurring at about 4 MB and take 30 ms. That is about 2 of your 60 Hz loop iterations. If this is an animation, 30 ms collections are probably causing stutter. You should start here to see what's going on in your environment: where the collection threshold is and how long your collections are taking. This gives you a reference point to assess optimizations. Realistically, though, the best you can do is make the stutter less frequent: slowing the allocation rate lengthens the interval between collections.
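
The arithmetic behind the "2 iterations" figure, written out as a small sketch. The 30 ms pause is the value quoted above for one particular machine; the variable names are illustrative only.

```javascript
const frameBudgetMs = 1000 / 60; // about 16.7 ms per iteration at 60 Hz
const pauseMs = 30;              // collection time quoted above
// A pause that overlaps any part of a frame costs that whole frame.
const framesMissed = Math.ceil(pauseMs / frameBudgetMs);
```

Substituting the pause times measured in your own trace gives a quick estimate of how many frames each collection costs.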

The next step is to use the Profiles | Record Heap Allocations feature to generate a catalog of allocations by record type. This will quickly show which object types are consuming the most memory during the trace period, which is equivalent to allocation rate. Focus on these in descending order of rate.

The techniques are not rocket science. Avoid boxed objects when an unboxed value will do. Use global variables to hold and reuse single boxed objects rather than allocating fresh ones in each iteration. Pool common object types in free lists rather than abandoning them. Cache string concatenation results that are likely to be reusable in future iterations. Avoid allocating just to return function results; set variables in an enclosing scope instead. You will have to consider each object type in its own context to find the best strategy. If you need help with specifics, post an edit describing details of the challenge you're looking at.
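
Two of these techniques, returning results through a variable in an enclosing scope and caching concatenated strings, might look like the following sketch. The names boundsResult, computeBounds, spriteKeys, and spriteKey are hypothetical.

```javascript
// Result object owned by the enclosing scope: written on every call,
// allocated only once. Callers must copy the fields out before the next call.
const boundsResult = { min: 0, max: 0 };

function computeBounds(values) {
  let min = Infinity;
  let max = -Infinity;
  for (let i = 0; i < values.length; i++) {
    const v = values[i];
    if (v < min) min = v;
    if (v > max) max = v;
  }
  boundsResult.min = min;
  boundsResult.max = max;
  return boundsResult; // same object every time
}

// Cache of concatenation results, so 'sprite-' + id is built once per id.
const spriteKeys = new Map();

function spriteKey(id) {
  let key = spriteKeys.get(id);
  if (key === undefined) {
    key = 'sprite-' + id;
    spriteKeys.set(id, key);
  }
  return key;
}
```

Sharing one result object is only safe when no caller holds on to it across calls, which is part of why this style is less maintainable than returning fresh objects.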

I advise against perverting your normal coding style throughout an application in a shotgun attempt to produce less garbage. This is for the same reason you should not optimize for speed prematurely. Most of your effort plus much of the added complexity and obscurity of code will be meaningless.

Answered by Chris B

As a general principle you'd want to cache as much as possible and do as little creating and destroying as possible on each run of your loop.

The first thing that pops into my head is to reduce the use of anonymous functions (if you have any) inside your main loop. It would also be easy to fall into the trap of creating and destroying objects that are passed into other functions. I'm by no means a JavaScript expert, but I would imagine that this:

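// Allocated once, outside the loop, and reused on every iteration.
var options = {var1: value1, var2: value2, ChangingVariable: value3};
// Named function created once instead of a new closure per iteration.
function loopfunc()
{
    //do something
}

while(true)
{
    $.each(listofthings, loopfunc);

    // Mutate the existing object rather than building a new one.
    options.ChangingVariable = newvalue;
    someOtherFunction(options);
}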

would run much faster than this:

while(true)
{
    // A new function object is created on every pass through the loop.
    $.each(listofthings, function(){
        //do something on the list
    });

    // So is a new options object.
    someOtherFunction({
        var1: value1,
        var2: value2,
        ChangingVariable: newvalue
    });
}

Is there ever any downtime for your program? Maybe you need it to run smoothly for a second or two (e.g. for an animation) and then it has more time to process? If this is the case, I could see taking objects that would normally be garbage collected throughout the animation and keeping a reference to them in some global object. Then when the animation ends you can clear all the references and let the garbage collector do its work.
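
A rough sketch of that idea, with hypothetical names retained, retire, and releaseAll. Whether it actually helps depends on the engine: the parked objects still occupy memory, and the collector may still run when new allocations fill the heap.

```javascript
// Hypothetical holding area: objects parked here stay reachable, so they
// cannot be collected until releaseAll() is called.
const retained = [];

function retire(obj) {
  retained.push(obj);
}

// Call when the animation ends: dropping the references makes every
// parked object eligible for collection at once.
function releaseAll() {
  retained.length = 0;
}

retire({ frame: 1 });
retire({ frame: 2 });
const heldDuringAnimation = retained.length;
releaseAll();
```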

Sorry if this is all a bit trivial compared to what you've already tried and thought of.

Answered by Mahdi

I'd make one or a few objects in the global scope (where I'm sure the garbage collector is not allowed to touch them), then I'd try to refactor my solution to use those objects to get the job done, instead of using local variables.

Of course it couldn't be done everywhere in the code, but generally that's my way to avoid the garbage collector.

P.S. It might make that specific part of code a little bit less maintainable.
