Is the advantage of Typed Arrays in JavaScript that they work the same as, or similarly to, arrays in C?

Notice: this page is adapted from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/13328658/

Date: 2020-10-26 18:27:18  Source: igfitidea

Is the advantage of Typed Arrays in JavaScript that they work the same as, or similarly to, arrays in C?

javascript c arrays memory typed-arrays

Asked by alex

I've been playing around with Typed Arrays in JavaScript.

var buffer = new ArrayBuffer(16);
var int32View = new Int32Array(buffer);

I imagine normal arrays ([1, 257, true]) in JavaScript have poor performance because their values could be of any type, so reaching an offset in memory is not trivial.

I originally thought that JavaScript array subscripts worked the same as objects (as they have many similarities) and were hash-map based, requiring a hash-based lookup. But I haven't found much credible information to confirm this.

So, I'd assume the reason why Typed Arrays perform so well is because they work like normal arrays in C, where they're always typed. Given the initial code example above, and wishing to get the 10th value in the typed array...

var value = int32View[10];
  • The type is Int32, so each value must consist of 32 bits, or 4 bytes.
  • The subscript is 10.
  • So the location in memory of that value is <array offset> + (4 * 10), and 4 bytes are then read to get the whole value.
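
The arithmetic in the list above can be checked from JavaScript itself. The following is only a sketch: `readInt32At` is a hypothetical helper, not part of any API, and it reads the bytes through a `DataView` at the computed offset. Note that a 16-byte buffer only has room for four Int32 values, so index 10 would be out of range there; the sketch assumes a 64-byte buffer instead.

```javascript
// Assumption: a 64-byte buffer (16 Int32 elements) so that index 10 is valid.
const buffer = new ArrayBuffer(64);
const int32View = new Int32Array(buffer);
int32View[10] = 257;

// Typed arrays use the platform's byte order; detect it for the DataView read.
const littleEndian = new Uint8Array(new Uint16Array([1]).buffer)[0] === 1;

// Hypothetical helper: computes <array offset> + (4 * index) and reads 4 bytes.
function readInt32At(view, index) {
  const byteOffset = view.byteOffset + Int32Array.BYTES_PER_ELEMENT * index;
  return new DataView(view.buffer).getInt32(byteOffset, littleEndian);
}

console.log(readInt32At(int32View, 10) === int32View[10]); // true

// With the original 16-byte buffer, index 10 is past the end of the view.
console.log(new Int32Array(new ArrayBuffer(16))[10]); // undefined
```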

I basically just want to confirm my assumptions. Are my thoughts on this correct? If not, please elaborate.

I checked out the V8 source to see if I could answer it myself, but my C is rusty and I'm not too familiar with C++.

Answer by AshleysBrain

Typed Arrays were designed by the WebGL standards committee, for performance reasons. Typically JavaScript arrays are generic and can hold objects, other arrays and so on, and the elements are not necessarily sequential in memory, as they would be in C. WebGL requires buffers to be sequential in memory, because that's how the underlying C API expects them. If Typed Arrays are not used, passing an ordinary array to a WebGL function requires a lot of work: each element must be inspected, its type checked, and if it's the right thing (e.g. a float) it must be copied out to a separate sequential C-like buffer, which is then passed to the C API. Ouch, lots of work! For performance-sensitive WebGL applications this could cause a big drop in the framerate.

On the other hand, like you suggest in the question, Typed Arrays use a sequential C-like buffer already in their behind-the-scenes storage. When you write to a typed array, you are indeed assigning to a C-like array behind the scenes. For the purposes of WebGL, this means the buffer can be used directly by the corresponding C API.

Note your memory address calculation isn't quite enough: the browser must also bounds-check the array, to prevent out-of-range accesses. This has to happen with any kind of JavaScript array, but in many cases clever JavaScript engines can omit the check when they can prove the index value is already within bounds (such as looping from 0 to the length of the array). The engine also has to check that the array index is really a number and not a string or something else! But it is in essence like you describe, using C-like addressing.
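
The effect of these checks is visible at the language level. The sketch below only shows observable behaviour; how an engine implements or elides the checks internally is not shown, and the variable names are illustrative.

```javascript
const view = new Int32Array(4);
view[1] = 7;

const inRange = view[1];        // 7
const outOfRange = view[10];    // undefined: the read is bounds-checked, not a raw memory access
const viaString = view["1"];    // 7: a canonical numeric string is treated as index 1
const notAnIndex = view["foo"]; // undefined: an ordinary property lookup, not an element access

console.log(inRange, outOfRange, viaString, notAnIndex);
```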

BUT...that's not all! In some cases clever JavaScript engines can also deduce the type of ordinary JavaScript arrays. In an engine like V8, if you make an ordinary JavaScript array and only store floats in it, V8 may optimistically decide it's an array of floats and optimise the code it generates for that. The performance can then be equivalent to typed arrays. So typed arrays aren't actually necessary to reach maximum performance: just use arrays predictably (with every element the same type) and some engines can optimise for that as well.

So why do typed arrays still need to exist?

  • Optimisations like deducing the type of arrays are really complicated. If V8 deduces that an ordinary array has only floats in it, and you then store an object in an element, it has to de-optimise and regenerate code that makes the array generic again. It's quite an achievement that all this works transparently. Typed Arrays are much simpler: they're guaranteed to be one type, and you just can't store other things like objects in them.
  • Optimisations are never guaranteed to happen; you may store only floats in an ordinary array, but the engine may decide for various reasons not to optimise it.
  • The fact they're much simpler means other less-sophisticated JavaScript engines can easily implement them. They don't need all the advanced deoptimisation support.
  • Even with really advanced engines, proving optimisations can be used is extremely difficult and can sometimes be impossible. A typed array significantly simplifies the level of proof the engine needs to be able to optimise around it. A value returned from a typed array is certainly of a certain type, and engines can optimise for the result being that type. A value returned from an ordinary array could in theory have any type, and the engine may not be able to prove it will always have the same type, and therefore generates less efficient code. Therefore code around a typed array is more easily optimised.
  • Typed arrays remove the opportunity to make a mistake. You just can't accidentally store an object and suddenly get far worse performance.
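
The "guaranteed to be one type" point above can be seen directly: a typed array converts whatever is stored into its element type, while an ordinary array keeps the value as-is. This is a small sketch of that language-level behaviour only; it says nothing about what any particular engine optimises.

```javascript
// An ordinary array accepts any value, so its contents can become mixed.
const plain = [1.5, 2.5];
plain[0] = { x: 1 };

// A Float64Array converts every stored value to a 64-bit float.
const floats = new Float64Array([1.5, 2.5]);
floats[0] = { x: 1 }; // the object is converted to a number, which gives NaN

// An Int32Array converts to a 32-bit integer, dropping the fraction.
const ints = new Int32Array(1);
ints[0] = 1.9;

console.log(typeof plain[0]);         // "object"
console.log(Number.isNaN(floats[0])); // true
console.log(ints[0]);                 // 1
```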

So, in short, ordinary arrays can in theory be just as fast as typed arrays. But typed arrays make it much easier to reach peak performance.

Answer by reece

Yes, you are mostly correct. With a standard JavaScript array, the JavaScript engine has to assume that the data in the array is all objects. It can still store this as a C-like array/vector, where the access to the memory is still like you described. The problem is that the data is not the value, but something referencing that value (the object).

So, performing a[i] = b[i] + 2 requires the engine to:

  1. access the object in b at index i;
  2. check what type the object is;
  3. extract the value out of the object;
  4. add 2 to the value;
  5. create a new object with the newly computed value from step 4;
  6. assign the new object from step 5 into a at index i.

With a typed array, the engine can:

  1. access the value in b at index i (including placing it in a CPU register);
  2. increment the value by 2;
  3. assign the new value from step 2 into a at index i.
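
For reference, the source-level loop both step lists describe looks the same for either kind of array; only the work the engine does underneath differs. A minimal sketch, where `addTwo` is an illustrative name:

```javascript
// The same loop body runs over ordinary arrays and typed arrays.
function addTwo(dst, src) {
  for (let i = 0; i < src.length; i++) {
    dst[i] = src[i] + 2;
  }
  return dst;
}

// Ordinary arrays: elements could in principle be of any type.
const plainResult = addTwo([], [1, 2, 3]);

// Typed arrays: every element is known to be a 64-bit float.
const typedResult = addTwo(new Float64Array(3), new Float64Array([1, 2, 3]));

console.log(plainResult);             // [ 3, 4, 5 ]
console.log(Array.from(typedResult)); // [ 3, 4, 5 ]
```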

NOTE: These are not the exact steps a JavaScript engine will perform, as that depends on the code being compiled (including surrounding code) and the engine in question.

This allows the resulting computations to be much more efficient. Also, typed arrays have a memory layout guarantee (arrays of n-byte values) and can thus be used to interface directly with binary data (audio, video, etc.).
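
As a sketch of what that layout guarantee allows, the same underlying bytes can be viewed as 16-bit samples or as raw bytes without copying. The sample values below are made up for illustration.

```javascript
// Four 16-bit samples occupy exactly 4 * 2 = 8 bytes.
const samples = new Int16Array([0, 1000, -1000, 32767]);

// A byte-level view over the same memory; no data is copied.
const bytes = new Uint8Array(samples.buffer, samples.byteOffset, samples.byteLength);

// A second 16-bit view over those bytes (the length argument is in elements).
const sameSamples = new Int16Array(bytes.buffer, bytes.byteOffset, bytes.byteLength / 2);

// A write through one view is visible through the others.
sameSamples[0] = 5;

console.log(bytes.length);   // 8
console.log(samples[0]);     // 5
console.log(sameSamples[3]); // 32767
```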

Answer by David Leppik

When it comes to performance, things can change fast. As AshleysBrain says, it comes down to whether the VM can deduce that a normal array can be implemented as a typed array quickly and accurately. That depends on the particular optimizations of the particular JavaScript VM, and it can change in any new browser version.

This Chrome developer comment provides some guidance that worked as of June 2012:

  1. Normal arrays can be as fast as typed arrays if you do a lot of sequential access. Random access outside the bounds of the array causes the array to grow.
  2. Typed arrays are fast to access, but slow to allocate. If you create temporary arrays frequently, avoid typed arrays. (Fixing this is possible, but it's low priority.)
  3. Micro-benchmarks such as JSPerf are not reliable for real-world performance.
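
The growth behaviour mentioned in the first point is easy to observe. This sketch shows only the visible effect on length and holes; how an engine stores the resulting array internally is not shown.

```javascript
// Writing past the end of an ordinary array grows it and leaves holes.
const plain = [1, 2, 3];
plain[10] = 4;

// A typed array has a fixed length; the out-of-range write is ignored.
const typed = new Int32Array(3);
typed[10] = 4;

console.log(plain.length); // 11
console.log(5 in plain);   // false (index 5 is a hole)
console.log(typed.length); // 3
```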

If I might elaborate on the last point, I've seen this phenomenon with Java for years. When you test the speed of a small piece of code by running it over and over again in isolation, the VM optimizes the heck out of it. It makes optimizations which only make sense for that specific test. Your benchmark can get a hundredfold speed improvement compared to running the same code inside another program, or compared to running it immediately after running several different tests that optimize the same code differently.

Answer by Farid Nouri Neshat

I'm not really a contributor to any JavaScript engine; I've only done some reading on V8, so my answer might not be completely accurate:

Values in arrays (only normal arrays with no holes or gaps; sparse arrays are treated as objects) are all either pointers or numbers with a fixed length (in V8 they are 32 bits: a 31-bit integer is tagged with a 0 bit at the end, otherwise the value is a pointer).
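
To make the tagging idea concrete, here is a toy simulation of the scheme as described above: a 32-bit word whose lowest bit is 0 holds a 31-bit integer in its upper bits. This is an illustration only, not V8's actual code, and the helper names are hypothetical.

```javascript
// A 31-bit signed integer fits in the range [-2^30, 2^30 - 1].
function fitsInTaggedInt(value) {
  return Number.isInteger(value) && value >= -(2 ** 30) && value <= 2 ** 30 - 1;
}

// Shift left by one so the lowest bit (the tag) is 0.
function tagInt(value) {
  return value << 1;
}

// A word whose lowest bit is 0 is an integer; otherwise it would be a pointer.
function isTaggedInt(word) {
  return (word & 1) === 0;
}

// An arithmetic shift right recovers the original signed value.
function untagInt(word) {
  return word >> 1;
}

const word = tagInt(-3);
console.log(isTaggedInt(word));       // true
console.log(untagInt(word));          // -3
console.log(fitsInTaggedInt(2 ** 30)); // false (one past the 31-bit range)
```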

So I don't think finding the memory location is any different than for a typed array, since the number of bytes per element is the same across the whole array. The difference is that if the element is an object, you have to add one unboxing layer, which doesn't happen for typed arrays.

And of course, accessing typed arrays doesn't involve the type checks that a normal array needs (though those might be removed in highly optimized code, which is only generated for hot code).

For writing, storing a value of the same type shouldn't be much slower. If it's a different type, the JS engine might generate polymorphic code for it, which is slower.

You can also try running some benchmarks on jsperf.com to confirm.
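
A rough local timing harness might look like the sketch below. It assumes a runtime with a global `performance.now()` (modern browsers and recent Node.js versions); `timeIt`, `sumPlain` and `sumTyped` are illustrative names. As noted in David Leppik's answer, numbers from an isolated micro-benchmark like this are not reliable indicators of real-world performance, so treat any result as a hint at most.

```javascript
// Runs fn several times and reports the median wall-clock time in milliseconds.
function timeIt(label, fn, runs = 5) {
  const timings = [];
  for (let r = 0; r < runs; r++) {
    const start = performance.now();
    fn();
    timings.push(performance.now() - start);
  }
  timings.sort((a, b) => a - b);
  return { label, medianMs: timings[Math.floor(runs / 2)] };
}

// Fill an ordinary array with floats and sum it.
function sumPlain(n) {
  const arr = [];
  for (let i = 0; i < n; i++) arr.push(i * 0.5);
  let total = 0;
  for (let i = 0; i < n; i++) total += arr[i];
  return total;
}

// Same work using a Float64Array.
function sumTyped(n) {
  const arr = new Float64Array(n);
  for (let i = 0; i < n; i++) arr[i] = i * 0.5;
  let total = 0;
  for (let i = 0; i < n; i++) total += arr[i];
  return total;
}

const N = 100000; // arbitrary size chosen for the sketch
console.log(timeIt("plain array", () => sumPlain(N)));
console.log(timeIt("typed array", () => sumTyped(N)));
```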