C#: When are structs the answer?

Note: this page reproduces a popular StackOverflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/597259/

Date: 2020-08-04 09:42:50  Source: igfitidea

When are structs the answer?

Tags: c#, performance, struct, raytracing, struct-vs-class

Asked by JulianR

I'm doing a raytracer hobby project, and originally I was using structs for my Vector and Ray objects, and I thought a raytracer was the perfect situation to use them: you create millions of them, they don't live longer than a single method, they're lightweight. However, by simply changing 'struct' to 'class' on Vector and Ray, I got a very significant performance gain.

What gives? They're both small (3 floats for Vector, 2 Vectors for a Ray) and don't get copied around excessively. I do pass them to methods when needed of course, but that's inevitable. So what are the common pitfalls that kill performance when using structs? I've read this MSDN article that says the following:

When you run this example, you'll see that the struct loop is orders of magnitude faster. However, it is important to beware of using ValueTypes when you treat them like objects. This adds extra boxing and unboxing overhead to your program, and can end up costing you more than it would if you had stuck with objects! To see this in action, modify the code above to use an array of foos and bars. You'll find that the performance is more or less equal.

It's quite old, however (2001), and the whole "putting them in an array causes boxing/unboxing" struck me as odd. Is that true? I did pre-calculate the primary rays and put them in an array, so I followed up on this article and calculated each primary ray when I needed it, never adding them to an array, but it didn't change anything: with classes, it was still 1.5x faster.

I am running .NET 3.5 SP1 which I believe fixed an issue where struct methods weren't ever in-lined, so that can't be it either.

So basically: any tips, things to consider and what to avoid?

EDIT: As suggested in some answers, I've set up a test project where I've tried passing structs as ref. The methods for adding two Vectors:

public static VectorStruct Add(VectorStruct v1, VectorStruct v2)
{
  return new VectorStruct(v1.X + v2.X, v1.Y + v2.Y, v1.Z + v2.Z);
}

public static VectorStruct Add(ref VectorStruct v1, ref VectorStruct v2)
{
  return new VectorStruct(v1.X + v2.X, v1.Y + v2.Y, v1.Z + v2.Z);
}

public static void Add(ref VectorStruct v1, ref VectorStruct v2, out VectorStruct v3)
{
  v3 = new VectorStruct(v1.X + v2.X, v1.Y + v2.Y, v1.Z + v2.Z);
}

For each I got a variation of the following benchmark method:

VectorStruct StructTest()
{
  Stopwatch sw = new Stopwatch();
  sw.Start();
  var v2 = new VectorStruct(0, 0, 0);
  for (int i = 0; i < 100000000; i++)
  {
    var v0 = new VectorStruct(i, i, i);
    var v1 = new VectorStruct(i, i, i);
    v2 = VectorStruct.Add(ref v0, ref v1);
  }
  sw.Stop();
  Console.WriteLine(sw.Elapsed.ToString());
  return v2; // To make sure v2 doesn't get optimized away because it's unused. 
}

All seem to perform pretty much identically. Is it possible that they get optimized by the JIT to whatever is the optimal way to pass this struct?

EDIT2: I must note, by the way, that using structs in my test project is about 50% faster than using a class. Why this is different for my raytracer I don't know.

Accepted answer by TraumaPony

Basically, don't make them too big, and pass them around by ref when you can. I discovered this the exact same way... By changing my Vector and Ray classes to structs.

With more memory being passed around, it's bound to cause cache thrashing.

Answer by Erik Forbes

Anything written regarding boxing/unboxing prior to .NET generics can be taken with something of a grain of salt. Generic collection types have removed the need for boxing and unboxing of value types, which makes using structs in these situations more valuable.
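
To make the boxing point concrete, here is a minimal sketch. The `Point3` struct and `BoxingDemo` class are hypothetical names invented for illustration, not code from the question. It shows that converting a struct to `object` creates a separate heap box each time, and contrasts a non-generic `ArrayList` (which boxes on `Add` and unboxes on the cast) with a generic `List<T>` (which stores the values inline).

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical value type used only for this illustration.
public struct Point3
{
    public float X, Y, Z;
    public Point3(float x, float y, float z) { X = x; Y = y; Z = z; }
}

public static class BoxingDemo
{
    // Converting a struct to object allocates a new box each time,
    // so two boxes of the same value are distinct heap objects.
    public static bool EachBoxIsDistinct(Point3 p)
    {
        object first = p;
        object second = p;
        return !ReferenceEquals(first, second);
    }

    // Non-generic collection: Add boxes each value, the cast unboxes it.
    public static float SumXBoxed(int count)
    {
        var list = new ArrayList();
        for (int i = 0; i < count; i++) list.Add(new Point3(i, 0, 0));
        float sum = 0;
        foreach (object o in list) sum += ((Point3)o).X;
        return sum;
    }

    // Generic collection: values are stored inline, no boxing involved.
    public static float SumXGeneric(int count)
    {
        var list = new List<Point3>();
        for (int i = 0; i < count; i++) list.Add(new Point3(i, 0, 0));
        float sum = 0;
        foreach (Point3 p in list) sum += p.X;
        return sum;
    }
}
```

Both sum methods return the same result; the difference is only in how many heap allocations happen along the way, which a profiler or allocation counter would show.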

As for your specific slowdown - we'd probably need to see some code.

Answer by Andrew Hare

I think the key lies in these two statements from your post:

you create millions of them

and

I do pass them to methods when needed of course

Now, unless your struct is less than or equal to 4 bytes in size (or 8 bytes if you are on a 64-bit system), you are copying much more on each method call than if you simply passed an object reference.
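
As a rough sketch of what "copying on each method call" means, the example below uses hypothetical `Vec3`, `RayStruct`, `RayClass` and `CopyDemo` types (assumed shapes, not the asker's actual code). The copy is made visible through mutation: a change made to a by-value struct parameter never reaches the caller, while `ref` and class parameters operate on the caller's data.

```csharp
using System;

// Hypothetical 3-float vector, mirroring the sizes described in the question.
public struct Vec3
{
    public float X, Y, Z;
    public Vec3(float x, float y, float z) { X = x; Y = y; Z = z; }
}

// Two vectors of data: copied in full on every by-value call.
public struct RayStruct
{
    public Vec3 Origin, Direction;
}

// Only the reference (4 or 8 bytes) is copied on a call.
public class RayClass
{
    public Vec3 Origin, Direction;
}

public static class CopyDemo
{
    // Receives its own copy; the caller's ray is unaffected.
    public static void MoveByValue(RayStruct ray) { ray.Origin.X += 1; }

    // Receives the address of the caller's ray; no copy is made.
    public static void MoveByRef(ref RayStruct ray) { ray.Origin.X += 1; }

    // Receives a copy of the reference; both point at the same object.
    public static void MoveClass(RayClass ray) { ray.Origin.X += 1; }
}
```

Calling `MoveByValue` leaves the caller's `Origin.X` unchanged, whereas `MoveByRef` and `MoveClass` both change it. Whether the copy actually costs anything measurable depends on the JIT and on inlining, so this only illustrates the semantics.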

Answer by BozoJoe

You can also make structs into Nullable objects. A custom class cannot be wrapped this way, so the following does not compile:

Nullable<MyCustomClass> xxx = new Nullable<MyCustomClass>();

whereas a struct can be made nullable:

Nullable<MyCustomStruct> xxx = new Nullable<MyCustomStruct>();

But you will (obviously) be losing all your inheritance features.
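
A small sketch of the nullable behaviour follows. The bodies of `MyCustomStruct` and `MyCustomClass` and the `NullableDemo` helper are assumptions made up for this example; only the type names come from the answer.

```csharp
using System;

// Placeholder definitions; the single Id field is an assumption.
public struct MyCustomStruct { public int Id; }
public class MyCustomClass { public int Id; }

public static class NullableDemo
{
    // A default Nullable<T> holds no value.
    public static bool EmptyHasValue()
    {
        Nullable<MyCustomStruct> none = new Nullable<MyCustomStruct>();
        return none.HasValue;
    }

    // "T?" is shorthand for Nullable<T>; a struct converts to it implicitly.
    public static int WrappedId()
    {
        MyCustomStruct? some = new MyCustomStruct { Id = 7 };
        return some.Value.Id;
    }

    // Nullable<MyCustomClass> would be rejected by the compiler because
    // T must be a non-nullable value type. A class variable can simply
    // hold null directly instead.
    public static bool ClassCanBeNull()
    {
        MyCustomClass c = null;
        return c == null;
    }
}
```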

Answer by JaredPar

Have you profiled the application? Profiling is the only surefire way to see where the actual performance problem is. There are operations that are generally better/worse on structs, but unless you profile you'd just be guessing as to what the problem is.

Answer by JaredPar

The first thing I would look for is to make sure that you have explicitly implemented Equals and GetHashCode. Failing to do this means that the runtime implementation of each of these does some very expensive operations to compare two struct instances (internally it uses reflection to determine each of the private fields and then checks them for equality, which causes a significant amount of allocation).
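
One way this advice could look in code is sketched below. `Vector3f` is a hypothetical stand-in for the asker's Vector, and the hash-combining constants (17 and 31) are just a common convention, not something the answer prescribes.

```csharp
using System;

// Hypothetical vector struct with explicit equality members.
public struct Vector3f : IEquatable<Vector3f>
{
    public readonly float X, Y, Z;
    public Vector3f(float x, float y, float z) { X = x; Y = y; Z = z; }

    // Strongly typed comparison: the argument is not boxed and the
    // fields are compared directly.
    public bool Equals(Vector3f other)
    {
        return X.Equals(other.X) && Y.Equals(other.Y) && Z.Equals(other.Z);
    }

    // Overrides the default ValueType.Equals for callers that pass object.
    public override bool Equals(object obj)
    {
        return obj is Vector3f && Equals((Vector3f)obj);
    }

    // Combines the field hashes; overflow is expected and harmless here.
    public override int GetHashCode()
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + X.GetHashCode();
            hash = hash * 31 + Y.GetHashCode();
            hash = hash * 31 + Z.GetHashCode();
            return hash;
        }
    }

    public static bool operator ==(Vector3f a, Vector3f b) { return a.Equals(b); }
    public static bool operator !=(Vector3f a, Vector3f b) { return !a.Equals(b); }
}
```

Implementing `IEquatable<T>` also lets generic collections such as `Dictionary<TKey, TValue>` and `HashSet<T>` compare keys without boxing.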

Generally though, the best thing you can do is to run your code under a profiler and see where the slow parts are. It can be an eye-opening experience.

Answer by Guffa

In the recommendations for when to use a struct it says that it should not be larger than 16 bytes. Your Vector is 12 bytes, which is close to the limit. The Ray has two Vectors, putting it at 24 bytes, which is clearly over the recommended limit.

When a struct gets larger than 16 bytes it can no longer be copied efficiently with a single set of instructions; instead a loop is used. So, by exceeding this "magic" limit, you are actually doing a lot more work when you pass a struct than when you pass a reference to an object. This is why the code is faster with classes even though there is more overhead when allocating the objects.

The Vector could still be a struct, but the Ray is simply too large to work well as a struct.
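
The 12-byte and 24-byte figures can be checked with a short sketch like the one below. `Vec3f`, `Ray3f` and `SizeDemo` are hypothetical names, and the field layout is assumed from the question's description (3 floats per Vector, 2 Vectors per Ray).

```csharp
using System;
using System.Runtime.InteropServices;

// Assumed layouts: three floats per vector, two vectors per ray.
public struct Vec3f { public float X, Y, Z; }
public struct Ray3f { public Vec3f Origin, Direction; }

public static class SizeDemo
{
    // Marshal.SizeOf<T>() reports the marshaled size, which for simple
    // structs made only of floats matches their in-memory size.
    public static int VectorSize() { return Marshal.SizeOf<Vec3f>(); }

    public static int RaySize() { return Marshal.SizeOf<Ray3f>(); }

    // Size of an object reference: 4 bytes in a 32-bit process, 8 in 64-bit.
    public static int ReferenceSize() { return IntPtr.Size; }
}
```

Here `VectorSize()` gives 12 and `RaySize()` gives 24, compared with 4 or 8 bytes for a reference.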

Answer by Instance Hunter

I use structs basically for parameter objects, returning multiple pieces of information from a function, and... nothing else. Don't know whether it's "right" or "wrong," but that's what I do.

Answer by ILoveFortran

An array of structs would be a single contiguous structure in memory, while items in an array of objects (instances of reference types) need to be individually addressed by a pointer (i.e. a reference to an object on the garbage-collected heap). Therefore if you address large collections of items at once, structs will give you a performance gain since they need fewer indirections. In addition, structs cannot be inherited, which might allow the compiler to make additional optimizations (but that is just a possibility and depends on the compiler).

However, structs have quite different assignment semantics and also cannot be inherited. Therefore I would usually avoid structs except for the given performance reasons when needed.



struct

An array of values v encoded by a struct (value type) looks like this in memory:

vvvv

class

An array of values v encoded by a class (reference type) looks like this:

pppp

..v..v...v.v..

where p are the this pointers, or references, which point to the actual values v on the heap. The dots indicate other objects that may be interspersed on the heap. In the case of reference types you need to reference v via the corresponding p, in the case of value types you can get the value directly via its offset in the array.
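
The two layouts can be observed from C# with a sketch like the following. `CellStruct`, `CellClass` and `ArrayLayoutDemo` are hypothetical names for illustration. A freshly created struct array already contains usable elements (the `vvvv` case), while a class array starts out as null references that each need a separate allocation (the `pppp` case).

```csharp
using System;

// Hypothetical element types with identical data.
public struct CellStruct { public int Value; }
public class CellClass { public int Value; }

public static class ArrayLayoutDemo
{
    // vvvv: one allocation; every element already exists inside the array.
    public static int SumStructs(int n)
    {
        var cells = new CellStruct[n];
        for (int i = 0; i < n; i++) cells[i].Value = i; // writes into the array itself
        int sum = 0;
        for (int i = 0; i < n; i++) sum += cells[i].Value;
        return sum;
    }

    // pppp: the array holds references; each object is allocated separately
    // and reached through one extra indirection.
    public static int SumClasses(int n)
    {
        var cells = new CellClass[n];
        for (int i = 0; i < n; i++) cells[i] = new CellClass { Value = i };
        int sum = 0;
        for (int i = 0; i < n; i++) sum += cells[i].Value;
        return sum;
    }

    // Before any objects are allocated, class array slots are null.
    public static bool ClassSlotsStartNull()
    {
        var cells = new CellClass[1];
        return cells[0] == null;
    }
}
```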

Answer by Grant Peters

If the structs are small, and not too many exist at once, it SHOULD be placing them on the stack (as long as it's a local variable and not a member of a class) and not on the heap. This means the GC doesn't need to be invoked and memory allocation/deallocation should be almost instantaneous.

When passing a struct as a parameter to a function, the struct is copied, which not only means more allocations/deallocations (from the stack, which is almost instantaneous, but still has overhead), but also the overhead of just transferring data between the 2 copies. If you pass via reference, this is a non-issue, as you are just telling it where to read the data from, rather than copying it.

I'm not 100% sure on this, but I suspect that returning arrays via an 'out' parameter may also give you a speed boost, as memory on the stack is reserved for it and doesn't need to be copied as the stack is "unwound" at the end of function calls.
