Original question: http://stackoverflow.com/questions/599308/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Proper stack and heap usage in C++?
Asked by Alexander
I've been programming for a while, but it's been mostly Java and C#. I've never actually had to manage memory on my own. I recently began programming in C++ and I'm a little confused as to when I should store things on the stack and when to store them on the heap.
My understanding is that variables which are accessed very frequently should be stored on the stack and objects, rarely used variables, and large data structures should all be stored on the heap. Is this correct or am I incorrect?
Answered by Crashworks
No, the difference between stack and heap isn't performance. It's lifespan: any local variable inside a function (anything you do not malloc() or new) lives on the stack. It goes away when you return from the function. If you want something to live longer than the function that declared it, you must allocate it on the heap.
class Thingy {};  // a complete type, so it can be instantiated below

Thingy* foo()
{
    int a;                              // this int lives on the stack
    Thingy B;                           // this Thingy lives on the stack and is destroyed when we return from foo
    Thingy *pointerToB = &B;            // this points to an address on the stack
    Thingy *pointerToC = new Thingy();  // this makes a Thingy on the heap.
                                        // pointerToC contains its address.

    // this is safe: C lives on the heap and outlives foo().
    // Whoever you pass this to must remember to delete it!
    return pointerToC;

    // this is NOT SAFE: B lives on the stack and is destroyed when foo() returns.
    // whoever uses this returned pointer will probably cause a crash!
    // return pointerToB;
}
For a clearer understanding of what the stack is, come at it from the other end -- rather than try to understand what the stack does in terms of a high level language, look up "call stack" and "calling convention" and see what the machine really does when you call a function. Computer memory is just a series of addresses; "heap" and "stack" are inventions of the compiler.
Answered by MarkR
I would say:
Store it on the stack, if you CAN.
Store it on the heap, if you NEED TO.
Therefore, prefer the stack to the heap. Some possible reasons that you can't store something on the stack are:
- It's too big: in multithreaded programs on a 32-bit OS, the stack has a small and fixed (at thread-creation time at least) size, typically just a few megabytes. This is so that you can create lots of threads without exhausting address space. For 64-bit programs, or single-threaded (Linux anyway) programs, this is not a major issue. Under 32-bit Linux, single-threaded programs usually use dynamic stacks which can keep growing until they reach the top of the heap.
- You need to access it outside the scope of the original stack frame - this is really the main reason.
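As a rough illustration of the "too big" case, here is a minimal sketch that keeps a large buffer on the heap through a std::vector. The names sum_big_buffer and kBigBufferBytes and the 16 MiB figure are hypothetical, chosen only to exceed a typical few-megabyte stack.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical size: larger than a typical few-megabyte thread stack.
constexpr std::size_t kBigBufferBytes = 16u * 1024u * 1024u;

// A local array such as `unsigned char buf[kBigBufferBytes];` would sit in
// the stack frame and could overflow the stack. The vector object itself is
// small and lives on the stack; the bytes it manages come from the heap.
std::size_t sum_big_buffer() {
    std::vector<unsigned char> buf(kBigBufferBytes, 1);
    std::size_t total = 0;
    for (unsigned char byte : buf) {
        total += byte;
    }
    return total;  // buf's heap storage is released automatically here
}
```

The caller never sees a pointer or a delete, which is consistent with the "prefer the stack" advice: the owner is a stack object even though the payload is not.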
It is possible, with sensible compilers, to allocate non-fixed size objects on the heap (usually arrays whose size is not known at compile time).
Answered by Daniel Earwicker
It's more subtle than the other answers suggest. There is no absolute divide between data on the stack and data on the heap based on how you declare it. For example:
std::vector<int> v(10);
In the body of a function, that declares a vector (dynamic array) of ten integers on the stack. But the storage managed by the vector is not on the stack.
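One way to see this for yourself is to compare addresses. The sketch below is only an illustration: both helper names are hypothetical, and it assumes a platform that provides std::uintptr_t (practically all of them do).

```cpp
#include <cstdint>
#include <vector>

// Returns true if the element storage of `v` lies inside the bytes of the
// vector object itself. (Hypothetical helper, for illustration only.)
bool elements_live_inside_object(const std::vector<int>& v) {
    const auto object_begin = reinterpret_cast<std::uintptr_t>(&v);
    const auto object_end = object_begin + sizeof(v);
    const auto data = reinterpret_cast<std::uintptr_t>(v.data());
    return data >= object_begin && data < object_end;
}

bool local_vector_keeps_elements_elsewhere() {
    std::vector<int> v(10);  // the object v is a local (stack) variable
    return !elements_live_inside_object(v);  // its ten ints are stored elsewhere
}
```

sizeof(v) stays the same however many elements the vector holds, which is another hint that the elements are not part of the stack object.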
Ah, but (the other answers suggest) the lifetime of that storage is bounded by the lifetime of the vector itself, which here is stack-based, so it makes no difference how it's implemented - we can only treat it as a stack-based object with value semantics.
Not so. Suppose the function was:
void GetSomeNumbers(std::vector<int> &result)
{
    std::vector<int> v(10);
    // fill v with numbers
    result.swap(v);
}
So anything with a swap function (and any complex value type should have one) can serve as a kind of rebindable reference to some heap data, under a system which guarantees a single owner of that data.
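To make the "rebindable reference" idea concrete, the following sketch checks that swap hands over the heap buffer itself rather than copying the elements. The function name swap_transfers_buffer is hypothetical; the behaviour relies on the standard guarantee that swapping two vectors does not invalidate pointers to their elements.

```cpp
#include <vector>

// Returns true if swapping moved ownership of the existing heap buffer
// from `source` to `target` without copying any elements.
bool swap_transfers_buffer() {
    std::vector<int> source(10, 7);
    const int* buffer = source.data();  // address of the heap storage

    std::vector<int> target;            // starts out empty
    target.swap(source);                // exchange the two owners

    // target now owns the very same buffer; source got target's old
    // (empty) state in exchange.
    return target.data() == buffer && source.empty();
}
```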
Therefore the modern C++ approach is to never store the address of heap data in naked local pointer variables. All heap allocations must be hidden inside classes.
If you do that, you can think of all variables in your program as if they were simple value types, and forget about the heap altogether (except when writing a new value-like wrapper class for some heap data, which ought to be unusual).
You merely have to retain one special bit of knowledge to help you optimise: where possible, instead of assigning one variable to another like this:
a = b;
swap them like this:
a.swap(b);
because it's much faster and it doesn't throw exceptions. The only requirement is that you don't need b to continue to hold the same value (it's going to get a's value instead, which would be trashed in a = b).
The downside is that this approach forces you to return values from functions via output parameters instead of the actual return value. But they're fixing that in C++0x with rvalue references.
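For readers on C++11 or later, where rvalue references are part of the standard, a sketch of the same function returning its result directly might look like this. It reuses the name GetSomeNumbers from the example above, and the fill values are an arbitrary assumption.

```cpp
#include <vector>

// Returned by value: the local vector is moved into the caller's object
// (or the copy is elided entirely), so the heap buffer is handed over
// rather than duplicated.
std::vector<int> GetSomeNumbers() {
    std::vector<int> v(10);
    int next = 0;
    for (int& x : v) {
        x = next++;  // fill v with 0, 1, ..., 9
    }
    return v;
}
```

A caller can then simply write `std::vector<int> numbers = GetSomeNumbers();` with no output parameter.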
In the most complicated situations of all, you would take this idea to the general extreme and use a smart pointer class such as shared_ptr, which is already in tr1. (Although I'd argue that if you seem to need it, you've possibly moved outside Standard C++'s sweet spot of applicability.)
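As a small sketch of shared ownership, here is the std::shared_ptr that became part of the standard library in C++11 (the answer refers to its tr1 predecessor). The Thingy definition and the function name are placeholders for illustration.

```cpp
#include <memory>

struct Thingy {
    int value = 42;
};

// Two shared_ptr objects on the stack co-own one Thingy on the heap.
// The Thingy is destroyed when the last owner goes away.
long owners_while_shared() {
    std::shared_ptr<Thingy> first = std::make_shared<Thingy>();
    std::shared_ptr<Thingy> second = first;  // second owner of the same object
    return first.use_count();                // counts both owners
}
```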
Answered by 1800 INFORMATION
You would also store an item on the heap if it needs to be used outside the scope of the function in which it is created. One idiom used with stack objects is called RAII: a stack-based object serves as a wrapper for a resource, and when the object is destroyed, the resource is cleaned up. Stack-based objects are easier to keep track of when exceptions might be thrown, because you don't need to concern yourself with deleting a heap-based object in an exception handler. This is why raw pointers are not normally used in modern C++; instead you would use a smart pointer, which can be a stack-based wrapper for a raw pointer to a heap-based object.
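The exception-safety point can be sketched as follows, using std::unique_ptr (C++11) as the stack-based wrapper. Tracked, work_that_throws and live_after_exception are hypothetical names; the live counter exists only so the cleanup can be observed.

```cpp
#include <memory>
#include <stdexcept>

struct Tracked {
    static int live;  // number of Tracked objects currently alive
    Tracked() { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

void work_that_throws() {
    // Heap object, owned by a smart pointer that lives on the stack.
    std::unique_ptr<Tracked> t(new Tracked());
    throw std::runtime_error("something went wrong");
    // No delete and no handler here: the unique_ptr destructor runs
    // during stack unwinding and frees the Tracked object.
}

int live_after_exception() {
    try {
        work_that_throws();
    } catch (const std::runtime_error&) {
        // nothing to clean up by hand
    }
    return Tracked::live;
}
```

With a raw `Tracked*` in place of the smart pointer, the object would leak unless the function caught the exception and deleted it manually.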
Answered by Nick
To add to the other answers, it can also be about performance, at least a little bit. Not that you should worry about this unless it's relevant for you, but:
Allocating in the heap requires finding and tracking a block of memory, which is not a constant-time operation (and takes some cycles and overhead). This can get slower as memory becomes fragmented, and/or you're getting close to using 100% of your address space. On the other hand, stack allocations are constant-time, basically "free" operations.
Another thing to consider (again, really only important if it becomes an issue) is that typically the stack size is fixed, and can be much lower than the heap size. So if you're allocating large objects or many small objects, you probably want to use the heap; if you run out of stack space, the runtime will hit the site's titular error, a stack overflow. Not usually a big deal, but another thing to consider.
Answered by unixman83
The stack is more efficient, and makes scoped data easier to manage.
But the heap should be used for anything larger than a few KB (it's easy in C++: just create a boost::scoped_ptr on the stack to hold a pointer to the allocated memory).
Consider a recursive algorithm that keeps calling into itself. It's very hard to limit or guess the total stack usage! Whereas on the heap, the allocator (malloc() or new) can indicate out-of-memory by returning NULL or throwing.
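Here is a minimal sketch of a heap allocation whose failure can be checked by the caller. It swaps in std::unique_ptr, the standard library counterpart of the boost::scoped_ptr mentioned above, and the function name try_allocate is hypothetical.

```cpp
#include <cstddef>
#include <memory>
#include <new>

// Returns an owning pointer to n bytes, or an empty pointer if the
// allocation failed. new (std::nothrow) reports failure as a null pointer
// instead of throwing std::bad_alloc, as plain new would.
std::unique_ptr<char[]> try_allocate(std::size_t n) {
    return std::unique_ptr<char[]>(new (std::nothrow) char[n]);
}
```

A caller would test the result, for example `if (auto buf = try_allocate(size)) { /* use buf.get() */ }`, and the memory is released when the unique_ptr goes out of scope. Running out of stack offers no comparable check.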
Source: Linux Kernel whose stack is no larger than 8KB!
Answered by Daniel Daranas
For completeness, you may read Miro Samek's article about the problems of using the heap in the context of embedded software.
Answered by Rob Lachlan
The choice of whether to allocate on the heap or on the stack is one that is made for you, depending on how your variable is allocated. If you allocate something dynamically, using a "new" call, you are allocating from the heap. If you declare something as a local variable, or as a parameter of a function, it is allocated on the stack. (A global variable is in neither place: it has static storage that lasts for the whole program.)
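The sketch below lines up the common kinds of declaration with where the object typically ends up. All names are hypothetical; note that in standard C++ a global variable has static storage duration, so it is neither a stack nor a heap object.

```cpp
int global_counter = 0;  // static storage: exists for the whole program

int next_id() {
    static int last_id = 0;  // also static storage, although declared in a function
    return ++last_id;
}

int storage_demo(int parameter) {    // parameter: automatic storage (stack)
    int local = parameter + 1;       // automatic storage, gone when the function returns
    int* dynamic = new int(local);   // dynamic storage: this int lives on the heap
    int result = *dynamic + global_counter;
    delete dynamic;                  // heap objects must be released explicitly
    return result;
}
```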
Answered by anand
In my opinion there are two deciding factors:
1) Scope of variable
2) Performance.
I would prefer to use the stack in most cases, but if you need access to a variable outside its scope, you can use the heap.
To improve performance when using the heap, you can also allocate one larger heap block and place several variables in it, rather than allocating each variable at a separate memory location.
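One simple reading of this advice, sketched under the assumption that the objects are all of one type, is to let a std::vector own a single contiguous block instead of calling new once per object. Particle and make_particles are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

struct Particle {
    float x;
    float y;
};

// One heap allocation for the whole group instead of one `new` per object.
std::vector<Particle> make_particles(std::size_t count) {
    std::vector<Particle> block;
    block.reserve(count);  // single allocation up front
    for (std::size_t i = 0; i < count; ++i) {
        block.push_back(Particle{static_cast<float>(i), 0.0f});
    }
    return block;
}
```

Besides saving allocator calls, the objects end up next to each other in memory, which tends to help cache behaviour when iterating over them.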
Answered by hAcKnRoCk
This has probably been answered quite well already. I would like to point you to the series of articles below for a deeper understanding of the low-level details. Alex Darby has a series of articles where he walks you through them with a debugger. Here is Part 3, about the stack: http://www.altdevblogaday.com/2011/12/14/c-c-low-level-curriculum-part-3-the-stack/