Original source: http://stackoverflow.com/questions/6441218/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Can a local variable's memory be accessed outside its scope?
Asked by Avi Shukron
I have the following code.
#include <iostream>

int* foo()
{
    int a = 5;
    return &a;
}

int main()
{
    int* p = foo();
    std::cout << *p;
    *p = 8;
    std::cout << *p;
}
And the code is just running with no runtime exceptions!
The output was 58
How can it be? Isn't the memory of a local variable inaccessible outside its function?
Answered by Eric Lippert
How can it be? Isn't the memory of a local variable inaccessible outside its function?
You rent a hotel room. You put a book in the top drawer of the bedside table and go to sleep. You check out the next morning, but "forget" to give back your key. You steal the key!
A week later, you return to the hotel, do not check in, sneak into your old room with your stolen key, and look in the drawer. Your book is still there. Astonishing!
How can that be? Aren't the contents of a hotel room drawer inaccessible if you haven't rented the room?
Well, obviously that scenario can happen in the real world no problem. There is no mysterious force that causes your book to disappear when you are no longer authorized to be in the room. Nor is there a mysterious force that prevents you from entering a room with a stolen key.
The hotel management is not required to remove your book. You didn't make a contract with them that said that if you leave stuff behind, they'll shred it for you. If you illegally re-enter your room with a stolen key to get it back, the hotel security staff is not required to catch you sneaking in. You didn't make a contract with them that said "if I try to sneak back into my room later, you are required to stop me." Rather, you signed a contract with them that said "I promise not to sneak back into my room later", a contract which you broke.
In this situation anything can happen. The book can be there -- you got lucky. Someone else's book can be there and yours could be in the hotel's furnace. Someone could be there right when you come in, tearing your book to pieces. The hotel could have removed the table and book entirely and replaced it with a wardrobe. The entire hotel could be just about to be torn down and replaced with a football stadium, and you are going to die in an explosion while you are sneaking around.
You don't know what is going to happen; when you checked out of the hotel and stole a key to illegally use later, you gave up the right to live in a predictable, safe world because you chose to break the rules of the system.
C++ is not a safe language. It will cheerfully allow you to break the rules of the system. If you try to do something illegal and foolish like going back into a room you're not authorized to be in and rummaging through a desk that might not even be there anymore, C++ is not going to stop you. Safer languages than C++ solve this problem by restricting your power -- by having much stricter control over keys, for example.
UPDATE
Holy goodness, this answer is getting a lot of attention. (I'm not sure why -- I considered it to be just a "fun" little analogy, but whatever.)
I thought it might be germane to update this a bit with a few more technical thoughts.
Compilers are in the business of generating code which manages the storage of the data manipulated by that program. There are lots of different ways of generating code to manage memory, but over time two basic techniques have become entrenched.
The first is to have some sort of "long lived" storage area where the "lifetime" of each byte in the storage -- that is, the period of time when it is validly associated with some program variable -- cannot be easily predicted ahead of time. The compiler generates calls into a "heap manager" that knows how to dynamically allocate storage when it is needed and reclaim it when it is no longer needed.
The second method is to have a “short-lived” storage area where the lifetime of each byte is well known. Here, the lifetimes follow a “nesting” pattern. The longest-lived of these short-lived variables will be allocated before any other short-lived variables, and will be freed last. Shorter-lived variables will be allocated after the longest-lived ones, and will be freed before them. The lifetime of these shorter-lived variables is “nested” within the lifetime of longer-lived ones.
Local variables follow the latter pattern; when a method is entered, its local variables come alive. When that method calls another method, the new method's local variables come alive. They'll be dead before the first method's local variables are dead. The relative order of the beginnings and endings of lifetimes of storages associated with local variables can be worked out ahead of time.
For this reason, local variables are usually generated as storage on a "stack" data structure, because a stack has the property that the first thing pushed on it is going to be the last thing popped off.
It's like the hotel decides to only rent out rooms sequentially, and you can't check out until everyone with a room number higher than you has checked out.
So let's think about the stack. In many operating systems you get one stack per thread and the stack is allocated to be a certain fixed size. When you call a method, stuff is pushed onto the stack. If you then pass a pointer to the stack back out of your method, as the original poster does here, that's just a pointer to the middle of some entirely valid million-byte memory block. In our analogy, you check out of the hotel; when you do, you just checked out of the highest-numbered occupied room. If no one else checks in after you, and you go back to your room illegally, all your stuff is guaranteed to still be there in this particular hotel.
We use stacks for temporary stores because they are really cheap and easy. An implementation of C++ is not required to use a stack for storage of locals; it could use the heap. It doesn't, because that would make the program slower.
An implementation of C++ is not required to leave the garbage you left on the stack untouched so that you can come back for it later illegally; it is perfectly legal for the compiler to generate code that turns back to zero everything in the "room" that you just vacated. It doesn't because again, that would be expensive.
An implementation of C++ is not required to ensure that when the stack logically shrinks, the addresses that used to be valid are still mapped into memory. The implementation is allowed to tell the operating system "we're done using this page of stack now. Until I say otherwise, issue an exception that destroys the process if anyone touches the previously-valid stack page". Again, implementations do not actually do that because it is slow and unnecessary.
Instead, implementations let you make mistakes and get away with it. Most of the time. Until one day something truly awful goes wrong and the process explodes.
This is problematic. There are a lot of rules and it is very easy to break them accidentally. I certainly have many times. And worse, the problem often only surfaces when memory is detected to be corrupt billions of nanoseconds after the corruption happened, when it is very hard to figure out who messed it up.
More memory-safe languages solve this problem by restricting your power. In "normal" C# there simply is no way to take the address of a local and return it or store it for later. You can take the address of a local, but the language is cleverly designed so that it is impossible to use it after the lifetime of the local ends. In order to take the address of a local and pass it back, you have to put the compiler in a special "unsafe" mode, and put the word "unsafe" in your program, to call attention to the fact that you are probably doing something dangerous that could be breaking the rules.
For further reading:
What if C# did allow returning references? Coincidentally that is the subject of today's blog post:
https://ericlippert.com/2011/06/23/ref-returns-and-ref-locals/
Why do we use stacks to manage memory? Are value types in C# always stored on the stack? How does virtual memory work? And many more topics in how the C# memory manager works. Many of these articles are also germane to C++ programmers:
Answered by Rena
What you're doing here is simply reading and writing to memory that used to be the address of the variable a. Now that you're outside of foo, it's just a pointer to some random memory area. It just so happens that in your example, that memory area does exist and nothing else is using it at the moment. You don't break anything by continuing to use it, and nothing else has overwritten it yet. Therefore, the 5 is still there. In a real program, that memory would be re-used almost immediately and you'd break something by doing this (though the symptoms may not appear until much later!)
When you return from foo, you tell the OS that you're no longer using that memory and it can be reassigned to something else. If you're lucky and it never does get reassigned, and the OS doesn't catch you using it again, then you'll get away with the lie. Chances are, though, that you'll end up writing over whatever else ends up with that address.
Now if you're wondering why the compiler doesn't complain, it's probably because foo got eliminated by optimization. It usually will warn you about this sort of thing. C assumes you know what you're doing though, and technically you haven't violated scope here (there's no reference to a itself outside of foo), only memory access rules, which only triggers a warning rather than an error.
In short: this won't usually work, but sometimes will by chance.
Answered by msw
Because the storage space wasn't stomped on just yet. Don't count on that behavior.
Answered by Michael
A little addition to all the answers:
If you do something like this:
#include <stdio.h>
#include <stdlib.h>

int* foo() {
    int a = 5;
    return &a;
}

void boo() {
    int a = 7;
}

int main() {
    int* p = foo();
    boo();
    printf("%d\n", *p);
}
The output will probably be: 7
That is because after returning from foo() the stack is freed and then reused by boo(). If you disassemble the executable you will see it clearly.
Answered by Charles Brunet
In C++, you can access any address, but it doesn't mean you should. The address you are accessing is no longer valid. It works because nothing else scrambled the memory after foo returned, but it could crash under many circumstances. Try analyzing your program with Valgrind, or even just compiling it optimized, and see...
Answered by Kerrek SB
You never throw a C++ exception by accessing invalid memory. You are just giving an example of the general idea of referencing an arbitrary memory location. I could do the same like this:
unsigned int q = 123456;
*(double*)(q) = 1.2;
Here I am simply treating 123456 as the address of a double and writing to it. Any number of things could happen:
- q might in fact genuinely be a valid address of a double, e.g. double p; q = &p;.
- q might point somewhere inside allocated memory and I just overwrite 8 bytes in there.
- q points outside allocated memory and the operating system's memory manager sends a segmentation fault signal to my program, causing the runtime to terminate it.
- You win the lottery.
The way you set it up it is a bit more reasonable that the returned address points into a valid area of memory, as it will probably just be a little further down the stack, but it is still an invalid location that you cannot access in a deterministic fashion.
Nobody will automatically check the semantic validity of memory addresses like that for you during normal program execution. However, a memory debugger such as valgrind will happily do this, so you should run your program through it and witness the errors.
Answered by gastush
Did you compile your program with the optimiser enabled? The foo() function is quite simple and might have been inlined or replaced in the resulting code.
But I agree with Mark B that the resulting behavior is undefined.
Answered by Chang Peng
Your problem has nothing to do with scope. In the code you show, the function main does not see the names in the function foo, so you can't access a in foo directly with this name outside foo.
The problem you are having is why the program doesn't signal an error when referencing illegal memory. This is because the C++ standard does not specify a very clear boundary between illegal memory and legal memory. Referencing something in a popped stack frame sometimes causes an error and sometimes does not. It depends. Don't count on this behavior. Assume it will always result in an error when you program, but assume it will never signal an error when you debug.
Answered by Brian R. Bondy
You are just returning a memory address; that's allowed, but it is probably an error.
Yes, if you try to dereference that memory address you will have undefined behavior.
#include <iostream>

int* ref() {
    int tmp = 100;
    return &tmp;  // tmp's lifetime ends when ref() returns
}

int main() {
    int* a = ref();
    // Up to this point the results are still defined.
    // You can even print the address that was returned,
    // but yes, this is probably a bug.
    std::cout << *a << std::endl;  // undefined results
}
Answered by sam
Pay attention to all warnings. Do not only solve errors.
GCC shows this warning:
warning: address of local variable 'a' returned
This is the power of C++. You should care about memory. With the -Werror flag, this warning becomes an error and now you have to debug it.