C++: What exactly are the base pointer and stack pointer? What do they point to?

Note: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/1395591/

Date: 2020-08-27 19:51:17  Source: igfitidea

What is exactly the base pointer and stack pointer? To what do they point?

Tags: c++, c, assembly, x86

Asked by devoured elysium

Using this example from Wikipedia, in which DrawSquare() calls DrawLine(),

[Figure: call stack diagram from Wikipedia]

(Note that this diagram has high addresses at the bottom and low addresses at the top.)

Could anyone explain to me what ebp and esp are in this context?

From what I see, I'd say the stack pointer always points to the top of the stack, and the base pointer to the beginning of the current function? Or what?



edit: I mean this in the context of windows programs

edit2: And how does eip work, too?

edit3: I have the following code from MSVC++:

var_C= dword ptr -0Ch
var_8= dword ptr -8
var_4= dword ptr -4
hInstance= dword ptr  8
hPrevInstance= dword ptr  0Ch
lpCmdLine= dword ptr  10h
nShowCmd= dword ptr  14h

All of them seem to be dwords, thus taking 4 bytes each. So I can see there is a gap from hInstance to var_4 of 4 bytes. What are they? I assume it is the return address, as can be seen in wikipedia's picture?



(editor's note: removed a long quote from Michael's answer, which doesn't belong in the question, but a followup question was edited in):

This is because the flow of the function call is:

* Push parameters (hInstance, etc.)
* Call function, which pushes return address
* Push ebp
* Allocate space for locals

My question (last, I hope!) now is: what exactly happens from the instant I push the arguments of the function I want to call up to the end of the prolog? I want to know how ebp and esp evolve during those moments (I already understand how the prolog works; I just want to know what happens after I have pushed the arguments on the stack and before the prolog).

Answered by Michael

esp is, as you say, the top of the stack.

ebp is usually set to esp at the start of the function. Function parameters and local variables are accessed by adding and subtracting, respectively, a constant offset from ebp. All x86 calling conventions define ebp as being preserved across function calls. ebp itself actually points to the previous frame's base pointer, which is what lets a debugger walk the stack and view the local variables of other frames.
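The "ebp points to the previous frame's base pointer" idea can be sketched as a walk over a linked list. This is only a toy model in C++, not how any particular debugger does it: the memory map, the addresses, the `walk_stack` and `example_memory` names, and the convention that the outermost frame stores a saved ebp of 0 are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Toy memory: address -> 32-bit value. All addresses below are made up.
using Memory = std::map<std::uint32_t, std::uint32_t>;

// Follow the chain of saved frame pointers, collecting return addresses.
// Assumes every frame was set up with `push ebp; mov ebp, esp`, so that
// [ebp] holds the caller's ebp and [ebp+4] holds the return address, and
// that the outermost frame stores a saved ebp of 0 (an assumption).
std::vector<std::uint32_t> walk_stack(const Memory& mem, std::uint32_t ebp) {
    std::vector<std::uint32_t> return_addresses;
    while (ebp != 0) {
        return_addresses.push_back(mem.at(ebp + 4)); // return address of this frame
        ebp = mem.at(ebp);                           // step to the caller's frame
    }
    return return_addresses;
}

// Three nested frames, innermost first: DrawLine, DrawSquare, main.
Memory example_memory() {
    return Memory{
        {0x0FD0, 0x0FE0}, {0x0FD4, 0x401050}, // DrawLine: saved ebp, return into DrawSquare
        {0x0FE0, 0x0FF0}, {0x0FE4, 0x401020}, // DrawSquare: saved ebp, return into main
        {0x0FF0, 0x0000}, {0x0FF4, 0x401000}, // main: outermost frame
    };
}
```

Calling `walk_stack(example_memory(), 0x0FD0)` starts at the innermost frame and yields one return address per frame, innermost first.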

Most function prologs look something like:

push ebp      ; Preserve current frame pointer
mov ebp, esp  ; Create new frame pointer pointing to current stack top
sub esp, 20   ; allocate 20 bytes worth of locals on stack.

Then later in the function you may have code like (presuming both local variables are 4 bytes)

mov [ebp-4], eax    ; Store eax in first local
mov ebx, [ebp - 8]  ; Load ebx from second local

FPO, or frame pointer omission, is an optimization you can enable that eliminates this: ebp is used as another general-purpose register and locals are accessed directly off of esp. This makes debugging a bit more difficult, since the debugger can no longer directly access the stack frames of earlier function calls.
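To see why esp-relative addressing is harder to follow, here is a small arithmetic sketch in C++. It assumes the prolog from this answer (`push ebp; mov ebp, esp; sub esp, 20`); the `Regs` type and the function names are hypothetical, and only addresses are computed, no memory is touched.

```cpp
#include <cassert>
#include <cstdint>

// Toy register file holding just the two registers of interest.
struct Regs {
    std::uint32_t esp;
    std::uint32_t ebp;
};

// Address of the first local, [ebp-4], computed from the frame pointer.
std::uint32_t local_via_ebp(const Regs& r) { return r.ebp - 4; }

// The same address computed from esp, as an FPO build has to do.
// `bytes_below_local` is how far esp currently sits below that local.
std::uint32_t local_via_esp(const Regs& r, std::uint32_t bytes_below_local) {
    return r.esp + bytes_below_local;
}

// State right after `mov ebp, esp` and `sub esp, 20`.
Regs after_prolog(std::uint32_t esp_after_push_ebp) {
    Regs r{esp_after_push_ebp, esp_after_push_ebp};
    r.esp -= 20; // room for 20 bytes of locals
    return r;
}
```

Right after the prolog the first local is at `esp + 16`, but after one more 4-byte push it is at `esp + 20`, while `ebp - 4` stays valid throughout. The compiler tracks the changing offset for you; a debugger without symbol information cannot.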

EDIT:

For your updated question, the missing two entries in the stack are:

var_C = dword ptr -0Ch
var_8 = dword ptr -8
var_4 = dword ptr -4
savedFramePointer = dword ptr 0   ; missing entry
return address = dword ptr 4      ; missing entry
hInstance = dword ptr  8
hPrevInstance = dword ptr  0Ch
lpCmdLine = dword ptr  10h
nShowCmd = dword ptr  14h

This is because the flow of the function call is:

  • Push parameters (hInstance, etc.)
  • Call function, which pushes return address
  • Push ebp
  • Allocate space for locals
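The four steps can be replayed on a toy stack to check that the offsets in the listing come out right. This is a minimal C++ sketch, assuming a 256-byte simulated stack and made-up values for the arguments, the return address, and the caller's ebp; `Machine` and `call_winmain` are hypothetical names, not anything MSVC generates.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model of a 32-bit x86 stack: addresses are byte offsets into `mem`.
struct Machine {
    std::vector<std::uint32_t> mem = std::vector<std::uint32_t>(64, 0); // 256 bytes
    std::uint32_t esp = 256; // empty stack: just past the highest address
    std::uint32_t ebp = 0;

    std::uint32_t& at(std::uint32_t addr) {
        assert(addr % 4 == 0 && addr / 4 < mem.size());
        return mem[addr / 4];
    }
    void push(std::uint32_t v) { esp -= 4; at(esp) = v; }
};

// Replays the call flow with placeholder values.
Machine call_winmain() {
    Machine m;
    m.ebp = 0xAAAA0000; // caller's frame pointer (made up)
    // 1. Push parameters, last one first.
    m.push(5);          // nShowCmd
    m.push(0x3000);     // lpCmdLine
    m.push(0);          // hPrevInstance
    m.push(0x400000);   // hInstance
    // 2. call: pushes the return address.
    m.push(0x401234);
    // 3. push ebp ; mov ebp, esp
    m.push(m.ebp);
    m.ebp = m.esp;
    // 4. sub esp, 0Ch: room for var_4, var_8 and var_C.
    m.esp -= 12;
    return m;
}
```

In the resulting state, `[ebp]` holds the saved frame pointer, `[ebp+4]` the return address, `[ebp+8]` hInstance, and `[ebp+14h]` nShowCmd, matching the listing.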

Answered by David R Tribble

ESP is the current stack pointer, which changes any time a word or address is pushed onto or popped off the stack. EBP gives the compiler a more convenient way to keep track of a function's parameters and local variables than using ESP directly.

Generally (and this may vary from compiler to compiler), all of the arguments to a function being called are pushed onto the stack by the calling function (usually in the reverse order that they're declared in the function prototype, but this varies). Then the function is called, which pushes the return address (EIP) onto the stack.

Upon entry to the function, the old EBP value is pushed onto the stack and EBP is set to the value of ESP. Then the ESP is decremented (because the stack grows downward in memory) to allocate space for the function's local variables and temporaries. From that point on, during the execution of the function, the arguments to the function are located on the stack at positive offsets from EBP (because they were pushed prior to the function call), and the local variables are located at negative offsets from EBP (because they were allocated on the stack after the function entry). That's why the EBP is called the frame pointer, because it points to the center of the function call frame.

Upon exit, all the function has to do is set ESP to the value of EBP (which deallocates the local variables from the stack, and exposes the entry EBP on the top of the stack), then pop the old EBP value from the stack, and then the function returns (popping the return address into EIP).

Upon returning to the calling function, the caller can then increment ESP to remove the function arguments it pushed onto the stack just prior to the call. At this point, the stack is back in the same state it was in prior to invoking the called function.
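The whole round trip (push arguments, call, prolog, epilog, return, caller cleanup) can be traced on a toy stack. The C++ sketch below assumes a caller-cleans-up convention with two 4-byte arguments and 8 bytes of locals; the `Machine`, `Trace` and `call_and_return` names and all the values are made up for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy 32-bit stack machine; addresses are byte offsets into `mem`.
struct Machine {
    std::vector<std::uint32_t> mem = std::vector<std::uint32_t>(32, 0); // 128 bytes
    std::uint32_t esp = 128;
    std::uint32_t ebp = 0;

    std::uint32_t& at(std::uint32_t addr) {
        assert(addr % 4 == 0 && addr / 4 < mem.size());
        return mem[addr / 4];
    }
    void push(std::uint32_t v) { esp -= 4; at(esp) = v; }
    std::uint32_t pop() { std::uint32_t v = at(esp); esp += 4; return v; }
};

// Values of ESP recorded at interesting points of the call.
struct Trace {
    std::uint32_t before, in_body, after_ret, after_cleanup;
};

Trace call_and_return(Machine& m) {
    Trace t{};
    t.before = m.esp;
    m.push(2);                   // caller: push second argument
    m.push(1);                   // caller: push first argument
    m.push(0x401234);            // call: push return address (made up)
    m.push(m.ebp);               // callee: push ebp
    m.ebp = m.esp;               //         mov ebp, esp
    m.esp -= 8;                  //         sub esp, 8 (locals)
    t.in_body = m.esp;
    m.esp = m.ebp;               // callee: mov esp, ebp (drop locals)
    m.ebp = m.pop();             //         pop ebp (restore caller's frame)
    std::uint32_t eip = m.pop(); //         ret (pop return address into EIP)
    (void)eip;
    t.after_ret = m.esp;
    m.esp += 8;                  // caller: add esp, 8 (remove arguments)
    t.after_cleanup = m.esp;
    return t;
}
```

After the call, `after_cleanup` equals `before` and `m.ebp` holds whatever it held on entry, which is the "stack is back in the same state" property described above.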

Answered by Robert Cartaino

You have it right. The stack pointer points to the top item on the stack, and the base pointer points to the "previous" top of the stack, before the function was called.

When you call a function, any local variables are stored on the stack and the stack pointer is moved to make room for them (on x86 it is decremented, since the stack grows toward lower addresses). When you return from the function, all the local variables on the stack go out of scope. You do this by setting the stack pointer back to the base pointer (which was the "previous" top before the function call).

Doing memory allocation this way is very, very fast and efficient.

Answered by wigy

EDIT: For a better description, see x86 Disassembly/Functions and Stack Frames in a WikiBook about x86 assembly. I'll try to add some info you might be interested in if you use Visual Studio.

Storing the caller's EBP as the first local variable is called a standard stack frame, and it can be used with nearly all calling conventions on Windows. The conventions differ in whether the caller or the callee deallocates the passed parameters, and in which parameters are passed in registers, but these are orthogonal to the standard stack frame problem.

Speaking of Windows programs, you will probably use Visual Studio to compile your C++ code. Be aware that Microsoft uses an optimization called Frame Pointer Omission, which makes it nearly impossible to walk the stack without using the dbghelp library and the PDB file for the executable.

Frame Pointer Omission means that the compiler does not store the old EBP in a standard place and uses the EBP register for something else. You therefore have a hard time finding the caller's EIP without knowing how much space the local variables of a given function need. Of course, Microsoft provides an API that allows you to do stack walks even in this case, but looking up the symbol table database in PDB files takes too long for some use cases.

To avoid FPO in your compilation units, you need to avoid using /O2 or explicitly add /Oy- to the C++ compilation flags in your projects. You probably link against the C or C++ runtime, which uses FPO in the Release configuration, so you will have a hard time doing stack walks without dbghelp.dll.

Answered by jmucchiello

First of all, the stack pointer points to the bottom of the stack, since x86 stacks build from high address values to lower address values. The stack pointer marks where the next push (or call) will place its value: the instruction first decrements esp, then stores at the new address. Its operation is equivalent to the following C/C++ statements:

 // push eax
 *--esp = eax;
 // pop eax
 eax = *esp++;

 // a function call; in this case, the caller must clean up the function parameters
 mov eax, some value
 push eax
 call some address  // this pushes the address of the next instruction onto the
                    // stack and changes the instruction pointer to "some address"
 add esp, 4 // remove eax from the stack

 // a function
 push ebp // save the old frame pointer
 mov ebp, esp
 ... // do stuff
 pop ebp  // restore the old frame pointer
 ret
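For readers who want to run the push/pop equivalence, here is a minimal C++ sketch using a real pointer into a small array. `DownStack` is a hypothetical name, the 16-slot size is arbitrary, and the slots are 32-bit values to mirror the dword pushes discussed here.

```cpp
#include <cassert>
#include <cstdint>

// A tiny stack that grows toward lower addresses, like the x86 stack.
struct DownStack {
    std::uint32_t slots[16] = {};
    std::uint32_t* esp = slots + 16; // empty: one past the highest slot

    DownStack() = default;
    // `esp` points into this object's own array, so copying is disabled.
    DownStack(const DownStack&) = delete;
    DownStack& operator=(const DownStack&) = delete;

    void push(std::uint32_t value) {
        assert(esp > slots);      // guard against overflow
        *--esp = value;           // decrement first, then store
    }
    std::uint32_t pop() {
        assert(esp < slots + 16); // guard against underflow
        return *esp++;            // load first, then increment
    }
};
```

Pushing two values leaves `esp` two slots below its starting point, and popping returns them in reverse order.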

The base pointer is the top of the current frame. With the prologue shown above, ebp points to the saved old value of ebp, so you can restore the prior frame pointer. ebp+4 points to your return address. ebp+8 points to the first parameter of your function (or the this value of a class method, when it is passed on the stack). ebp-4 points to the first local variable of your function.

Answered by Wim ten Brink

It's been a long time since I've done Assembly programming, but this link might be useful...

The processor has a collection of registers which are used to store data. Some of these hold direct values while others point to an area within RAM. Registers do tend to be used for certain specific actions, and every operation in assembly will require a certain amount of data in specific registers.

The stack pointer is mostly used when you're calling other procedures. With modern compilers, a bunch of data will be dumped first on the stack, followed by the return address so the system will know where to return once it's told to return. The stack pointer will point at the next location where new data can be pushed to the stack, where it will stay until it's popped back again.

Base registers or segment registers just point to the address space of a large amount of data. Combined with a second register, the base pointer will divide the memory into huge blocks while the second register will point at an item within this block. Base pointers therefore point to the base of blocks of data.

Do keep in mind that Assembly is very CPU specific. The page I've linked to provides information about different types of CPUs.

Answered by Stephen Friederichs

Edit: Yeah, this is mostly wrong. It describes something entirely different, in case anyone is interested :)

Yes, the stack pointer points to the top of the stack (whether that's the first empty stack location or the last full one I'm unsure of). The base pointer points to the memory location of the instruction that's being executed. This is on the level of opcodes - the most basic instruction you can get on a computer. Each opcode and its parameters is stored in a memory location. One C or C++ or C# line could be translated to one opcode, or a sequence of two or more depending on how complex it is. These are written into program memory sequentially and executed. Under normal circumstances the base pointer is incremented one instruction. For program control (GOTO, IF, etc) it can be incremented multiple times or just replaced with the next memory address.

In this context, the functions are stored in program memory at a certain address. When the function is called, certain information is pushed on the stack that lets the program find its way back to where the function was called from, as well as the parameters to the function; then the address of the function in program memory is pushed into the base pointer. On the next clock cycle the computer starts executing instructions from that memory address. Then at some point it will RETURN to the memory location AFTER the instruction that called the function and continue from there.

Answered by Adarsha Kharel

esp stands for "Extended Stack Pointer", ebp for "Extended Base Pointer", and eip for "Extended Instruction Pointer". The Stack Pointer holds an offset within the stack segment. The Base Pointer also addresses the stack segment by default. The Instruction Pointer holds an offset within the code segment. Now, about the segments: in 16-bit real mode they are 64KB divisions of the processor's memory area. This process is known as Memory Segmentation. I hope this post was helpful.