Windows: getting the size of a C++ function
Note: this page reproduces a popular StackOverflow question. It is provided under the CC BY-SA 4.0 license: you are free to use and share it, but you must keep the same license, cite the original address and authors, and attribute it to the original authors (not me): StackOverflow
Original question: http://stackoverflow.com/questions/5655624/
Getting The Size of a C++ Function
Asked by Adam
I was reading this question because I'm trying to find the size of a function in a C++ program. It is hinted there that there may be a platform-specific way to do this. My target platform is Windows.
The method I currently have in my head is the following:
1. Obtain a pointer to the function
2. Increment the Pointer (& counter) until I reach the machine code value for ret
3. The counter will be the size of the function?
Edit1: To clarify what I mean by 'size': I mean the number of bytes (machine code) that make up the function.
Edit2: There have been a few comments asking why, or what I plan to do with this. The honest answer is that I have no particular intention, and I can't really see the benefits of knowing a function's length pre-compile time (although I'm sure there are some).
This seems like a valid method to me, will this work?
Accepted answer by Michael Madsen
No, this will not work:
- There is no guarantee that your function only contains a single ret instruction.
- Even if it does contain only a single ret, you can't just look at the individual bytes, because the corresponding value could appear simply as a value rather than as an instruction.
The first problem can possibly be worked around if you restrict your coding style to, say, only have a single point of return in your function, but the other basically requires a disassembler so you can tell the individual instructions apart.
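The second point can be made concrete with a few bytes. The following is only a sketch: the byte sequence is a hand-written stand-in for a tiny 32-bit x86 function, and kCode and first_ret_byte are names made up for this illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical machine code for a tiny 32-bit x86 function:
//   B8 C3 00 00 00   mov eax, 0xC3   (0xC3 here is data, an immediate operand)
//   C3               ret             (the real return instruction)
constexpr std::uint8_t kCode[] = {0xB8, 0xC3, 0x00, 0x00, 0x00, 0xC3};

// The approach from the question: stop at the first byte equal to the
// ret opcode (0xC3), without decoding any instructions.
constexpr std::size_t first_ret_byte(const std::uint8_t* code, std::size_t size) {
    for (std::size_t i = 0; i < size; ++i) {
        if (code[i] == 0xC3) return i;
    }
    return size;  // no 0xC3 byte found
}
```

Called as first_ret_byte(kCode, sizeof kCode), the scan stops at offset 1, inside the mov instruction, although the actual ret sits at offset 5.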
Answer by Remus Rusanu
It is possible to obtain all blocks of a function, but it is an unnatural question to ask what the 'size' of a function is. Optimized code will rearrange code blocks in the order of execution and will move seldom-used blocks (exception paths) into outer parts of the module. For more details, see Profile-Guided Optimizations, for example how Visual C++ achieves this in link-time code generation. So a function can start at address 0x00001000, branch at 0x00001100 into a jump at 0x20001000 and a ret, and have some exception handling code at 0x20001000. At 0x00001110 another function starts. What is the 'size' of your function? It does span from 0x00001000 to at least 0x20001000, but it 'owns' only a few blocks in that span. So your question should be unasked.
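To put numbers on the difference between what a function spans and what it owns, here is a small sketch. CodeBlock, owned_bytes and address_span are hypothetical names, and the 0x40-byte size of the exception block is invented, since the answer only gives its start address.

```cpp
#include <cstddef>
#include <cstdint>

// One contiguous run of machine code belonging to a function: [begin, end).
struct CodeBlock {
    std::uint64_t begin;
    std::uint64_t end;
};

// Bytes the function actually owns: the sum of its block sizes.
constexpr std::uint64_t owned_bytes(const CodeBlock* blocks, std::size_t count) {
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < count; ++i) total += blocks[i].end - blocks[i].begin;
    return total;
}

// Distance from the lowest block start to the highest block end.
// Requires count >= 1.
constexpr std::uint64_t address_span(const CodeBlock* blocks, std::size_t count) {
    std::uint64_t lo = blocks[0].begin, hi = blocks[0].end;
    for (std::size_t i = 1; i < count; ++i) {
        if (blocks[i].begin < lo) lo = blocks[i].begin;
        if (blocks[i].end > hi) hi = blocks[i].end;
    }
    return hi - lo;
}

// Hot part at 0x00001000..0x00001110 (addresses from the answer) plus a cold
// exception-handling block at 0x20001000 whose size is an assumption.
constexpr CodeBlock kBlocks[] = {
    {0x00001000, 0x00001110},
    {0x20001000, 0x20001040},
};
```

With these made-up blocks the function owns 0x150 bytes but spans 0x20000040 bytes, so neither number alone is "the size".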
There are other valid questions in this context, like the total number of instructions a function has (this can be determined from the program symbol database and from the image), and more importantly, the number of instructions in the frequently executed code path inside the function. All these are questions normally asked in the context of performance measurement, and there are tools that instrument code and can give very detailed answers.
Chasing pointers in memory and searching for ret will get you nowhere, I'm afraid. Modern code is way, way, way more complex than that.
Answer by Jordan
Wow, I use function size counting all the time and it has lots and lots of uses. Is it reliable? No way. Is it standard C++? No way. But that's why you need to check it in the disassembler to make sure it worked, every time you release a new version. Compiler flags can mess up the ordering.
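#include <windows.h>  // UINT32

// MSVC, 32-bit x86 only: __asm/_emit is not available in 64-bit builds.
static void funcIwantToCount()
{
    // do stuff
}

// The technique relies on this function being placed right after
// funcIwantToCount in the image; the four int3 bytes (0xCC) are the end marker.
static void funcToDelimitMyOtherFunc()
{
    __asm _emit 0xCC
    __asm _emit 0xCC
    __asm _emit 0xCC
    __asm _emit 0xCC
}

// Counts bytes from funcaddress up to the first 0xCCCCCCCC marker.
// The scan is unbounded: it only stops if the marker is really there.
int getlength( void *funcaddress )
{
    const unsigned char *p = (const unsigned char *)funcaddress;
    int length = 0;
    while (*(const UINT32 *)(p + length) != 0xCCCCCCCC)
        ++length;
    return length;
}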
It seems to work better with static functions. Global optimizations can kill it.
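A slightly more defensive variant of the same marker scan might look like the sketch below. It is not the code from this answer: length_to_marker is a hypothetical helper that takes an upper bound so the scan cannot run off into unrelated memory, and it compares byte by byte instead of doing a possibly unaligned 32-bit read. The byte array is a hand-written stand-in for compiled code.

```cpp
#include <cstddef>

// Returns the offset of the first 0xCC 0xCC 0xCC 0xCC run inside
// [code, code + max_len), or max_len if no such run is found.
constexpr std::size_t length_to_marker(const unsigned char* code, std::size_t max_len) {
    for (std::size_t i = 0; i + 4 <= max_len; ++i) {
        if (code[i] == 0xCC && code[i + 1] == 0xCC &&
            code[i + 2] == 0xCC && code[i + 3] == 0xCC) {
            return i;
        }
    }
    return max_len;
}

// Stand-in bytes: push ebp; mov ebp, esp; int3; pop ebp; ret; then the marker.
// The single 0xCC at offset 3 is deliberately not treated as the end.
constexpr unsigned char kFakeCode[] = {0x55, 0x8B, 0xEC, 0xCC, 0x5D,
                                       0xC3, 0xCC, 0xCC, 0xCC, 0xCC};
```

On the stand-in bytes the marker is found at offset 6; with a bound of 8 the marker is out of reach and the bound itself is returned. On a real function the same caveats apply as before: the result is only meaningful if the delimiter really follows the function in the image.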
P.S. I hate people asking why you want to do this, saying it's impossible, etc. Stop asking these questions, please. It makes you sound stupid. Programmers are often asked to do non-standard things, because new products almost always push the limits of what's available. If they don't, your product is probably a rehash of what's already been done. Boring!!!
Answer by user541686
This won't work... what if there's a jump, a dummy ret, and then the target of the jump? Your code will be fooled.
In general, it's impossible to do this with 100% accuracy because you have to predict all code paths, which is like solving the halting problem. You can get "pretty good" accuracy if you implement your own disassembler, but no solution will be nearly as easy as you imagine.
A "trick" would be to find out which function's code comes after the function that you're looking for, which would give pretty good results under certain (dangerous) assumptions. But then you'd have to know what function comes after your function, which, after optimizations, is pretty hard to figure out.
Edit 1:
What if the function doesn't even end with a ret instruction at all? It could very well just jmp back to its caller (though it's unlikely).
Edit 2:
Don't forget that x86, at least, has variable-length instructions...
Update:
For those saying that flow analysis isn't the same as solving the halting problem:
Consider what happens when you have code like:
foo:
....
jmp foo
You will have to follow the jump each time to figure out the end of the function, and you cannot ignore it past the first time because you don't know whether or not you're dealing with self-modifying code. (You could have inline assembly in your C++ code that modifies itself, for instance.) It could very well extend to some other place in memory, so your analyzer will (or should) end in an infinite loop, unless you tolerate false negatives.
Isn't that like the halting problem?
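For comparison, this is roughly what an analyzer looks like once it accepts exactly the assumption warned about above, namely that the code never modifies itself. It is a toy sketch: Insn, Kind and reachable_extent are invented for illustration, every instruction occupies one slot (unlike real x86), and only unconditional jumps are modeled.

```cpp
#include <cstddef>

enum class Kind { Plain, Jump, Ret };

// Toy instruction: one slot each, which real x86 instructions are not.
struct Insn {
    Kind kind;
    std::size_t target;  // only meaningful for Kind::Jump
};

// Follows control flow from slot 0 and returns one past the highest slot
// reached. `seen` marks visited slots, so a backward jump is followed once.
// Assumes at most 64 slots and code that never modifies itself.
constexpr std::size_t reachable_extent(const Insn* code, std::size_t count) {
    bool seen[64] = {};
    std::size_t highest = 0;
    std::size_t pc = 0;
    while (pc < count && pc < 64 && !seen[pc]) {
        seen[pc] = true;
        if (pc + 1 > highest) highest = pc + 1;
        if (code[pc].kind == Kind::Ret) break;
        pc = (code[pc].kind == Kind::Jump) ? code[pc].target : pc + 1;
    }
    return highest;
}

// foo: plain; plain; jmp foo   (the loop from the answer)
constexpr Insn kLoop[] = {{Kind::Plain, 0}, {Kind::Plain, 0}, {Kind::Jump, 0}};
```

The walk over kLoop terminates and reports an extent of 3 slots, but only because revisiting a slot is treated as "nothing new here". If the code could rewrite itself between visits, that shortcut would be unsound, which is the point made above.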
Answer by jkerian
The real solution to this is to dig into your compiler's documentation. The ARM compiler we use can be made to produce an assembly dump (code.dis), from which it's fairly trivial to subtract the offsets between a given mangled function label and the next mangled function label.
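Once the labels and offsets have been extracted from such a dump, the subtraction itself could be sketched as follows. The Label struct, the helper names, and the mangled names and offsets in kLabels are all made up; parsing a real code.dis file is not shown.

```cpp
#include <cstddef>
#include <cstdint>

// One function label from a listing, with its start offset.
struct Label {
    const char* name;
    std::uint32_t offset;
};

constexpr bool same_name(const char* a, const char* b) {
    while (*a != '\0' && *a == *b) { ++a; ++b; }
    return *a == *b;
}

// Size estimate: offset of the next label minus offset of the requested one.
// `labels` must be sorted by offset. Returns 0 when the name is missing or
// is the last label (there is no following label to subtract from).
constexpr std::uint32_t size_from_labels(const Label* labels, std::size_t count,
                                         const char* name) {
    for (std::size_t i = 0; i + 1 < count; ++i) {
        if (same_name(labels[i].name, name)) {
            return labels[i + 1].offset - labels[i].offset;
        }
    }
    return 0;
}

// Made-up mangled names and offsets, standing in for a real listing.
constexpr Label kLabels[] = {
    {"_Z4initv", 0x0000},
    {"_Z3runi", 0x0024},
    {"_Z4stopv", 0x009C},
};
```

Here size_from_labels(kLabels, 3, "_Z3runi") gives 0x78. Note that the figure includes any padding or literal data the toolchain places between the two labels.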
I'm not certain which tools you will need for this with a Windows target, however. It looks like the tools listed in the answer to this question might be what you're looking for.
Also note that I (working in the embedded space) assumed you were talking about post-compile-analysis. It still might be possible to examine these intermediate files programmatically as part of a build provided that:
- The target function is in a different object
- The build system has been taught the dependencies
- You know for sure that the compiler will build these object files
Note that I'm not entirely sure WHY you want to know this information. I've needed it in the past to be sure that I can fit a particular chunk of code in a very particular place in memory. I have to admit I'm curious what purpose this would have on a more general desktop-OS target.
Answer by Luke
This can work in very limited scenarios. I use it in part of a code injection utility I wrote. I don't remember where I found the information, but I have the following (C++ in VS2005):
#include <windows.h>

#pragma runtime_checks("", off)
static DWORD WINAPI InjectionProc(LPVOID lpvParameter)
{
    // do something
    return 0;
}
// Marker function: its address is used as the end of InjectionProc.
static DWORD WINAPI InjectionProcEnd()
{
    return 0;
}
#pragma runtime_checks("", on)
And then in some other function I have:
size_t cbInjectionProc = (size_t)InjectionProcEnd - (size_t)InjectionProc;
You have to turn off some optimizations and declare the functions as static to get this to work; I don't recall the specifics. I don't know if this is an exact byte count, but it is close enough. The size is only that of the immediate function; it doesn't include any other functions that may be called by that function. Aside from extreme edge cases like this, "the size of a function" is meaningless and useless.
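Because nothing guarantees that the linker keeps the two functions in source order, it may be worth computing the difference as a signed value and checking its sign before trusting it. The sketch below uses hypothetical names (first_function, second_function, address_gap); the result of converting a function pointer to an integer is implementation-defined.

```cpp
#include <cstdint>

static int first_function(int x) { return x * 2 + 1; }
static int second_function(int x) { return x - 3; }

// Signed distance in bytes from `from` to `to`. A negative or implausibly
// large value suggests the functions were reordered, in which case the
// "end marker minus start" trick does not describe a size at all.
template <typename F1, typename F2>
std::intptr_t address_gap(F1* from, F2* to) {
    return static_cast<std::intptr_t>(reinterpret_cast<std::uintptr_t>(to) -
                                      reinterpret_cast<std::uintptr_t>(from));
}
```

A caller would compute address_gap(&first_function, &second_function) and only treat the result as a size estimate when it is positive and small; even then it may include padding between the two functions.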
Answer by MrOMGWTF
Just set PAGE_EXECUTE_READWRITE protection on the memory at your function's address, then read it byte by byte. When you reach a 0xCC byte, the end of the function is at actual_reading_address - 1.
Answer by ahmd0
I'm posting this to say two things:
1) Most of the answers given here are really bad and will break easily. If you use the C function pointer (using the function name), in a debug build of your executable, and possibly in other circumstances, it may point to a JMP shim that will not have the function body itself. Here's an example. If I do the following for the function I defined below:
FARPROC pfn = (FARPROC)some_function_with_possibility_to_get_its_size_at_runtime;
the pfn I get (for example: 0x7FF724241893) will point to a location that contains just a JMP instruction.
Additionally, a compiler can nest several of those shims, or branch your function code so that it will have multiple epilogs, or ret instructions. Heck, it may not even use a ret instruction. Then, there's no guarantee that functions themselves will be compiled and linked in the order you define them in the source code.
You can do all that stuff in assembly language, but not in C or C++.
2) So the above was the bad news. The good news is that the answer to the original question is: yes, there's a way (or a hack) to get the exact function size, but it comes with the following limitations:
- It works in 64-bit executables on Windows only.
- It is obviously Microsoft specific and is not portable.
- You have to do this at run-time.
The concept is simple -- utilize the way SEH is implemented in x64 Windows binaries. The compiler adds details of each function into the PE32+ header (into the IMAGE_DIRECTORY_ENTRY_EXCEPTION directory of the optional header) that you can use to obtain the exact function size. (In case you're wondering, this information is used for catching, handling and unwinding of exceptions in the __try/__except/__finally blocks.)
Here's a quick example:
//You will have to call this when your app initializes and then
//cache the size somewhere in the global variable because it will not
//change after the executable image is built.
size_t fn_size; //Will receive function size in bytes, or 0 if error
some_function_with_possibility_to_get_its_size_at_runtime(&fn_size);
and then:
#include <Windows.h>

//The function itself has to be defined for two types of a call:
// 1) when you call it just to get its size, and
// 2) for its normal operation
bool some_function_with_possibility_to_get_its_size_at_runtime(size_t* p_getSizeOnly = NULL)
{
    //This input parameter will define what we want to do:
    if(!p_getSizeOnly)
    {
        //Do this function's normal work
        //...
        return true;
    }
    else
    {
        //Get this function size
        //INFO: Works only in 64-bit builds on Windows!
        size_t nFnSz = 0;

        //One of the reasons why we have to do this at run-time is
        //so that we can get the address of a byte inside
        //the function body... we'll get it as this thread context:
        CONTEXT context = {0};
        RtlCaptureContext(&context);

        DWORD64 ImgBase = 0;
        RUNTIME_FUNCTION* pRTFn = RtlLookupFunctionEntry(context.Rip, &ImgBase, NULL);
        if(pRTFn)
        {
            nFnSz = pRTFn->EndAddress - pRTFn->BeginAddress;
        }

        *p_getSizeOnly = nFnSz;
        return false;
    }
}
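The lookup that the code above delegates to the OS can be pictured as a search over a sorted table of begin/end addresses. The following portable sketch only models that idea: FunctionEntry is a simplified stand-in for RUNTIME_FUNCTION, find_entry and function_size are hypothetical helpers, the table contents are invented, and details of the real mechanism such as unwind data are ignored.

```cpp
#include <cstddef>
#include <cstdint>

// Simplified stand-in for an x64 RUNTIME_FUNCTION record: image-relative
// addresses of the first byte and one past the last byte of a function.
struct FunctionEntry {
    std::uint32_t begin_rva;
    std::uint32_t end_rva;
};

// Binary search over entries sorted by begin_rva. Returns the entry whose
// [begin_rva, end_rva) range contains `rva`, or nullptr if none does.
constexpr const FunctionEntry* find_entry(const FunctionEntry* table,
                                          std::size_t count, std::uint32_t rva) {
    std::size_t lo = 0, hi = count;
    while (lo < hi) {
        const std::size_t mid = lo + (hi - lo) / 2;
        if (rva < table[mid].begin_rva) hi = mid;
        else if (rva >= table[mid].end_rva) lo = mid + 1;
        else return &table[mid];
    }
    return nullptr;
}

// Size of the function containing `rva`, or 0 if no entry covers it.
constexpr std::uint32_t function_size(const FunctionEntry* table,
                                      std::size_t count, std::uint32_t rva) {
    const FunctionEntry* e = find_entry(table, count, rva);
    return e ? e->end_rva - e->begin_rva : 0;
}

// Made-up table; the gap between 0x1040 and 0x1050 models padding.
constexpr FunctionEntry kTable[] = {
    {0x1000, 0x1040},
    {0x1050, 0x10F3},
    {0x1100, 0x1180},
};
```

Any address inside a function, such as 0x1060 here, resolves to the same entry and therefore the same size (0xA3), which is why capturing the current instruction pointer is enough in the real code.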
Answer by Michael Chinen
I think it will work on Windows programs built with MSVC, since the 'ret' seems to always come at the end (even if there are branches that return early, the compiler emits a jne to go to the end). However, you will need some kind of disassembler library to figure out the length of the current opcode, as instructions are variable-length on x86. If you don't do this, you'll run into false positives.
I would not be surprised if there are cases this doesn't catch.
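To show what "figuring out the opcode length" buys you, here is a toy length decoder. It is nowhere near a real disassembler: insn_length knows only a handful of one-byte opcodes of 32-bit x86, and both function names and the sample bytes are made up for this sketch.

```cpp
#include <cstddef>
#include <cstdint>

// Length in bytes of the instruction starting with `opcode`, for a tiny
// subset of 32-bit x86. Returns 0 for anything this sketch does not know.
constexpr std::size_t insn_length(std::uint8_t opcode) {
    if (opcode >= 0x50 && opcode <= 0x5F) return 1;  // push/pop r32
    if (opcode >= 0xB8 && opcode <= 0xBF) return 5;  // mov r32, imm32
    if (opcode == 0x90 || opcode == 0xC3 || opcode == 0xCC) return 1;  // nop, ret, int3
    if (opcode == 0xEB) return 2;                    // jmp rel8
    if (opcode == 0xE9) return 5;                    // jmp rel32
    return 0;
}

// Walks instruction by instruction and returns the offset of the first
// ret (0xC3) that starts an instruction, or `size` if none is found or an
// unknown opcode is hit.
constexpr std::size_t first_ret_instruction(const std::uint8_t* code, std::size_t size) {
    std::size_t pos = 0;
    while (pos < size) {
        if (code[pos] == 0xC3) return pos;
        const std::size_t len = insn_length(code[pos]);
        if (len == 0) return size;
        pos += len;
    }
    return size;
}

// push ebp; mov eax, 0xC3; pop ebp; ret
constexpr std::uint8_t kBytes[] = {0x55, 0xB8, 0xC3, 0x00, 0x00, 0x00, 0x5D, 0xC3};
```

Stepping by instruction length skips the 0xC3 immediate at offset 2 and finds the ret at offset 7, whereas a plain byte scan would report a false positive at offset 2. A real implementation would need a full decoder library to cover prefixes, ModRM bytes and the rest of the instruction set.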
Answer by Thomas Matthews
There are no facilities in Standard C++ to obtain the size or length of a function.
See my answer here: Is it possible to load a function into some allocated memory and run it from there?
In general, knowing the size of a function is used in embedded systems when copying executable code from a read-only source (or a slow memory device, such as a serial Flash) into RAM. Desktop and other operating systems load functions into memory using other techniques, such as dynamic or shared libraries.