Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/9885545/

Time: 2020-08-06 05:25:37  Source: igfitidea

How to find the main function's entry point of elf executable file without any symbolic information?

linux reverse elf

Asked by Lucky Man

I developed a small C++ program on Ubuntu Linux 11.10. Now I want to reverse engineer it. I am a beginner. I use these tools: GDB 7.0, the hte editor, and hexeditor.

The first time, it was pretty easy. With the help of symbol information I found the address of the main function and did everything I needed. Then I stripped (--strip-all) the ELF executable and ran into problems. I know that the main function starts at 0x8960 in this program, but I have no idea how I would find this point without that knowledge. I tried debugging my program step by step with gdb, but it goes into __libc_start_main and then into ld-linux.so.3 (which finds and loads the shared libraries needed by a program). I debugged it for about 10 minutes. Of course, maybe in 20 minutes I could reach the main function's entry point, but it seems that an easier way has to exist.

What should I do to find the main function's entry point without any symbol information? Could you recommend some good books, sites, or other sources on reverse engineering ELF files with the help of gdb? Any help would be appreciated.

Answered by jkoshy

As far as I know, once a program has been stripped, there is no straightforward way to locate the function that the symbol main would have otherwise referenced.

The value of the symbol main is not required for program start-up: in the ELF format, the start of the program is specified by the e_entry field of the ELF executable header. This field normally points to the C library's initialization code, and not directly to main.
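
To make the role of e_entry concrete, here is a minimal Python sketch that reads the field straight from the ELF header. The function name read_elf_entry is hypothetical; the offsets assume the standard ELF header layout (16 bytes of e_ident, then e_type, e_machine, and e_version, then e_entry), and nothing beyond the header is validated.

```python
import struct


def read_elf_entry(header):
    """Return e_entry given at least the first 32 bytes of an ELF file."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    is_64 = header[4] == 2                     # EI_CLASS: 1 = 32-bit, 2 = 64-bit
    endian = "<" if header[5] == 1 else ">"    # EI_DATA: 1 = little, 2 = big endian
    # e_entry sits at offset 24: e_ident (16) + e_type (2) + e_machine (2) + e_version (4).
    # It is 4 bytes wide in ELF32 and 8 bytes wide in ELF64.
    fmt = endian + ("Q" if is_64 else "I")
    return struct.unpack_from(fmt, header, 24)[0]
```

A typical use would be to open the binary in "rb" mode, pass the first 64 bytes to read_elf_entry, and print the result with hex(). The value should match what readelf -h reports as the entry point address.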

While the C library's initialization code does call main() after it has set up the C run time environment, this call is a normal function call that gets fully resolved at link time.

In some cases, implementation-specific heuristics (i.e., specific knowledge of the internals of the C runtime) could be used to determine the location of main in a stripped executable. However, I am not aware of a portable way to do so.

Answered by Breno Leitão

If you have a heavily stripped binary, or even a packed one (for example with UPX), you can approach it in gdb the hard way:

$ readelf -h echo | grep Entry
Entry point address:               0x103120

Then you can set a breakpoint at that address in GDB:

$ gdb mybinary
(gdb) break * 0x103120
Breakpoint 1 at 0x103120
(gdb) r
Starting program: mybinary 
Breakpoint 1, 0x0000000000103120 in ?? ()

Then you can inspect the instructions at the entry point:

(gdb) x/10i 0x0000000000103120
=> 0x103120:    bl      0x103394
  0x103124: dcbtst  0,r5
  0x103128: mflr    r13
  0x10312c: cmplwi  r7,2
  0x103130: bne     0x103214
  0x103134: stw     r5,0(r6)
  0x103138: add     r4,r4,r3
  0x10313c: lis     r0,-32768
  0x103140: lis     r9,-32768
  0x103144: addi    r3,r3,-1

I hope it helps

Answered by julian

Locating main() in a stripped Linux ELF binary is straightforward. No symbol information is required.

The prototype for __libc_start_main is:

int __libc_start_main(int (*main) (int, char**, char**), 
                      int argc, 
                      char *__unbounded *__unbounded ubp_av, 
                      void (*init) (void), 
                      void (*fini) (void), 
                      void (*rtld_fini) (void), 
                      void (*__unbounded stack_end));

The runtime memory address of main() is the argument corresponding to the first parameter, int (*main) (int, char**, char**). On 32-bit x86, this means that the last value pushed onto the runtime stack before the call to __libc_start_main is the address of main(), since arguments are pushed in the reverse order of their corresponding parameters in the function definition. On x86-64, the first argument is passed in the %rdi register instead, so the address of main() is the value loaded into %rdi just before the call.
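
The push order on 32-bit x86 can be illustrated with a few lines of Python. This is only a toy model of the calling convention, not a disassembler: the parameter names are taken from the prototype above, and cdecl_push_order is a hypothetical helper.

```python
# Parameter names of __libc_start_main, in declaration order.
PARAMS = ["main", "argc", "ubp_av", "init", "fini", "rtld_fini", "stack_end"]


def cdecl_push_order(params):
    """Return the arguments in the order a 32-bit cdecl caller pushes them."""
    # The last parameter is pushed first, the first parameter last.
    return list(reversed(params))


pushed = cdecl_push_order(PARAMS)
# The final push before the call instruction carries the first parameter.
last_pushed = pushed[-1]
```

Here last_pushed is "main", which is why the operand of the final push before the call is the address to break on.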

One can enter main() in gdb in 4 steps:

  1. Find the program entry point
  2. Find where __libc_start_main is called
  3. Set a breakpoint at the address last saved on the stack prior to the call to __libc_start_main
  4. Let program execution continue until the breakpoint for main() is hit
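
Steps 1 and 2 can be run non-interactively once the entry point is known. The sketch below only builds the gdb command line; it does not run gdb. The helper names gdb_entry_command and as_shell are hypothetical, and the sketch assumes a gdb that accepts the -nh, -batch, and -ex options.

```python
import shlex


def gdb_entry_command(binary, entry, count=13):
    """Build a gdb command line that stops at `entry` and disassembles from there."""
    return [
        "gdb", "-q", "-nh", "-batch",
        "-ex", "break *{:#x}".format(entry),   # breakpoint on the entry point
        "-ex", "run",                          # run until that breakpoint is hit
        "-ex", "x/{}i $pc".format(count),      # list the start-up instructions
        binary,
    ]


def as_shell(cmd):
    """Render the argument list as a single shell-safe string."""
    return " ".join(shlex.quote(part) for part in cmd)
```

The returned list can be passed to subprocess.run, or printed with as_shell and pasted into a terminal. The address of main() still has to be read from the resulting listing by hand for steps 3 and 4.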

The process is the same for both 32-bit and 64-bit ELF binaries.

Entering main() in an example stripped 32-bit ELF binary called "test_32":

$ gdb -q -nh test_32
Reading symbols from test_32...(no debugging symbols found)...done.
(gdb) info file                                  # step 1
Symbols from "/home/c/test_32".
Local exec file:
    `/home/c/test_32', file type elf32-i386.
    Entry point: 0x8048310
    < output snipped >
(gdb) break *0x8048310
Breakpoint 1 at 0x8048310
(gdb) run
Starting program: /home/c/test_32 

Breakpoint 1, 0x08048310 in ?? ()
(gdb) x/13i $eip                                 # step 2
=> 0x8048310:   xor    %ebp,%ebp
   0x8048312:   pop    %esi
   0x8048313:   mov    %esp,%ecx
   0x8048315:   and    $0xfffffff0,%esp
   0x8048318:   push   %eax
   0x8048319:   push   %esp
   0x804831a:   push   %edx
   0x804831b:   push   $0x80484a0
   0x8048320:   push   $0x8048440
   0x8048325:   push   %ecx
   0x8048326:   push   %esi
   0x8048327:   push   $0x804840b                # address of main()
   0x804832c:   call   0x80482f0 <__libc_start_main@plt>
(gdb) break *0x804840b                           # step 3
Breakpoint 2 at 0x804840b
(gdb) continue                                   # step 4
Continuing.

Breakpoint 2, 0x0804840b in ?? ()                # now in main()
(gdb) x/x $esp+4
0xffffd110: 0x00000001                           # argc = 1
(gdb) x/s **(char ***) ($esp+8)
0xffffd35c: "/home/c/test_32"                    # argv[0]
(gdb) 

Entering main() in an example stripped 64-bit ELF binary called "test_64":

$ gdb -q -nh test_64
Reading symbols from test_64...(no debugging symbols found)...done.
(gdb) info file                                  # step 1
Symbols from "/home/c/test_64".
Local exec file:
    `/home/c/test_64', file type elf64-x86-64.
    Entry point: 0x400430
    < output snipped >
(gdb) break *0x400430
Breakpoint 1 at 0x400430
(gdb) run 
Starting program: /home/c/test_64 

Breakpoint 1, 0x0000000000400430 in ?? ()
(gdb) x/11i $rip                                 # step 2
=> 0x400430:    xor    %ebp,%ebp
   0x400432:    mov    %rdx,%r9
   0x400435:    pop    %rsi
   0x400436:    mov    %rsp,%rdx
   0x400439:    and    $0xfffffffffffffff0,%rsp
   0x40043d:    push   %rax
   0x40043e:    push   %rsp
   0x40043f:    mov    $0x4005c0,%r8
   0x400446:    mov    $0x400550,%rcx
   0x40044d:    mov    $0x400526,%rdi            # address of main()
   0x400454:    callq  0x400410 <__libc_start_main@plt>
(gdb) break *0x400526                            # step 3
Breakpoint 2 at 0x400526
(gdb) continue                                   # step 4
Continuing.

Breakpoint 2, 0x0000000000400526 in ?? ()        # now in main()
(gdb) print $rdi
$1 = 1                                           # argc = 1
(gdb) x/s **(char ***) ($rsp+16)
0x7fffffffe35c: "/home/c/test_64"                # argv[0]
(gdb) 
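
The same lookup can be attempted without reading the disassembly by eye. The Python sketch below searches the raw bytes at the entry point for the two instruction pairs that appear in the sessions for test_32 and test_64: push imm32 followed by call on 32-bit x86, and mov imm32,%rdi followed by call on x86-64. The function name guess_main and the patterns are assumptions tied to this particular start-up code. Position-independent executables and other C library builds typically load the address differently (for example with a RIP-relative lea), in which case this returns None or a wrong value.

```python
import re
import struct

# 32-bit x86: push imm32 (opcode 0x68) immediately followed by call rel32 (0xe8).
PUSH_CALL_32 = re.compile(rb"\x68(.{4})\xe8", re.DOTALL)
# x86-64: mov imm32,%rdi (48 c7 c7) immediately followed by call rel32 (0xe8).
MOV_RDI_CALL_64 = re.compile(rb"\x48\xc7\xc7(.{4})\xe8", re.DOTALL)


def guess_main(start_code, is_64):
    """Guess the address of main() from the bytes at the ELF entry point.

    Returns None when the expected instruction pair is not present.
    """
    pattern = MOV_RDI_CALL_64 if is_64 else PUSH_CALL_32
    match = pattern.search(start_code)
    if match is None:
        return None
    # The immediate operand is stored little-endian on x86.
    return struct.unpack("<I", match.group(1))[0]
```

The input bytes can be taken from the file at the offset that corresponds to e_entry, or saved from a live session with gdb's dump binary memory command. Because this is a byte-level match and not a real disassembly, the result is only a candidate and is worth confirming with a breakpoint as in step 3.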

A detailed treatment of program initialization, what occurs before main() is called, and how to get to main() can be found in Patrick Horgan's tutorial "Linux x86 Program Start Up or - How the heck do we get to main()?"