How to disassemble a binary executable in Linux to get the assembly code?

Notice: this page reproduces a popular StackOverflow question and its answers (originally published as a Chinese-English side-by-side translation). The content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow, original URL: http://stackoverflow.com/questions/5125896/

Date: 2020-08-04 00:24:31  Source: igfitidea

c++ linux assembly executable disassembly

Asked by Syntax_Error

I was told to use a disassembler. Does gcc have anything built in? What is the easiest way to do this?

Accepted answer by Michael Mrozek

I don't think gcc has a flag for it, since it's primarily a compiler, but another of the GNU development tools does. objdump takes a -d/--disassemble flag:

$ objdump -d /path/to/binary

The disassembly looks like this:

080483b4 <main>:
 80483b4:   8d 4c 24 04             lea    0x4(%esp),%ecx
 80483b8:   83 e4 f0                and    $0xfffffff0,%esp
 80483bb:   ff 71 fc                pushl  -0x4(%ecx)
 80483be:   55                      push   %ebp
 80483bf:   89 e5                   mov    %esp,%ebp
 80483c1:   51                      push   %ecx
 80483c2:   b8 00 00 00 00          mov    $0x0,%eax
 80483c7:   59                      pop    %ecx
 80483c8:   5d                      pop    %ebp
 80483c9:   8d 61 fc                lea    -0x4(%ecx),%esp
 80483cc:   c3                      ret
 80483cd:   90                      nop
 80483ce:   90                      nop
 80483cf:   90                      nop
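
A full objdump -d listing covers every function in the binary, which is often more than you want. As a minimal sketch, assuming the listing keeps the usual layout (one `<name>:` header line per function and a blank line between functions), a small awk filter can keep just one function. The helper name `disasm_func` is made up for this illustration and is not part of binutils.

```shell
# disasm_func NAME: read `objdump -d` output on stdin and print only
# the function called NAME (hypothetical helper, not a binutils tool).
disasm_func() {
    awk -v fn="$1" '
        # Header lines look like: 080483b4 <main>:
        NF == 2 && $2 == ("<" fn ">:") { show = 1 }
        # Functions are separated by a blank line; stop at the first one.
        show && NF == 0 { exit }
        show { print }
    '
}

# Typical use (assumes ./a.out exists and defines main):
#   objdump -d ./a.out | disasm_func main
```

The filter compares the header as a plain string rather than a regular expression, so symbol names containing characters such as `.` or `+` should not need escaping.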

Answer by ta.speot.is

Use IDA Pro and the Decompiler.

Answer by jcomeau_ictx

There's also ndisasm, which has some quirks, but can be more useful if you use nasm. I agree with Michael Mrozek that objdump is probably best.

[later] You might also want to check out Albert van der Horst's ciasdis: http://home.hccnet.nl/a.w.m.van.der.horst/forthassembler.html. It can be hard to understand, but has some interesting features you won't likely find anywhere else.

Answer by Anthony DeRosa

You might find ODA useful. It's a web-based disassembler that supports tons of architectures.

http://onlinedisassembler.com/

Answer by Miroslav Franc

An interesting alternative to objdump is gdb. You don't have to run the binary or have debuginfo.

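$ gdb -q ./a.out 
Reading symbols from ./a.out...(no debugging symbols found)...done.
(gdb) info functions 
All defined functions:

Non-debugging symbols:
0x00000000004003a8  _init
0x00000000004003e0  __libc_start_main@plt
0x00000000004003f0  __gmon_start__@plt
0x0000000000400400  _start
0x0000000000400430  deregister_tm_clones
0x0000000000400460  register_tm_clones
0x00000000004004a0  __do_global_dtors_aux
0x00000000004004c0  frame_dummy
0x00000000004004f0  fce
0x00000000004004fb  main
0x0000000000400510  __libc_csu_init
0x0000000000400580  __libc_csu_fini
0x0000000000400584  _fini
(gdb) disassemble main
Dump of assembler code for function main:
   0x00000000004004fb <+0>:     push   %rbp
   0x00000000004004fc <+1>:     mov    %rsp,%rbp
   0x00000000004004ff <+4>:     sub    $0x10,%rsp
   0x0000000000400503 <+8>:     callq  0x4004f0 <fce>
   0x0000000000400508 <+13>:    mov    %eax,-0x4(%rbp)
   0x000000000040050b <+16>:    mov    -0x4(%rbp),%eax
   0x000000000040050e <+19>:    leaveq 
   0x000000000040050f <+20>:    retq   
End of assembler dump.
(gdb) disassemble fce
Dump of assembler code for function fce:
   0x00000000004004f0 <+0>:     push   %rbp
   0x00000000004004f1 <+1>:     mov    %rsp,%rbp
   0x00000000004004f4 <+4>:     mov    $0x2a,%eax
   0x00000000004004f9 <+9>:     pop    %rbp
   0x00000000004004fa <+10>:    retq   
End of assembler dump.
(gdb)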

With full debugging info it's even better.

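(gdb) disassemble /m main
Dump of assembler code for function main:
9       {
   0x00000000004004fb <+0>:     push   %rbp
   0x00000000004004fc <+1>:     mov    %rsp,%rbp
   0x00000000004004ff <+4>:     sub    $0x10,%rsp

10        int x = fce ();
   0x0000000000400503 <+8>:     callq  0x4004f0 <fce>
   0x0000000000400508 <+13>:    mov    %eax,-0x4(%rbp)

11        return x;
   0x000000000040050b <+16>:    mov    -0x4(%rbp),%eax

12      }
   0x000000000040050e <+19>:    leaveq 
   0x000000000040050f <+20>:    retq   

End of assembler dump.
(gdb)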

objdump has a similar option (-S).
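
A gdb session like the one in this answer can also be run non-interactively. This is a minimal sketch, assuming a gdb that supports the -batch and -ex command-line options; the wrapper name `gdb_disas` is hypothetical and only serves the illustration.

```shell
# gdb_disas FILE FUNCTION [MODIFIER]
# Print the disassembly of FUNCTION in FILE without an interactive session.
# MODIFIER is optional, e.g. /m to interleave source lines when the binary
# has debug info. (gdb_disas is an illustrative name, not a gdb command.)
gdb_disas() {
    # -q suppresses the banner, -batch exits after running the -ex commands
    gdb -q -batch -ex "disassemble $3 $2" "$1"
}

# Typical use:
#   gdb_disas ./a.out main
#   gdb_disas ./a.out main /m
```

Because gdb exits once the command has run, the output can be piped into less or redirected to a file like any other command's.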

Answer by Peter Cordes

This answer is specific to x86. Portable tools that can disassemble AArch64, MIPS, or whatever machine code include objdump and llvm-objdump.

Agner Fog's disassembler, objconv, is quite nice. It will add comments to the disassembly output for performance problems (like the dreaded LCP stall from instructions with 16-bit immediate constants, for example).

objconv  -fyasm a.out /dev/stdout | less

(It doesn't recognize - as shorthand for stdout, and defaults to outputting to a file of similar name to the input file, with .asm tacked on.)

It also adds branch targets to the code. Other disassemblers usually disassemble jump instructions with just a numeric destination, and don't put any marker at a branch target to help you find the top of loops and so on.

It also indicates NOPs more clearly than other disassemblers (making it clear when there's padding, rather than disassembling it as just another instruction.)

It's open source, and easy to compile for Linux. It can disassemble into NASM, YASM, MASM, or GNU (AT&T) syntax.

Sample output:

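; Filling space: 0FH
; Filler type: Multi-byte NOP
;       db 0FH, 1FH, 44H, 00H, 00H, 66H, 2EH, 0FH
;       db 1FH, 84H, 00H, 00H, 00H, 00H, 00H

ALIGN   16

foo:    ; Function begin
        cmp     rdi, 1                                  ; 00400620 _ 48: 83. FF, 01
        jbe     ?_026                                   ; 00400624 _ 0F 86, 00000084
        mov     r11d, 1                                 ; 0040062A _ 41: BB, 00000001
?_020:  mov     r8, r11                                 ; 00400630 _ 4D: 89. D8
        imul    r8, r11                                 ; 00400633 _ 4D: 0F AF. C3
        add     r8, rdi                                 ; 00400637 _ 49: 01. F8
        cmp     r8, 3                                   ; 0040063A _ 49: 83. F8, 03
        jbe     ?_029                                   ; 0040063E _ 0F 86, 00000097
        mov     esi, 1                                  ; 00400644 _ BE, 00000001
; Filling space: 7H
; Filler type: Multi-byte NOP
;       db 0FH, 1FH, 80H, 00H, 00H, 00H, 00H

ALIGN   8
?_021:  add     rsi, rsi                                ; 00400650 _ 48: 01. F6
        mov     rax, rsi                                ; 00400653 _ 48: 89. F0
        imul    rax, rsi                                ; 00400656 _ 48: 0F AF. C6
        shl     rax, 2                                  ; 0040065A _ 48: C1. E0, 02
        cmp     r8, rax                                 ; 0040065E _ 49: 39. C0
        jnc     ?_021                                   ; 00400661 _ 73, ED
        lea     rcx, [rsi+rsi]                          ; 00400663 _ 48: 8D. 0C 36
...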

Note that this output is ready to be assembled back into an object file, so you can tweak the code at the asm source level, rather than with a hex-editor on the machine code. (So you aren't limited to keeping things the same size.) With no changes, the result should be near-identical. It might not be, though, since disassembly of stuff like

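  (from /lib/x86_64-linux-gnu/libc.so.6)

SECTION .plt    align=16 execute                        ; section number 11, code

?_00001:; Local function
        push    qword [rel ?_37996]                     ; 0001F420 _ FF. 35, 003A4BE2(rel)
        jmp     near [rel ?_37997]                      ; 0001F426 _ FF. 25, 003A4BE4(rel)

...
ALIGN   8
?_00002:jmp     near [rel ?_37998]                      ; 0001F430 _ FF. 25, 003A4BE2(rel)

; Note: Immediate operand could be made smaller by sign extension
        push    11                                      ; 0001F436 _ 68, 0000000B
; Note: Immediate operand could be made smaller by sign extension
        jmp     ?_00001                                 ; 0001F43B _ E9, FFFFFFE0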

doesn't have anything in the source to make sure it assembles to the longer encoding that leaves room for relocations to rewrite it with a 32-bit offset.
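
The disassemble-then-reassemble round trip described in this answer might look roughly like the sketch below. It assumes that objconv and nasm are both installed and that the input is a 64-bit ELF object file whose name ends in .o. The function name `reassemble` and the output file names are invented for this illustration, and, as the answer cautions, the reassembled object is not guaranteed to be byte-identical to the original.

```shell
# reassemble OBJ: disassemble OBJ with objconv into NASM syntax, then
# assemble the listing again with nasm.
# Assumptions: objconv and nasm are installed, OBJ is a 64-bit ELF .o file.
# (reassemble and the derived file names are hypothetical.)
reassemble() {
    src="${1%.o}.asm"       # foo.o -> foo.asm
    new="${1%.o}_new.o"     # foo.o -> foo_new.o
    # -fnasm selects NASM-syntax disassembly; stop if objconv fails
    objconv -fnasm "$1" "$src" &&
        nasm -f elf64 "$src" -o "$new"
}

# Typical use:
#   reassemble foo.o      # writes foo.asm and foo_new.o
```

To tweak the code, run the two commands separately and edit the .asm file between them.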



If you don't want to install objconv, GNU binutils objdump -Mintel -d is very usable, and will already be installed if you have a normal Linux gcc setup.
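
For reference, here is a small wrapper around that objdump invocation with a few extra standard binutils options that tend to make the listing easier to read. The extra options and the wrapper name `disasm_intel` are choices made for this sketch, not something the answer prescribes.

```shell
# disasm_intel FILE: disassemble FILE in Intel syntax with binutils objdump.
# (disasm_intel is a hypothetical wrapper name.)
#   -d       disassemble executable sections
#   -r       show relocation entries inline (useful for .o files)
#   -w       wide output, do not wrap long lines
#   -C       demangle C++ symbol names
#   -Mintel  Intel syntax instead of the default AT&T
disasm_intel() {
    objdump -drwC -Mintel "$1"
}

# Typical use:
#   disasm_intel ./a.out | less
```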

Answer by arboreal84

ht editor can disassemble binaries in many formats. It is similar to Hiew, but open source.

To disassemble, open a binary, then press F6 and then select elf/image.

Answer by realkstrawn93

You can come pretty damn close (but no cigar) to generating assembly that will reassemble, if that's what you are intending to do, using this rather crude and tediously long pipeline trick (replace /bin/bash with the file you intend to disassemble and bash.S with the file you intend to send the output to):

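objdump --no-show-raw-insn -Matt,att-mnemonic -Dz /bin/bash | grep -v "file format" | grep -v "(bad)" | sed '1,4d' | cut -d' ' -f2- | cut -d '<' -f2 | tr -d '>' | cut -f2- | sed -e "s/of\ section/#Disassembly\ of\ section/" | grep -v "\.\.\." > bash.S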

Note how long this is, however. I really wish there was a better way (or, for that matter, a disassembler capable of outputting code that an assembler will recognize), but unfortunately there isn't.

Answer by Picaud Vincent

Let's say that you have:

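#include <iostream>

double foo(double x)
{
  asm("# MyTag BEGIN"); // <- asm comment,
                        //    used later to locate piece of code
  double y = 2 * x + 1;

  asm("# MyTag END");

  return y;
}

int main()
{
  std::cout << foo(2);
}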

To get assembly code using gcc you can do:

g++ prog.cpp -c -S -o - -masm=intel | c++filt | grep -vE '\s+\.'

c++filt demangles symbols.

grep -vE '\s+\.' removes some useless information.

Now if you want to visualize the tagged part, simply use:

g++ prog.cpp -c -S -o - -masm=intel | c++filt | grep -vE '\s+\.' | grep "MyTag BEGIN" -A 20

With my computer I get:

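    # MyTag BEGIN
# 0 "" 2
#NO_APP
    movsd   xmm0, QWORD PTR -24[rbp]
    movapd  xmm1, xmm0
    addsd   xmm1, xmm0
    addsd   xmm0, xmm1
    movsd   QWORD PTR -8[rbp], xmm0
#APP
# 9 "poub.cpp" 1
    # MyTag END
# 0 "" 2
#NO_APP
    movsd   xmm0, QWORD PTR -8[rbp]
    pop rbp
    ret
.LFE1814:
main:
.LFB1815:
    push    rbp
    mov rbp, rsp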
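
A fixed line count such as grep -A 20 prints lines past the end tag when the tagged region is shorter than 20 lines and cuts it off when it is longer. As a minimal sketch of an alternative, a sed address range can print exactly the lines between the two asm comments. The helper name `extract_tagged` is hypothetical, and the tag is used as a regular expression, so it should not contain slashes or other special characters.

```shell
# extract_tagged TAG: read compiler-generated assembly on stdin and print
# the lines from "TAG BEGIN" through "TAG END" inclusive.
# (extract_tagged is a hypothetical helper name.)
extract_tagged() {
    sed -n "/$1 BEGIN/,/$1 END/p"
}

# Typical use, with the g++ options from this answer:
#   g++ prog.cpp -c -S -o - -masm=intel | c++filt | extract_tagged MyTag
```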

A more friendly approach is to use Compiler Explorer.