C: How to write self-modifying code in C?

Question

Asked by AnkurVj

I want to write a piece of code that changes itself continuously, even if the change is insignificant.

For example, maybe something like:

for i in 1 to  100, do 
begin
   x := 200
   for j in 200 downto 1, do
    begin
       do something
    end
end

Suppose I want my code, after the first iteration, to change the line x := 200 to some other line x := 199, and then after the next iteration to change it to x := 198, and so on.

Is writing such code possible? Would I need to use inline assembly for that?

EDIT: Here is why I want to do it in C:

This program will be run on an experimental operating system, and I can't (or don't know how to) use programs compiled from other languages. The real reason I need such code is that it runs on a guest operating system inside a virtual machine. The hypervisor is a binary translator that translates chunks of code, and it does some optimizations. It translates each chunk only once: the next time the same chunk is used in the guest, the translator reuses the previously translated result. If the code gets modified on the fly, the translator notices, marks its previous translation as stale, and is forced to re-translate the same code. This is what I want to achieve: to force the translator to do many translations. Typically these chunks are the instructions between two branch instructions (such as jump instructions). I just think that self-modifying code would be a fantastic way to achieve this.

Answer 1

Answered by Heath Hunnicutt

You might want to consider writing a virtual machine in C, where you can build your own self-modifying code.
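
As a rough illustration of that idea, here is a minimal sketch of a toy bytecode interpreter whose program rewrites one of its own operands on every pass, mirroring the x := 200, 199, 198 pattern from the question. The opcode names, the instruction encoding, and the vm_run function are all made up for this example; they are assumptions, not part of any existing VM.

```c
#include <stddef.h>

/* Hypothetical instruction set, invented for this sketch. */
enum opcode {
    OP_SET_X,   /* operand: value     -> x = value                    */
    OP_ADD_ACC, /* no operand         -> acc += x                     */
    OP_DEC_AT,  /* operand: address   -> prog[address]-- (self-patch) */
    OP_HALT     /* no operand         -> end of one pass              */
};

/* Runs the program `passes` times and returns the accumulator,
 * or -1 if an unknown opcode is found. The program lives in a
 * writable array, so OP_DEC_AT can modify the code itself. */
long vm_run(unsigned char *prog, int passes)
{
    long acc = 0;
    int x = 0;

    for (int p = 0; p < passes; ++p) {
        size_t pc = 0;
        for (;;) {
            unsigned char op = prog[pc++];
            if (op == OP_HALT)
                break;
            switch (op) {
            case OP_SET_X:
                x = prog[pc++];
                break;
            case OP_ADD_ACC:
                acc += x;
                break;
            case OP_DEC_AT:
                prog[prog[pc]]--;   /* the program edits its own bytes */
                pc++;
                break;
            default:
                return -1;
            }
        }
    }
    return acc;
}
```

With the program { OP_SET_X, 200, OP_ADD_ACC, OP_DEC_AT, 1, OP_HALT }, each pass loads x from the operand at index 1 and then decrements that operand, so successive passes see 200, 199, 198, and so on. Note that this modifies interpreted bytecode only; it would not by itself make a binary translator re-translate native code.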

If you wish to write self-modifying executables, much depends on the operating system you are targeting. You might approach your desired solution by modifying the in-memory program image. To do so, you would obtain the in-memory address of your program's code bytes. Then, you might manipulate the operating system protection on this memory range, allowing you to modify the bytes without encountering an Access Violation or SIGSEGV. Finally, you would use pointers (perhaps unsigned char * pointers, possibly unsigned long * as on RISC machines) to modify the opcodes of the compiled program.

A key point is that you will be modifying machine code of the target architecture. There is no canonical format for C code while it is running -- C is a specification of a textual input file to a compiler.
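
To make the "you are modifying machine code, not C" point concrete, here is a hedged sketch that avoids patching compiler output altogether: it places a tiny hand-written routine in a fresh read/write/execute mapping and rewrites its immediate operand before every call. It assumes an x86 or x86-64 POSIX system that permits such mappings; the byte sequence B8 imm32 C3 encodes "mov eax, imm32; ret". The function name run_self_patching is hypothetical, and on other architectures the sketch simply returns -1.

```c
#define _DEFAULT_SOURCE
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>

/* Hypothetical helper: calls a generated routine `rounds` times,
 * patching it to return start, start-1, start-2, ... and returns
 * the sum of the values it returned (or -1 on failure). */
long run_self_patching(int start, int rounds)
{
#if defined(__x86_64__) || defined(__i386__)
    /* mov eax, imm32 ; ret */
    const unsigned char code[] = { 0xB8, 0, 0, 0, 0, 0xC3 };

    unsigned char *buf = mmap(NULL, sizeof code,
                              PROT_READ | PROT_WRITE | PROT_EXEC,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    memcpy(buf, code, sizeof code);

    /* Object-to-function pointer conversion is not ISO C,
     * but it is the usual practice on POSIX systems. */
    int (*fn)(void) = (int (*)(void))(uintptr_t)buf;

    long sum = 0;
    for (int k = 0; k < rounds; ++k) {
        int32_t x = start - k;
        memcpy(buf + 1, &x, sizeof x);  /* rewrite the immediate operand */
        sum += fn();                    /* execute the modified code */
    }

    munmap(buf, sizeof code);
    return sum;
#else
    (void)start;
    (void)rounds;
    return -1;  /* no machine code provided for this architecture */
#endif
}
```

Calling run_self_patching(200, 100) would execute one hundred differently patched versions of the same few bytes. Whether a given binary translator treats each rewrite as invalidating its cached translation is something to verify on the actual hypervisor.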

Answer 2

Answered by Labo

Sorry, I am answering a bit late, but I think I found exactly what you are looking for: https://shanetully.com/2013/12/writing-a-self-mutating-x86_64-c-program/

In this article, the author changes the value of a constant by overwriting the immediate operand of an instruction in a function's machine code. Then they execute shellcode by overwriting that function's code in memory.

Below is the first example:

#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <sys/mman.h>

void foo(void);
int change_page_permissions_of_address(void *addr);

int main(void) {
    void *foo_addr = (void*)foo;

    // Change the permissions of the page that contains foo() to read, write, and execute
    // This assumes that foo() is fully contained by a single page
    if(change_page_permissions_of_address(foo_addr) == -1) {
        fprintf(stderr, "Error while changing page permissions of foo(): %s\n", strerror(errno));
        return 1;
    }

    // Call the unmodified foo()
    puts("Calling foo...");
    foo();

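    // Change the immediate value in the addl instruction in foo() to 42.
    // The offset 18 depends on the exact machine code your compiler emits for foo();
    // check it with a disassembler (e.g. objdump -d) before relying on it.
    unsigned char *instruction = (unsigned char*)foo_addr + 18;
    *instruction = 0x2A;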

    // Call the modified foo()
    puts("Calling foo...");
    foo();

    return 0;
}

void foo(void) {
    int i=0;
    i++;
    printf("i: %d\n", i);
}

int change_page_permissions_of_address(void *addr) {
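    // Move the pointer down to the start of the page that contains it.
    // Arithmetic on void * is a GNU extension, so go through char * instead.
    int page_size = getpagesize();
    addr = (char *)addr - ((unsigned long)addr % page_size);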

    if(mprotect(addr, page_size, PROT_READ | PROT_WRITE | PROT_EXEC) == -1) {
        return -1;
    }

    return 0;
}
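
The code above assumes that foo() fits inside a single page. A possible way to lift that assumption is sketched below: a helper that changes the protection of every page touched by an arbitrary byte range. The name set_range_protection and its signature are invented for this sketch; for the self-modifying case one would pass PROT_READ | PROT_WRITE | PROT_EXEC along with an estimate of the function's length, which C itself does not provide.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: applies `prot` to all pages overlapping
 * [addr, addr + len). Returns 0 on success and -1 on failure,
 * in which case errno is set by mprotect() where applicable. */
int set_range_protection(void *addr, size_t len, int prot)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 0 || len == 0)
        return -1;

    /* Round the start down to a page boundary (page sizes are powers of two). */
    uintptr_t start = (uintptr_t)addr & ~((uintptr_t)page_size - 1);
    uintptr_t end = (uintptr_t)addr + len;

    /* mprotect() covers whole pages, so the last partial page is included. */
    return mprotect((void *)start, (size_t)(end - start), prot);
}
```

For instance, set_range_protection((void *)foo, 64, PROT_READ | PROT_WRITE | PROT_EXEC) would cover foo() even if it straddles a page boundary, provided 64 bytes is a safe upper bound on its size and the system allows writable, executable pages.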

Answer 3

Answered by Vatine

It is possible, but most likely not in a portable way, and you may have to contend with read-only memory segments for the running code and other obstacles put in place by your OS.

Answer 4

Answered by CreativeJourney

This would be a good start. Essentially Lisp functionality in C:

http://nakkaya.com/2010/08/24/a-micro-manual-for-lisp-implemented-in-c/

Answer 5

Answered by Pillsy

Depending on how much freedom you need, you may be able to accomplish what you want by using function pointers. Using your pseudocode as a jumping-off point, consider the case where we want to modify that variable x in different ways as the loop index i changes. We could do something like this:

#include <stdio.h>

void multiply_x (int * x, int multiplier)
{
    *x *= multiplier;
}

void add_to_x (int * x, int increment)
{
    *x += increment;
}

int main (void)
{
    int x = 0;
    int i;

    void (*fp)(int *, int);

    for (i = 1; i < 6; ++i) {
        /* Odd iterations add i to x, even iterations multiply x by i */
        fp = (i % 2) ? add_to_x : multiply_x;

        fp(&x, i);

        printf("%d\n", x);
    }

    return 0;
}

The output, when we compile and run the program, is:

1
2
5
20
25

Obviously, this will only work if you have a finite number of things you want to do with x on each run through. In order to make the changes persistent (which is part of what you want from "self-modification"), you would want to make the function-pointer variable either global or static. I'm not sure I can really recommend this approach, because there are often simpler and clearer ways of accomplishing this sort of thing.
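
One possible reading of the "global or static" suggestion is sketched here: the function pointer lives at file scope, so whatever behavior was selected by the previous call is still in effect on the next one. The names current_op and apply_and_switch are illustrative only.

```c
static void add_to_x(int *x, int n)   { *x += n; }
static void multiply_x(int *x, int n) { *x *= n; }

/* File-scope state: survives between calls, unlike a local variable. */
static void (*current_op)(int *, int) = add_to_x;

/* Applies the currently selected operation, then "rewrites" the
 * behavior for the next call by swapping the function pointer. */
void apply_and_switch(int *x, int n)
{
    current_op(x, n);
    current_op = (current_op == add_to_x) ? multiply_x : add_to_x;
}
```

Calling apply_and_switch(&x, i) for i from 1 to 5 with x starting at 0 reproduces the add, multiply, add pattern of the loop above, without the caller choosing the operation. No machine code changes here, so this gives the appearance of self-modification rather than the real thing.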

Answer 6

Answered by Daniel Papasian

The suggestion about implementing LISP in C and then using that is solid, due to portability concerns. But if you really wanted to, this could also be implemented in the other direction on many systems, by loading your program's bytecode into memory and then returning to it.

There are a couple of ways you could attempt to do that. One way is via a buffer overflow exploit. Another would be to use mprotect() to make the code section writable, and then modify compiler-created functions.

Techniques like this are fun for programming challenges and obfuscated-code competitions, but given how unreadable your code would be, combined with the fact that you're exploiting what C considers undefined behavior, they're best avoided in production environments.

Answer 7

Answered by Jonathan M

An interpreted language (not compiled and linked ahead of time like C) might be better for that. Perl, JavaScript, and PHP have the evil eval() function, which might be suited to your purpose. With it, you could keep a string of code that you constantly modify and then execute via eval().

Answer 8

Answered by Zachary Canann

My friend and I encountered this problem while working on a game that self-modifies its code. We allow the user to rewrite code snippets in x86 assembly.

This just requires leveraging two libraries -- an assembler, and a disassembler:

FASM assembler: https://github.com/ZenLulz/Fasm.NET

UDIS86 disassembler: https://github.com/vmt/udis86

We read instructions using the disassembler, let the user edit them, convert the new instructions to bytes with the assembler, and write them back to memory. The write-back requires using VirtualProtect on Windows to change page permissions to allow editing the code. On Unix you have to use mprotect instead.

I posted an article here on how we did it:

https://medium.com/squallygame/how-we-wrote-a-self-hacking-game-in-c-d8b9f97bfa99

As well as the sample code here:

https://github.com/Squalr/SelfHackingApp

These examples are on Windows using C++, but it should be very easy to make them cross-platform and C-only.

Answer 9

Answered by Basile Starynkevitch

In standard C11 (read n1570), you cannot write self-modifying code (at least not without undefined behavior). Conceptually at least, the code segment is read-only.

You might consider extending the code of your program with plugins using your dynamic linker. This requires operating-system-specific functions. On POSIX, use dlopen (and probably dlsym to get newly loaded function pointers). You could then overwrite function pointers with the addresses of the new functions.
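
A minimal sketch of the dlopen/dlsym route follows. The helper load_unary and the unary_fn type are invented for illustration, and the library name used in the usage note ("libm.so.6", the math library on typical glibc-based Linux systems) is an assumption about the target platform.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical signature for the functions we want to load. */
typedef double (*unary_fn)(double);

/* Loads `symbol` from the shared object `library` and returns it
 * as a function pointer, or NULL if either lookup fails. */
unary_fn load_unary(const char *library, const char *symbol)
{
    void *handle = dlopen(library, RTLD_NOW);
    if (handle == NULL)
        return NULL;

    /* POSIX idiom for turning dlsym()'s void * into a function pointer. */
    unary_fn fn = NULL;
    *(void **)(&fn) = dlsym(handle, symbol);

    /* The handle is deliberately not closed: dlclose() could unmap
     * the library and leave the returned pointer dangling. */
    return fn;
}
```

A program could keep a unary_fn variable, initially pointing at a built-in function, and later assign it load_unary("libm.so.6", "cos") or a function from a freshly compiled plugin. On older glibc versions the program must be linked with -ldl.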

Perhaps you could use some JIT-compiling library (like libgccjit or asmjit) to achieve your goals. You'll get fresh function addresses and put them in your function pointers.

Remember that a C compiler can generate code of various sizes for a given function call or jump, so even overwriting that in a machine-specific way is brittle.
