How to obfuscate Python code effectively?
Notice: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep the same license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow.
Original URL: http://stackoverflow.com/questions/3344115/
Asked by str1k3r
I am looking for a way to hide my Python source code.
print "Hello World!"
How can I encode this example so that it isn't human-readable? I've been told to use base64 but I'm not sure how.
Answer by Dave Webb
You can use the base64 module to encode strings to stop shoulder surfing, but it's not going to stop someone finding your code if they have access to your files.
You can then use the compile() function and the eval() function to execute your code once you've decoded it.
>>> import base64
>>> mycode = "print 'Hello World!'"
>>> secret = base64.b64encode(mycode)
>>> secret
'cHJpbnQgJ0hlbGxvIFdvcmxkISc='
>>> mydecode = base64.b64decode(secret)
>>> eval(compile(mydecode,'<string>','exec'))
Hello World!
So if you have 30 lines of code, you'll probably want to encode it by doing something like this:
>>> f = open('myscript.py')
>>> encoded = base64.b64encode(f.read())
You'd then need to write a second script that does the compile() and eval(), which would probably include the encoded script as a string literal enclosed in triple quotes. So it would look something like this:
import base64
myscript = """IyBUaGlzIGlzIGEgc2FtcGxlIFB5d
GhvbiBzY3JpcHQKcHJpbnQgIkhlbG
xvIiwKcHJpbnQgIldvcmxkISIK"""
eval(compile(base64.b64decode(myscript),'<string>','exec'))
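The session above is Python 2. On Python 3, where b64encode works on bytes and print is a function, the same two-step idea might be sketched as follows. The helper name make_loader and the sample source are assumptions made for illustration, not part of the original answer.

```python
import base64


def make_loader(source: str) -> str:
    """Return the text of a second script that decodes and runs `source`."""
    # b64encode needs bytes on Python 3, so encode first and decode the result.
    encoded = base64.b64encode(source.encode("utf-8")).decode("ascii")
    return (
        "import base64\n"
        f'myscript = """{encoded}"""\n'
        "exec(compile(base64.b64decode(myscript), '<string>', 'exec'))\n"
    )


# Hypothetical script to hide.
original = 'greeting = "Hello World!"\nprint(greeting)\n'
loader = make_loader(original)
print(loader)  # this text is what you would save as the second script
```

Running the generated loader behaves like the original script, but anyone who opens it can reverse the encoding with a single b64decode call.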
Answer by Daniel Kluev
"so that it isn't human-readable?"
"i mean all the file is encoded !! when you open it you can't understand anything .. ! that what i want"
At most, you can compile your sources into bytecode and then distribute only the bytecode. But even this is reversible. Bytecode can be decompiled into semi-readable sources.
Base64 is trivial for anyone to decode, so it cannot serve as actual protection and will 'hide' sources only from people with no computer literacy at all. Moreover, if you plan to actually run that code by any means, you would have to include the decoder right in the script (or in another script in your distribution, which would need to be run by a legitimate user), and that would immediately give away your encoding/encryption.
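To make the point concrete, here is a rough Python 3 sketch of how little effort the reversal takes: it scans a loader script for quoted base64-looking literals and decodes them. The regular expression and the function name recover_sources are assumptions chosen for illustration; the sample loader is the one from the earlier answer.

```python
import base64
import re

# A quoted run of at least 16 base64-alphabet characters (newlines allowed).
LITERAL = re.compile(r'["\']+([A-Za-z0-9+/=\s]{16,})["\']+')


def recover_sources(loader_text):
    """Decode every base64-looking string literal found in `loader_text`."""
    recovered = []
    for literal in LITERAL.findall(loader_text):
        try:
            recovered.append(base64.b64decode(literal).decode("utf-8"))
        except ValueError:  # not valid base64, or not text: skip it
            continue
    return recovered


loader_text = (
    'import base64\n'
    'myscript = """IyBUaGlzIGlzIGEgc2FtcGxlIFB5d\n'
    'GhvbiBzY3JpcHQKcHJpbnQgIkhlbG\n'
    'xvIiwKcHJpbnQgIldvcmxkISIK"""\n'
    "eval(compile(base64.b64decode(myscript),'<string>','exec'))\n"
)
for text in recover_sources(loader_text):
    print(text)
```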
Obfuscation techniques usually involve stripping comments and docstrings, name mangling, junk code insertion, and so on, so that even if you decompile the bytecode, the resulting sources are not very readable. But they will be Python sources nevertheless, and Python is not good at becoming an unreadable mess.
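As one small illustration of the stripping step, the following Python 3 sketch removes docstrings with the standard ast module (comments never reach the AST at all). The class name StripDocstrings and the sample function are assumptions for the example, not a description of any particular obfuscation tool.

```python
import ast


class StripDocstrings(ast.NodeTransformer):
    """Remove docstrings from modules, classes and functions."""

    def _strip(self, node):
        self.generic_visit(node)  # handle nested definitions first
        if ast.get_docstring(node, clean=False) is not None:
            # Drop the docstring statement; keep the body non-empty.
            node.body = node.body[1:] or [ast.Pass()]
        return node

    visit_Module = visit_ClassDef = _strip
    visit_FunctionDef = visit_AsyncFunctionDef = _strip


source = '''
def price_with_tax(amount):
    """Apply the confidential 20% rate."""
    # comments are discarded by the parser already
    return amount + amount // 5
'''

tree = StripDocstrings().visit(ast.parse(source))
ast.fix_missing_locations(tree)
namespace = {}
exec(compile(tree, "<stripped>", "exec"), namespace)
print(namespace["price_with_tax"].__doc__)
```

The function still works, but its documentation is gone from the compiled code object.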
If you absolutely need to protect some functionality, I'd suggest going with compiled languages, like C or C++, compiling and distributing .so/.dll, and then using Python bindings to protected code.
Answer by Broam
As other answers have stated, there really just isn't a way that's any good. Base64 can be decoded. Bytecode can be decompiled. Python was initially just interpreted, and most interpreted languages try to speed up machine interpretation more than they try to make human interpretation difficult.
Python was made to be readable and shareable, not obfuscated. The language decisions about how code has to be formatted were to promote readability across different authors.
Obfuscating Python code just doesn't really mesh with the language. Re-evaluate your reasons for obfuscating the code.
Answer by krs1
Maybe you should look into using something simple like a TrueCrypt volume for source code storage, as that seems to be a concern of yours. You can create an encrypted file on a USB key or just encrypt the whole volume (provided the code will fit) so you can simply take the key with you at the end of the day.
To compile, you could then use something like PyInstaller or py2exe in order to create a stand-alone executable. If you really wanted to go the extra mile, look into a packer or compression utility in order to add more obfuscation. If none of these are an option, you could at least compile the script into bytecode so it isn't immediately readable. Keep in mind that these methods will merely slow down someone trying to debug or decompile your program.
Answer by fortran
I'll write my answer in a didactic manner...
First, type into your Python interpreter:
import this
Then go and take a look at the file this.py in the Lib directory within your Python distribution and try to understand what it does.
After that, take a look at the eval function in the documentation:
help(eval)
Now you should have found a funny way to protect your code. But beware, because that only works for people that are less intelligent than you! (And I'm not trying to be offensive; anyone smart enough to understand what you did could reverse it.)
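For readers who want the punchline spelled out: this.py stores its text under a ROT13 substitution and decodes it at import time. Combining that idea with exec might look like the Python 3 sketch below, which uses the standard rot_13 codec; the hidden statement is a made-up example.

```python
import codecs

# ROT13 swaps each letter with the one 13 places away; applying it twice
# restores the original text.
hidden = codecs.encode('message = "Hello World!"', "rot_13")
print(hidden)  # unreadable at a glance, trivial to reverse

namespace = {}
exec(codecs.decode(hidden, "rot_13"), namespace)
print(namespace["message"])
```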
Answer by Eric O Lebigot
This is only a limited, first-level obfuscation solution, but it is built in: Python has a compiler to byte-code:
python -OO -m py_compile <your program.py>
This produces a .pyo file that contains byte-code, with docstrings removed, etc. You can rename the .pyo file with a .py extension, and python <your program.py> runs like your program but does not contain your source code.
PS: the "limited" level of obfuscation that you get is such that one can recover the code (with some of the variable names, but without comments and docstrings). See the first comment on the original answer for how to do it. However, in some cases, this level of obfuscation might be deemed sufficient.
PPS: If your program imports modules obfuscated like this, then you need to rename them with a .pyc suffix instead (I'm not sure this won't break one day), or you can work with the .pyo files and run them with python -O ….pyo (the imports should work). This will allow Python to find your modules (otherwise, Python looks for .py modules).
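The commands above describe Python 2. Since Python 3.5 the .pyo extension no longer exists (PEP 488), and -OO output is written as an .opt-2.pyc file under __pycache__. A rough Python 3 equivalent, using py_compile with optimize=2 and an explicit output path, could look like this; the file names and the helper compile_stripped are assumptions for the sketch.

```python
import os
import py_compile
import subprocess
import sys
import tempfile


def compile_stripped(source: str, workdir: str) -> str:
    """Byte-compile `source` at the -OO level and return the .pyc path."""
    src_path = os.path.join(workdir, "program.py")
    pyc_path = os.path.join(workdir, "program.pyc")
    with open(src_path, "w") as handle:
        handle.write(source)
    # optimize=2 mirrors -OO: asserts and docstrings are dropped.
    py_compile.compile(src_path, cfile=pyc_path, doraise=True, optimize=2)
    os.remove(src_path)  # distribute only the byte-code
    return pyc_path


source = '"""Internal notes: do not ship."""\nprint("Hello World!")\n'
with tempfile.TemporaryDirectory() as workdir:
    pyc_path = compile_stripped(source, workdir)
    with open(pyc_path, "rb") as handle:
        raw = handle.read()
    # The interpreter can run a .pyc file directly.
    output = subprocess.run(
        [sys.executable, pyc_path], capture_output=True, text=True
    ).stdout

print(output)
print(b"Internal notes" in raw)  # is the docstring still in the file?
print(b"Hello World!" in raw)    # string constants remain visible
```

Note that string constants used by the code survive in the byte-code, which is one reason this only counts as first-level obfuscation.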
Answer by 7h3rAm
I recently stumbled across this blog post: "Python Source Obfuscation using ASTs", where the author talks about Python source file obfuscation using the built-in ast module. The compiled binary was to be used for the HitB CTF and as such had strict obfuscation requirements.
Since you gain access to individual AST nodes, this approach allows you to perform arbitrary modifications to the source file. Depending on what transformations you carry out, the resulting binary might or might not behave exactly like the non-obfuscated source.
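The blog post's own code is not reproduced here. As a minimal sketch of the general idea, assuming a hand-written rename table rather than anything from that post, an ast.NodeTransformer can rewrite identifiers before the tree is compiled (Python 3):

```python
import ast


class RenameIdentifiers(ast.NodeTransformer):
    """Rename identifiers listed in `mapping`; a toy form of name mangling."""

    def __init__(self, mapping):
        self.mapping = mapping

    def visit_Name(self, node):  # variable reads and writes
        node.id = self.mapping.get(node.id, node.id)
        return node

    def visit_arg(self, node):  # function parameters
        node.arg = self.mapping.get(node.arg, node.arg)
        return node

    def visit_FunctionDef(self, node):  # function names, then their bodies
        node.name = self.mapping.get(node.name, node.name)
        self.generic_visit(node)
        return node


source = '''
def area(width, height):
    result = width * height
    return result
'''
# Hypothetical rename table; a real tool would generate one automatically.
mapping = {"area": "f0", "width": "a0", "height": "a1", "result": "v0"}

tree = RenameIdentifiers(mapping).visit(ast.parse(source))
ast.fix_missing_locations(tree)
namespace = {}
exec(compile(tree, "<obfuscated>", "exec"), namespace)
print(namespace["f0"](3, 4))
```

A blanket rename like this would break code that refers to names through strings (getattr, keyword arguments), which is one way a transformation can change behavior, as the answer warns.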
Answer by Cold Diamondz
Well, if you want semi-obfuscated code, you can write something like this:
import base64
import zlib

def run(code):
    # decode the hex string, decompress it, and execute the result
    exec(zlib.decompress(base64.b16decode(code)))

def enc(code):
    # compress the source and hex-encode (base16) it
    return base64.b16encode(zlib.compress(code))
Then make a file like this (using the above code):
# repr() supplies the quotes so that something.py holds a valid string literal
f = open('something.py', 'w')
f.write("code = " + repr(enc("""
print("test program")
print(raw_input("> "))""")))
f.close()
File "something.py":
code = '789CE352008282A2CCBC120DA592D4E212203B3FBD28315749930B215394581E9F9957500A5463A7A0A4A90900ADFB0FF9'
Just import "something.py" and call run(something.code) to run the code in the file.
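The helpers above are Python 2, where strings are bytes. A Python 3 variant needs explicit encoding and decoding steps; a possible sketch follows, with the optional namespace argument added here as an assumption so the result can be inspected.

```python
import base64
import zlib


def enc(code: str) -> str:
    """Compress `code` and return it as an uppercase hex (base16) string."""
    return base64.b16encode(zlib.compress(code.encode("utf-8"))).decode("ascii")


def run(code: str, namespace=None):
    """Reverse enc() and execute the recovered source in `namespace`."""
    namespace = {} if namespace is None else namespace
    exec(zlib.decompress(base64.b16decode(code)).decode("utf-8"), namespace)
    return namespace


packed = enc('answer = 6 * 7\nprint("test program")')
print(packed)
namespace = run(packed)
print(namespace["answer"])
```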
One trick is to make the code hard to read by design: never document anything, and if you must, just give the output of a function, not how it works. Make variable names very broad, movie references, or opposites. For example, btmnsfavclr = 16777215, where "btmnsfavclr" means "Batman's Favorite Color" and the value 16777215 is the decimal form of "ffffff", or white. Remember to mix different styles of naming to keep those pesky people out of your code. Also, use the tips from "Top 11 Tips to Develop Unmaintainable Code".
Answer by GuestHello
I would mask the code like this:
def MakeSC():
    c = raw_input(" Encode: ")
    # spell every character as a two-digit \xNN escape
    sc = "\\x" + "\\x".join("{0:02x}".format(ord(ch)) for ch in c)
    print "\n shellcode =('" + sc + "'); exec(shellcode)"

MakeSC()
Cleartext:
import os; os.system("whoami")
Encoded:
Payload = ('\x69\x6d\x70\x6f\x72\x74\x20\x6f\x73\x3b\x20\x6f\x73\x2e\x73\x79\x73\x74\x65\x6d\x28\x22\x77\x68\x6f\x61\x6d\x69\x22\x29'); exec(Payload);
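A Python 3 take on the same masking, written as functions rather than an interactive prompt, might look like the sketch below. The names hex_escape and make_payload are assumptions, and the input is limited to ASCII so that each \xNN escape maps back to exactly one character.

```python
def hex_escape(source: str) -> str:
    """Spell every character of ASCII `source` as a \\xNN escape."""
    return "".join("\\x{:02x}".format(byte) for byte in source.encode("ascii"))


def make_payload(source: str) -> str:
    """Return a one-line script whose string literal hides `source`."""
    return "payload = ('" + hex_escape(source) + "'); exec(payload)"


line = make_payload("result = sum(range(5))")
print(line)

namespace = {}
exec(line, namespace)  # the interpreter resolves the escapes by itself
print(namespace["result"])
```

Printing the string literal, or pasting it into an interpreter, reveals the cleartext immediately, so this hides the code only from a casual glance.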
Answer by user9869932
You could embed your code and compile/run it from a C/C++ program (see "Embedding Python in Another Application" in the Python documentation).
embedded.c
#include <Python.h>

int
main(int argc, char *argv[])
{
    Py_SetProgramName(argv[0]);  /* optional but recommended */
    Py_Initialize();
    PyRun_SimpleString("print('Hello world !')");
    Py_Finalize();
    return 0;
}
On Ubuntu or Debian:
$ sudo apt-get install python-dev
On CentOS, Red Hat, or Fedora:
$ sudo yum install python-devel
Compile with (the library goes after the source file so the linker can resolve its symbols):
$ gcc -o embedded -fPIC -I/usr/include/python2.7 embedded.c -lpython2.7
Run with:
$ chmod u+x ./embedded
$ time ./embedded
Hello world !
real 0m0.014s
user 0m0.008s
sys 0m0.004s
hello_world.py:
print('Hello World !')
Run the Python script:
$ time python hello_world.py
Hello World !
real 0m0.014s
user 0m0.008s
sys 0m0.004s
However, some strings of the Python code may still be found in the compiled binary:
$ grep "Hello" ./embedded
Binary file ./embedded matches
$ grep "Hello World" ./embedded
$
In case you want an extra bit of safety, you could use base64 on your code:
...
    PyRun_SimpleString("import base64\n"
                       "base64_code = 'your python code in base64'\n"
                       "code = base64.b64decode(base64_code)\n"
                       "exec(code)");
...
For example, create the base64 string of your code:
$ base64 hello_world.py
cHJpbnQoJ0hlbGxvIFdvcmxkICEnKQoK
embedded_base64.c
#include <Python.h>

int
main(int argc, char *argv[])
{
    Py_SetProgramName(argv[0]);  /* optional but recommended */
    Py_Initialize();
    PyRun_SimpleString("import base64\n"
                       "base64_code = 'cHJpbnQoJ0hlbGxvIFdvcmxkICEnKQoK'\n"
                       "code = base64.b64decode(base64_code)\n"
                       "exec(code)\n");
    Py_Finalize();
    return 0;
}
All commands:
$ gcc -o embedded_base64 -fPIC -I/usr/include/python2.7 ./embedded_base64.c -lpython2.7
$ chmod u+x ./embedded_base64
$ time ./embedded_base64
Hello World !
real 0m0.014s
user 0m0.008s
sys 0m0.004s
$ grep "Hello" ./embedded_base64
$

