Original question: http://stackoverflow.com/questions/1356896/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
C++: How to hide a string in binary code?
Asked by Dmitriy
Sometimes, it is useful to hide a string from a binary (executable) file. For example, it makes sense to hide encryption keys from binaries.
When I say “hide”, I mean making strings harder to find in the compiled binary.
For example, this code:
const char* encryptionKey = "My strong encryption key";
// Using the key
after compilation produces an executable file with the following in its data section:
4D 79 20 73 74 72 6F 6E-67 20 65 6E 63 72 79 70 |My strong encryp|
74 69 6F 6E 20 6B 65 79 |tion key |
You can see that our secret string can be easily found and/or modified.
I could hide the string…
char encryptionKey[30];
int n = 0;
encryptionKey[n++] = 'M';
encryptionKey[n++] = 'y';
encryptionKey[n++] = ' ';
encryptionKey[n++] = 's';
encryptionKey[n++] = 't';
encryptionKey[n++] = 'r';
encryptionKey[n++] = 'o';
encryptionKey[n++] = 'n';
encryptionKey[n++] = 'g';
encryptionKey[n++] = ' ';
encryptionKey[n++] = 'e';
encryptionKey[n++] = 'n';
encryptionKey[n++] = 'c';
encryptionKey[n++] = 'r';
encryptionKey[n++] = 'y';
encryptionKey[n++] = 'p';
encryptionKey[n++] = 't';
encryptionKey[n++] = 'i';
encryptionKey[n++] = 'o';
encryptionKey[n++] = 'n';
encryptionKey[n++] = ' ';
encryptionKey[n++] = 'k';
encryptionKey[n++] = 'e';
encryptionKey[n++] = 'y';
…but it's not a nice method. Any better ideas?
PS: I know that merely hiding secrets doesn't work against a determined attacker, but it's much better than nothing…
Also, I know about asymmetric encryption, but it's not acceptable in this case. I am refactoring an existing application which uses Blowfish encryption and passes encrypted data to the server (the server decrypts the data with the same key).
I can't change the encryption algorithm because I need to provide backward compatibility. I can't even change the encryption key.
Accepted answer by Dmitriy
I'm sorry for the long answer.
Your answers are absolutely correct, but the question was how to hide a string and do it nicely.
I did it this way:
#include <iostream>
#include "HideString.h"

DEFINE_HIDDEN_STRING(EncryptionKey, 0x7f, ('M')('y')(' ')('s')('t')('r')('o')('n')('g')(' ')('e')('n')('c')('r')('y')('p')('t')('i')('o')('n')(' ')('k')('e')('y'))
DEFINE_HIDDEN_STRING(EncryptionKey2, 0x27, ('T')('e')('s')('t'))

int main()
{
    std::cout << GetEncryptionKey() << std::endl;
    std::cout << GetEncryptionKey2() << std::endl;
    return 0;
}
HideString.h:
#include <boost/preprocessor/cat.hpp>
#include <boost/preprocessor/seq/for_each_i.hpp>
#include <boost/preprocessor/seq/enum.hpp>

#define CRYPT_MACRO(r, d, i, elem) ( elem ^ ( d - i ) )

#define DEFINE_HIDDEN_STRING(NAME, SEED, SEQ)\
static const char* BOOST_PP_CAT(Get, NAME)()\
{\
    static char data[] = {\
        BOOST_PP_SEQ_ENUM(BOOST_PP_SEQ_FOR_EACH_I(CRYPT_MACRO, SEED, SEQ)),\
        '\0'\
    };\
\
    static bool isEncrypted = true;\
    if ( isEncrypted )\
    {\
        for (unsigned i = 0; i < ( sizeof(data) / sizeof(data[0]) ) - 1; ++i)\
        {\
            data[i] = CRYPT_MACRO(_, SEED, i, data[i]);\
        }\
\
        isEncrypted = false;\
    }\
\
    return data;\
}
The trickiest line in HideString.h is:
BOOST_PP_SEQ_ENUM(BOOST_PP_SEQ_FOR_EACH_I(CRYPT_MACRO, SEED, SEQ))
Let me explain the line. For the code:
DEFINE_HIDDEN_STRING(EncryptionKey2, 0x27, ('T')('e')('s')('t'))
the expression
BOOST_PP_SEQ_FOR_EACH_I(CRYPT_MACRO, SEED, SEQ)
generates the sequence:
( 'T' ^ ( 0x27 - 0 ) ) ( 'e' ^ ( 0x27 - 1 ) ) ( 's' ^ ( 0x27 - 2 ) ) ( 't' ^ ( 0x27 - 3 ) )
Then
BOOST_PP_SEQ_ENUM(BOOST_PP_SEQ_FOR_EACH_I(CRYPT_MACRO, SEED, SEQ))
generates:
'T' ^ ( 0x27 - 0 ), 'e' ^ ( 0x27 - 1 ), 's' ^ ( 0x27 - 2 ), 't' ^ ( 0x27 - 3 )
and finally,
DEFINE_HIDDEN_STRING(EncryptionKey2, 0x27, ('T')('e')('s')('t'))
generates:
static const char* GetEncryptionKey2()
{
    static char data[] = {
        'T' ^ ( 0x27 - 0 ), 'e' ^ ( 0x27 - 1 ), 's' ^ ( 0x27 - 2 ), 't' ^ ( 0x27 - 3 ),
        '\0'
    };
    static bool isEncrypted = true;
    if ( isEncrypted )
    {
        for (unsigned i = 0; i < ( sizeof(data) / sizeof(data[0]) ) - 1; ++i)
        {
            data[i] = ( data[i] ^ ( 0x27 - i ) );
        }
        isEncrypted = false;
    }
    return data;
}
The data for "My strong encryption key" looks like:
0x00B0200C  32 07 5d 0f 0f 08 16 16 10 56 10 1a 10 00 08  2.]......V.....
0x00B0201B  00 1b 07 02 02 4b 01 0c 11 00 00 00 00 00 00  .....K.........
Thank you very much for your answers!
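For readers who cannot use Boost, the same "XOR each character with seed minus index" idea can probably be expressed with a C++14 constexpr function instead of the preprocessor. This is only a sketch of the technique, not the code from the answer: HiddenString, hide, reveal and kHiddenTest are hypothetical names, and unlike the macro this version also XORs the terminator. Whether the plain literal really disappears from the binary depends on the compiler and optimization level, so the result should be checked with a hex dump.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper type: N bytes XOR-ed with (seed - index).
template <std::size_t N>
struct HiddenString {
    char data[N];        // obfuscated bytes, terminator included
    unsigned char seed;  // seed needed to undo the XOR
};

// Obfuscate a string literal; meant to be evaluated at compile time.
template <std::size_t N>
constexpr HiddenString<N> hide(const char (&text)[N], unsigned char seed) {
    HiddenString<N> out{};
    out.seed = seed;
    for (std::size_t i = 0; i < N; ++i) {
        out.data[i] = static_cast<char>(
            text[i] ^ static_cast<unsigned char>(seed - i));
    }
    return out;
}

// Undo the XOR at run time and return the original text.
template <std::size_t N>
std::string reveal(const HiddenString<N>& hidden) {
    std::string out;
    for (std::size_t i = 0; i + 1 < N; ++i) {  // skip the terminator
        out += static_cast<char>(
            hidden.data[i] ^ static_cast<unsigned char>(hidden.seed - i));
    }
    return out;
}

// constexpr forces the obfuscation to run during compilation.
constexpr auto kHiddenTest = hide("Test", 0x27);
```

With these definitions, reveal(kHiddenTest) gives back "Test", while the stored bytes start with 'T' ^ 0x27.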
Answer by Adam Liss
As noted in the comment to pavium's answer, you have two choices:
- Secure the key
- Secure the decryption algorithm
Unfortunately, if you must resort to embedding both the key and the algorithm within the code, neither is truly secret, so you're left with the (far weaker) alternative of security through obscurity. In other words, as you mentioned, you need a clever way to hide either or both of them inside your executable.
Here are some options, though you need to remember that none of these is truly secure according to any cryptographic best practices, and each has its drawbacks:
- Disguise your key as a string that would normally appear within the code. One example would be the format string of a printf() statement, which tends to have numbers, letters, and punctuation.
- Hash some or all of the code or data segments on startup, and use that as the key. (You'll need to be a bit clever about this to ensure the key doesn't change unexpectedly!) This has a potentially desirable side-effect of verifying the hashed portion of your code each time it runs.
- Generate the key at run-time from something that is unique to (and constant within) the system, for example, by hashing the MAC address of a network adapter.
- Create the key by choosing bytes from other data. If you have static or global data, regardless of type (int, char, etc.), take a byte from somewhere within each variable after it's initialized (to a non-zero value, of course) and before it changes.
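As a rough sketch of the last option, a key can be assembled at run time from bytes of unrelated, already-initialized data. Everything named here (kBanner, kRetryDelays, buildKey) is hypothetical and stands in for data the program already contains. A key built this way is whatever those bytes happen to be, so on its own it only suits cases where you are free to choose the key.

```cpp
#include <cstdint>
#include <string>

// Hypothetical program data that exists for unrelated reasons.
static const char kBanner[] = "Usage: %s [-v] <file> (%d of %d)\n";
static const std::uint32_t kRetryDelays[] = {250, 500, 1000, 2000};

// Pick bytes from the objects above and concatenate them into a key.
std::string buildKey() {
    std::string key;
    key += kBanner[7];   // '%' from the format string
    key += kBanner[19];  // 'e' from "<file>"
    for (std::uint32_t delay : kRetryDelays) {
        // Low byte of each value, taken arithmetically so the result
        // does not depend on the machine's byte order.
        key += static_cast<char>(delay & 0xFF);
    }
    return key;
}
```

The key never appears as one contiguous literal, although an optimizing compiler may still fold parts of it into immediate values.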
Please let us know how you solve the problem!
Edit: You commented that you're refactoring existing code, so I'll assume you can't necessarily choose the key yourself. In that case, follow a 2-step process: Use one of the above methods to encrypt the key itself, then use that key to decrypt the users' data.
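The 2-step process might be sketched as follows, assuming the fixed legacy key is stored XOR-ed with a mask derived from some innocuous string. FNV-1a is used here only as a small, non-cryptographic way to stretch that string into mask bytes; fnv1a and applyMask are hypothetical names, and this is obfuscation, not real encryption.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// FNV-1a (32-bit): a simple non-cryptographic hash.
std::uint32_t fnv1a(const std::string& text) {
    std::uint32_t hash = 2166136261u;  // FNV offset basis
    for (unsigned char c : text) {
        hash ^= c;
        hash *= 16777619u;             // FNV prime
    }
    return hash;
}

// XOR every byte of `input` with a mask byte derived from `seed` and the
// byte's position. Applying the function twice restores the input.
std::string applyMask(const std::string& input, const std::string& seed) {
    std::string out = input;
    for (std::size_t i = 0; i < out.size(); ++i) {
        std::uint32_t mask = fnv1a(seed + std::to_string(i));
        out[i] = static_cast<char>(
            static_cast<unsigned char>(out[i]) ^ (mask & 0xFF));
    }
    return out;
}
```

The intended use is to run applyMask on the real key once, in a separate build-time tool, embed only the masked bytes in the program, and call applyMask again with the same seed at run time just before the key is needed.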
Answer by Ken
- Post it as a code golf problem
- Wait for a solution written in J
- Embed a J interpreter in your app
Answer by Stephen C
Hiding passwords in your code is security by obscurity. This is harmful because it makes you think you have some level of protection, when in fact you have very little. If something is worth securing, it is worth securing properly.
PS: I know that it doesn't work against real hacker, but it's much better than nothing...
Actually, in a lot of situations nothing is better than weak security. At least you know exactly where you stand. You don't need to be a "real hacker" to circumvent an embedded password ...
EDIT: Responding to this comment:
I know about pairs of keys, but it not acceptable in this case. I refactoring existing appication which uses Blowfish encryption. Encrypted data passed to server and server decrypt data. I can't change ecryption algorithm because I should provide backward compatibility.
If you care about security at all, maintaining backwards compatibility is a REALLY BAD reason to leave yourself vulnerable with embedded passwords. It is a GOOD THING to break backwards compatibility with an insecure security scheme.
It is like when the street kids discover that you leave your front door key under the mat, but you keep doing it because grandpa expects to find it there.
Answer by T.J. Crowder
Your example doesn't hide the string at all; the string is still presented as a series of characters in the output.
There are a variety of ways you can obfuscate strings. There's the simple substitution cypher, or you might perform a mathematical operation on each character (an XOR, for instance) where the result feeds into the next character's operation, etc., etc.
The goal would be to end up with data that doesn't look like a string. For example, if you're working in most western languages, most of your character values will be in the range 32-127, so your goal would be for the operation to put them mostly out of that range, so they don't draw attention.
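One way to picture the chained operation is an XOR where each output byte feeds into the next character. In the sketch below (chainEncode and chainDecode are hypothetical names), the starting value has its high bit set; since plain ASCII characters have that bit clear, every encoded byte ends up at 128 or above, outside the 32-127 range mentioned above. This assumes the input is pure ASCII.

```cpp
#include <string>

// Each output byte is the plain byte XOR-ed with the previous output byte.
// `start` should have its high bit set (for example 0xA5).
std::string chainEncode(const std::string& plain, unsigned char start) {
    std::string out;
    unsigned char previous = start;
    for (unsigned char p : plain) {
        previous = static_cast<unsigned char>(p ^ previous);
        out += static_cast<char>(previous);
    }
    return out;
}

// Reverse the chain: XOR each byte with the encoded byte before it.
std::string chainDecode(const std::string& hidden, unsigned char start) {
    std::string out;
    unsigned char previous = start;
    for (unsigned char c : hidden) {
        out += static_cast<char>(c ^ previous);
        previous = c;
    }
    return out;
}
```

For example, chainEncode("Test", 0xA5) produces the bytes F1 94 E7 93, and chainDecode with the same starting value returns "Test".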
Answer by Wim ten Brink
This is as secure as leaving your bike unlocked in Amsterdam, the Netherlands near Central Station. (Blink, and it's gone!)
If you're trying to add security to your application then you're doomed to fail from the start since any protection scheme will fail. All you can do is make it more complex for a hacker to find the information he needs. Still, a few tricks:
*) Make sure the string is stored as UTF-16 in your binary.
*) Add numbers and special characters to the string.
*) Use an array of 32-bit integers instead of a string! Convert each to a string and concatenate them all.
*) Use a GUID, store it as binary and convert it to a string to use.
And if you really need some pre-defined text, encrypt it and store the encrypted value in your binary. Decrypt it at run time, where the key to decrypt is one of the options I've mentioned before.
Do realize that hackers will tend to crack your application in other ways than this. Even an expert at cryptography will not be able to keep something safe. In general, the only thing that protects you is the profit a hacker can gain from hacking your code, compared to the cost of hacking it. (These costs would often be just a lot of time, but if it takes a week to hack your application and just 2 days to hack something else, something else is more likely to be attacked.)
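In response to the comment: UTF-16 would be two bytes per character, and thus harder to recognize for users who look at a dump of the binary, simply because there's an additional byte between every letter. You can still see the words, though. UTF-32 would be even better because it adds more space between letters. Then again, you could also compress the text a bit by changing to a 6-bits-per-character scheme. Every 4 characters would then compact to three numbers. But this would restrict you to 2x26 letters, 10 digits, and perhaps the space and dot to get to 64 characters.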
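The 6-bits-per-character idea can be sketched like this, assuming a 64-symbol alphabet of 2x26 letters, 10 digits, space and dot. The names kAlphabet, pack6 and unpack6 are hypothetical, and for brevity the sketch only accepts text whose length is a multiple of 4.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// 64 symbols: 2x26 letters, 10 digits, space and dot.
static const std::string kAlphabet =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789 .";

// Pack every 4 characters into 3 bytes (6 bits per character).
// Returns an empty vector if the length is not a multiple of 4
// or a character is not in kAlphabet.
std::vector<std::uint8_t> pack6(const std::string& text) {
    std::vector<std::uint8_t> out;
    if (text.size() % 4 != 0) return out;
    for (std::size_t i = 0; i < text.size(); i += 4) {
        std::uint32_t group = 0;
        for (std::size_t j = 0; j < 4; ++j) {
            std::size_t index = kAlphabet.find(text[i + j]);
            if (index == std::string::npos) return {};
            group = (group << 6) | static_cast<std::uint32_t>(index);
        }
        out.push_back(static_cast<std::uint8_t>(group >> 16));
        out.push_back(static_cast<std::uint8_t>((group >> 8) & 0xFF));
        out.push_back(static_cast<std::uint8_t>(group & 0xFF));
    }
    return out;
}

// Expand every 3 bytes back into 4 characters.
std::string unpack6(const std::vector<std::uint8_t>& bytes) {
    std::string out;
    for (std::size_t i = 0; i + 2 < bytes.size(); i += 3) {
        std::uint32_t group = (static_cast<std::uint32_t>(bytes[i]) << 16) |
                              (static_cast<std::uint32_t>(bytes[i + 1]) << 8) |
                              static_cast<std::uint32_t>(bytes[i + 2]);
        for (int shift = 18; shift >= 0; shift -= 6) {
            out += kAlphabet[(group >> shift) & 0x3F];
        }
    }
    return out;
}
```

The 24-character example key packs into 18 bytes this way. Some of those bytes may still fall in the printable range, so this changes how the text looks in a dump rather than guaranteeing that it looks like binary data.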
The use of a GUID is practical if you store the GUID in its binary format, not its textual format. A GUID is 16 bytes long and can be randomly generated. Thus it's difficult to guess the GUID that's used as password. But if you still need to send plain text over, a GUID could be converted to a string representation to be something like "3F2504E0-4F89-11D3-9A0C-0305E82C3301". (Or Base64-encoded as "7QDBkvCA1+B9K/U0vrQx1A==".) But users won't see any plain text in the code, just some apparently random data. Not all bytes in a GUID are random, though: there's a version number hidden in GUIDs. Using a GUID also isn't the best option for cryptographic purposes. It's either calculated based on your MAC address or by a pseudo-random number, making it reasonably predictable. Still, it's easy to create and easy to store, convert and use. Creating something longer doesn't add more value since a hacker would just try to find other tricks to crack the security. It's just a question of how willing they are to invest more time into analyzing the binaries.
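A minimal sketch of the GUID approach: keep the 16 raw bytes in the program and build the textual form only when it is needed. kGuidBytes and guidToString are hypothetical names, and the bytes are assumed to be stored in the same order as they appear in the text form (a Windows GUID struct stores its first three fields little-endian, which this sketch does not handle).

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// The 16 raw bytes, in textual order. In a hex dump these do not read as text.
static const std::array<std::uint8_t, 16> kGuidBytes = {{
    0x3F, 0x25, 0x04, 0xE0, 0x4F, 0x89, 0x11, 0xD3,
    0x9A, 0x0C, 0x03, 0x05, 0xE8, 0x2C, 0x33, 0x01}};

// Format the bytes as 8-4-4-4-12 upper-case hex digits.
std::string guidToString(const std::array<std::uint8_t, 16>& guid) {
    std::string out;
    char hex[3];
    for (std::size_t i = 0; i < guid.size(); ++i) {
        if (i == 4 || i == 6 || i == 8 || i == 10) out += '-';
        std::snprintf(hex, sizeof(hex), "%02X", static_cast<unsigned>(guid[i]));
        out += hex;
    }
    return out;
}
```

Calling guidToString(kGuidBytes) yields "3F2504E0-4F89-11D3-9A0C-0305E82C3301", the example value from the paragraph above.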
In general, the most important thing that keeps your applications safe is the number of people who are interested in it. If no one cares about your application then no one will bother to hack it either. When you're the top product with 500 million users, then your application is cracked within an hour.
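Answer by mafonya
For C, check this out: https://github.com/mafonya/c_hide_strings
For C++, this:
#include <string>

class Alpha : public std::string
{
public:
    Alpha(std::string str)
    {
        std::string phrase(str.c_str(), str.length());
        this->assign(phrase);
    }
    // Append one character and return the extended string.
    Alpha c(char c) {
        std::string phrase(this->c_str(), this->length());
        phrase += c;
        this->assign(phrase);
        return *this;
    }
};
In order to use this, just include Alpha and:
Alpha str("");
std::string myStr = str.c('T').c('e').c('s').c('t');
So myStr is "Test" now, and the string is hidden from the strings table in the binary.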
Answer by Michael Haephrati
You can use a C++ library I have developed for that purpose. Another article, which is much simpler to implement, won the award for best C++ article of September 2017. For a simpler way to hide strings, see TinyObfuscate.
Answer by Corin
I was once in a similarly awkward position. I had data that needed to be in the binary but not in plain text. My solution was to encrypt the data using a very simple scheme that made it look like the rest of the program. I encrypted it by writing a program that took a string, converted all the characters to the ASCII code (padded with zeros as necessary to get a three digit number) and then added a random digit to the beginning and the end of the 3 digit code. Thus each character of the string was represented by 5 characters (all numbers) in the encrypted string. I pasted that string into the application as a constant and then when I needed to use the string, I decrypted and stored the result in a variable just long enough to do what I needed to.
So to use your example, "My strong encryption key" becomes "207719121310329211541116181145111157110071030703283101101109309926114151216611289116161056811109110470321510787101511213". Then when you need your encryption key, decode it by undoing the process.
It's certainly not bulletproof but I wasn't aiming for that.
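The scheme described in this answer can be sketched as below: every character becomes a random digit, its three-digit ASCII code, and another random digit. encodeDigits and decodeDigits are hypothetical names, not the code actually used in that project.

```cpp
#include <cstddef>
#include <random>
#include <string>

// Encode each character as: random digit, 3-digit ASCII code, random digit.
std::string encodeDigits(const std::string& plain) {
    std::random_device rd;
    std::uniform_int_distribution<int> digit(0, 9);
    std::string out;
    for (unsigned char c : plain) {
        out += static_cast<char>('0' + digit(rd));      // leading padding
        out += static_cast<char>('0' + c / 100);        // hundreds
        out += static_cast<char>('0' + (c / 10) % 10);  // tens
        out += static_cast<char>('0' + c % 10);         // ones
        out += static_cast<char>('0' + digit(rd));      // trailing padding
    }
    return out;
}

// Decode by dropping the padding digits and rebuilding each ASCII code.
std::string decodeDigits(const std::string& encoded) {
    std::string out;
    for (std::size_t i = 0; i + 5 <= encoded.size(); i += 5) {
        int code = (encoded[i + 1] - '0') * 100 +
                   (encoded[i + 2] - '0') * 10 +
                   (encoded[i + 3] - '0');
        out += static_cast<char>(code);
    }
    return out;
}
```

Passing the 120-digit string from the answer to decodeDigits returns "My strong encryption key". Because the padding digits are random, encodeDigits gives a different but equally decodable string on each run.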
Answer by MSalters
It's a client-server application! Don't store it in the client itself, that's the place where hackers will obviously look. Instead, add (for your new client only) an extra server function (over HTTPS) to retrieve this password. Thus this password should never hit the client disk.
As a bonus, it becomes a lot easier to fix the server later. Just send a different, per-client time-limited password every time. Don't forget to allow for longer passwords in your new client.