Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/15393441/

Date: 2020-09-02 05:41:21  Source: igfitidea

Obfuscated C Code Contest 2006. Please explain sykes2.c

c obfuscation deobfuscation

Asked by corny

How does this C program work?

main(_){_^448&&main(-~_);putchar(--_%64?32|-~7[__TIME__-_/8%8][">'txiZ^(~z?"-48]>>";;;====~$::199"[_*2&8|_/64]/(_&2?1:8)%8&1:10);}

It compiles as it is (tested on gcc 4.6.3). It prints the time when compiled. On my system:

    !!  !!!!!!              !!  !!!!!!              !!  !!!!!! 
    !!  !!  !!              !!      !!              !!  !!  !! 
    !!  !!  !!              !!      !!              !!  !!  !! 
    !!  !!!!!!    !!        !!      !!    !!        !!  !!!!!! 
    !!      !!              !!      !!              !!  !!  !! 
    !!      !!              !!      !!              !!  !!  !! 
    !!  !!!!!!              !!      !!              !!  !!!!!!

Source: sykes2 - A clock in one line, sykes2 author hints

Some hints: No compile warnings per default. Compiled with -Wall, the following warnings are emitted:

sykes2.c:1:1: warning: return type defaults to ‘int’ [-Wreturn-type]
sykes2.c: In function ‘main’:
sykes2.c:1:14: warning: value computed is not used [-Wunused-value]
sykes2.c:1:1: warning: implicit declaration of function ‘putchar’ [-Wimplicit-function-declaration]
sykes2.c:1:1: warning: suggest parentheses around arithmetic in operand of ‘|’ [-Wparentheses]
sykes2.c:1:1: warning: suggest parentheses around arithmetic in operand of ‘|’ [-Wparentheses]
sykes2.c:1:1: warning: control reaches end of non-void function [-Wreturn-type]

Answered by nneonneo

Let's de-obfuscate it.

Indenting:

main(_) {
    _^448 && main(-~_);
    putchar(--_%64
        ? 32 | -~7[__TIME__-_/8%8][">'txiZ^(~z?"-48] >> ";;;====~$::199"[_*2&8|_/64]/(_&2?1:8)%8&1
        : 10);
}

Introducing variables to untangle this mess:

main(int i) {
    if(i^448)
        main(-~i);
    if(--i % 64) {
        char a = -~7[__TIME__-i/8%8][">'txiZ^(~z?"-48];
        char b = a >> ";;;====~$::199"[i*2&8|i/64]/(i&2?1:8)%8;
        putchar(32 | (b & 1));
    } else {
        putchar(10); // newline
    }
}

Note that -~i == i+1 because of two's complement. Therefore, we have

main(int i) {
    if(i != 448)
        main(i+1);
    i--;
    if(i % 64 == 0) {
        putchar('\n');
    } else {
        char a = -~7[__TIME__-i/8%8][">'txiZ^(~z?"-48];
        char b = a >> ";;;====~$::199"[i*2&8|i/64]/(i&2?1:8)%8;
        putchar(32 | (b & 1));
    }
}

Now, note that a[b] is the same as b[a], and apply the -~ == 1+ change again:
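
As a side illustration of the a[b] == b[a] rule, here is a minimal sketch. The helper names lookup_plain and lookup_swapped are made up for this example and are not part of sykes2.c. Note one deliberate difference: the original forms the pointer ">'txiZ^(~z?"-48, which points before the start of the string and is, strictly speaking, outside what ISO C defines; the sketch subtracts '0' from the index instead.

```c
#include <assert.h>

/* The digit bitmap table from sykes2.c (11 entries: '0'..'9' and ':'). */
static const char digit_table[] = ">'txiZ^(~z?";

/* Conventional form: array[index]. */
char lookup_plain(char c)
{
    assert(c >= '0' && c <= ':');   /* keep the index inside the table */
    return digit_table[c - '0'];
}

/* Swapped form: index[array]. C defines x[y] as *(x + y), so the
 * operands can appear in either order. */
char lookup_swapped(char c)
{
    assert(c >= '0' && c <= ':');
    return (c - '0')[digit_table];
}
```

Both functions return the same table entry for any character from '0' to ':'.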

main(int i) {
    if(i != 448)
        main(i+1);
    i--;
    if(i % 64 == 0) {
        putchar('\n');
    } else {
        char a = (">'txiZ^(~z?"-48)[(__TIME__-i/8%8)[7]] + 1;
        char b = a >> ";;;====~$::199"[(i*2&8)|i/64]/(i&2?1:8)%8;
        putchar(32 | (b & 1));
    }
}

Converting the recursion to a loop and sneaking in a bit more simplification:

// please don't pass any command-line arguments
main() {
    int i;
    for(i=447; i>=0; i--) {
        if(i % 64 == 0) {
            putchar('\n');
        } else {
            char t = __TIME__[7 - i/8%8];
            char a = ">'txiZ^(~z?"[t - 48] + 1;
            int shift = ";;;====~$::199"[(i*2&8) | (i/64)];
            if((i & 2) == 0)
                shift /= 8;
            shift = shift % 8;
            char b = a >> shift;
            putchar(32 | (b & 1));
        }
    }
}

This outputs one character per iteration. Every 64th character is a newline. Otherwise, it uses a pair of data tables to decide what to output, and prints either character 32 (a space) or character 33 (a !). The first table (">'txiZ^(~z?") is a set of 11 bitmaps describing the appearance of each character (the ten digits plus the colon), and the second table (";;;====~$::199") selects the appropriate bit to display from the bitmap.

The second table

Let's start by examining the second table, int shift = ";;;====~$::199"[(i*2&8) | (i/64)];. i/64 is the line number (6 to 0) and i*2&8 is 8 iff i is 4, 5, 6 or 7 mod 8.

if((i & 2) == 0) shift /= 8; shift = shift % 8 selects either the high octal digit (for i%8 = 0,1,4,5) or the low octal digit (for i%8 = 2,3,6,7) of the table value. The shift table ends up looking like this:

row col val
6   6-7 0
6   4-5 0
6   2-3 5
6   0-1 7
5   6-7 1
5   4-5 7
5   2-3 5
5   0-1 7
4   6-7 1
4   4-5 7
4   2-3 5
4   0-1 7
3   6-7 1
3   4-5 6
3   2-3 5
3   0-1 7
2   6-7 2
2   4-5 7
2   2-3 3
2   0-1 7
1   6-7 2
1   4-5 7
1   2-3 3
1   0-1 7
0   6-7 4
0   4-5 4
0   2-3 3
0   0-1 7

or in tabular form

00005577
11775577
11775577
11665577
22773377
22773377
44443377

Note that the author used the null terminator for the first two table entries (sneaky!).
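
The lookup described in this section can be sketched as a standalone function. The name shift_for and the (row, col) parameters are assumptions made for illustration: row stands for i/64 and col for i%8 in the loop version above.

```c
#include <assert.h>

/* The segment-selector table from sykes2.c. The array has 15 bytes:
 * 14 characters plus the terminating '\0', which is read as entry 14. */
static const char shift_table[] = ";;;====~$::199";

/* Hypothetical helper: which bit of the digit bitmap is shown at
 * (row, col)? row is 0..6 (i/64), col is 0..7 (i%8). */
int shift_for(int row, int col)
{
    assert(row >= 0 && row <= 6 && col >= 0 && col <= 7);
    int entry = shift_table[((col * 2) & 8) | row];
    if ((col & 2) == 0)
        entry /= 8;          /* columns 0,1,4,5 use the high octal digit */
    return entry % 8;        /* columns 2,3,6,7 use the low octal digit */
}
```

For example, shift_for(6, 4) reads the null terminator and yields 0, and shift_for(3, 4) yields 6, matching the listing above.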

The layout is modeled on a seven-segment display, with 7s as blanks. So, the entries in the first table must define the segments that get lit up.

The first table

__TIME__ is a special macro defined by the preprocessor. It expands to a string constant containing the time at which the preprocessor was run, in the form "HH:MM:SS". Observe that it contains exactly 8 characters. Note that 0-9 have ASCII values 48 through 57 and : has ASCII value 58. The output is 64 characters per line, so that leaves 8 characters per character of __TIME__.

7 - i/8%8 is thus the index of __TIME__ that is presently being output (the 7- is needed because we are iterating i downwards). So, t is the character of __TIME__ being output.

a ends up equalling the following in binary, depending on the input t:

0 00111111
1 00101000
2 01110101
3 01111001
4 01101010
5 01011011
6 01011111
7 00101001
8 01111111
9 01111011
: 01000000
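
The values in this listing can be reproduced with a short sketch. The names segment_mask and segment_lit are hypothetical helpers introduced here, not names from the original program.

```c
#include <assert.h>

/* The digit bitmap table from sykes2.c, indexed by (character - '0'). */
static const char digit_table[] = ">'txiZ^(~z?";

/* Bitmap of lit segments for one character of "HH:MM:SS".
 * The +1 mirrors the -~ applied to the table entry in the original. */
int segment_mask(char c)
{
    assert(c >= '0' && c <= ':');
    return digit_table[c - '0'] + 1;
}

/* 1 if segment number seg (0..7) is lit for character c, else 0. */
int segment_lit(char c, int seg)
{
    assert(seg >= 0 && seg <= 7);
    return (segment_mask(c) >> seg) & 1;
}
```

Segment 7 is never lit, because every table entry plus one still fits in 7 bits.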

Each number is a bitmap describing the segments that are lit up in our seven-segment display. Since the characters are all 7-bit ASCII, the high bit is always cleared. Thus, 7 in the segment table always prints as a blank. The second table looks like this with the 7s as blanks:

000055  
11  55  
11  55  
116655  
22  33  
22  33  
444433  

So, for example, 4 is 01101010 (bits 1, 3, 5, and 6 set), which prints as

----!!--
!!--!!--
!!--!!--
!!!!!!--
----!!--
----!!--
----!!--


To show we really understand the code, let's adjust the output a bit with this table:

为了表明我们真正理解了代码,让我们用这张表稍微调整一下输出:

  00  
11  55
11  55
  66  
22  33
22  33
  44

This is encoded as "?;;?==? '::799\x07". For artistic purposes, we'll add 64 to a few of the characters (since only the low 6 bits are used, this won't affect the output); this gives "?{{?}}?gg::799G" (note that the 8th character is unused, so we can actually make it whatever we want). Putting our new table in the original code:

main(_){_^448&&main(-~_);putchar(--_%64?32|-~7[__TIME__-_/8%8][">'txiZ^(~z?"-48]>>"?{{?}}?gg::799G"[_*2&8|_/64]/(_&2?1:8)%8&1:10);}

we get

我们得到

          !!              !!                              !!   
    !!  !!              !!  !!  !!  !!              !!  !!  !! 
    !!  !!              !!  !!  !!  !!              !!  !!  !! 
          !!      !!              !!      !!                   
    !!  !!  !!          !!  !!      !!              !!  !!  !! 
    !!  !!  !!          !!  !!      !!              !!  !!  !! 
          !!              !!                              !!   

just as we expected. It's not as solid-looking as the original, which explains why the author chose to use the table he did.
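
The re-encoding step used for the modified table can be sketched as a helper that packs a grid of segment numbers into the 15 table bytes. The function name encode_shift_table and the layout convention are assumptions for illustration: layout[r][k] holds the segment number (0..7) for row r (0 = bottom, 6 = top) and column pair k (0 = columns 0-1, 1 = columns 2-3, 2 = columns 4-5, 3 = columns 6-7).

```c
#include <assert.h>

/* Hypothetical helper: pack a 7x4 grid of segment numbers into the
 * 15 table bytes read by sykes2.c (indices 0..6 and 8..14; index 7
 * is never read). */
void encode_shift_table(const int layout[7][4], char out[15])
{
    for (int row = 0; row < 7; row++) {
        for (int k = 0; k < 4; k++)
            assert(layout[row][k] >= 0 && layout[row][k] <= 7);
        /* columns 0-1 go in the high octal digit, columns 2-3 in the low */
        out[row]     = (char)(layout[row][0] * 8 + layout[row][1]);
        /* columns 4-5 go in the high octal digit, columns 6-7 in the low */
        out[8 + row] = (char)(layout[row][2] * 8 + layout[row][3]);
    }
    out[7] = ' ';   /* unused entry; any filler works */
}
```

The result is a raw 15-byte buffer, not necessarily a C string: entry 14 is only a terminator when the top row selects segment 0 in columns 4 to 7, as in the original table.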

Answered by chmeee

Let's format this for easier reading:

main(_){
  _^448&&main(-~_);
  putchar((--_%64) ? (32|-~7[__TIME__-_/8%8][">'txiZ^(~z?"-48]>>(";;;====~$::199")[_*2&8|_/64]/(_&2?1:8)%8&1):10);
}

So, running it with no arguments, _ (conventionally argc) is 1. main() will recursively call itself, passing the result of -(~_) (the negated bitwise NOT of _, i.e. _+1), so main runs 448 times in total: the recursion stops only when _^448 == 0, that is, when _ reaches 448.

Taking that, it'll print 7 64-character wide lines (the outer ternary condition, and 448/64 == 7). So let's rewrite it a little cleaner:

main(int argc) {
  if (argc^448) main(-(~argc));
  if (--argc % 64) {
    putchar((32|-~7[__TIME__-argc/8%8][">'txiZ^(~z?"-48]>>(";;;====~$::199")[argc*2&8|argc/64]/(argc&2?1:8)%8&1));
  } else putchar('\n');
}

Now, 32 is the decimal ASCII code for a space. It prints either a space or a '!' (33 is '!', hence the '&1' at the end). Let's focus on the blob in the middle:

-(~(7[__TIME__-argc/8%8][">'txiZ^(~z?"-48])) >>
     (";;;====~$::199"[argc*2&8|argc/64]) / (argc&2?1:8) % 8

As another poster said, __TIME__ is the compile time for the program, and is a string, so there's some string arithmetic going on, as well as taking advantage of array subscripting being commutative: a[b] is the same as b[a] for any pointer or array and integer index, not just character arrays.

7[__TIME__ - (argc/8)%8]

This will select one of the first 8 characters in __TIME__. This is then used as an index into [">'txiZ^(~z?"-48] (the characters 0-9 are 48-57 decimal). The characters in this string must have been chosen for their ASCII values. The same ASCII code manipulation continues through the rest of the expression, resulting in the printing of either a ' ' or a '!' depending on the location within the character's glyph.

Answered by Thomas Song

Adding to the other solutions, -~x is equal to x+1 because ~x is equivalent to (0xffffffff-x). This is equal to (-1-x) in two's complement, so -~x is -(-1-x) = x+1.
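
The identity is easy to check in isolation. The sketch below assumes a two's complement int, which is what virtually every current platform uses; the function name increment_via_not is made up for the example.

```c
#include <assert.h>
#include <limits.h>

/* Returns x + 1 computed as -~x. On a two's complement machine
 * ~x equals -1 - x, so negating it gives x + 1. */
int increment_via_not(int x)
{
    assert(x != INT_MAX);   /* x + 1 would overflow, as would -~x */
    return -~x;
}
```

In sykes2.c this is what turns main(-~_) into main(_ + 1).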

Answered by Lefteris E

I de-obfuscated the modulo arithmetic as much as I could and removed the recursion:

int pixelX, line, digit ;
for(line=6; line >= 0; line--){
  for (digit =0; digit<8; digit++){
    for(pixelX=7;pixelX >= 0; pixelX--){ 
        putchar(' '| 1 + ">'txiZ^(~z?"["12:34:56"[digit]-'0'] >> 
          (";;;====~$::199"[pixelX*2 & 8  | line] / (pixelX&2 ? 1 : 8) ) % 8 & 1);
    }
  }
  putchar('\n');
}

Expanding it a bit more:

int pixelX, line, digit, shift;
char shiftChar;
for(line=6; line >= 0; line--){
    for (digit =0; digit<8; digit++){
        for(pixelX=7;pixelX >= 0; pixelX--){ 
            shiftChar = ";;;====~$::199"[pixelX*2 & 8 | line];
            if (pixelX & 2)
                shift = shiftChar & 7;
            else
                shift = shiftChar >> 3;     
            putchar(' '| (">'txiZ^(~z?"["12:34:56"[digit]-'0'] + 1) >> shift & 1 );
        }

    }
    putchar('\n');
}