Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/17312545/

Time: 2020-09-02 06:48:36  Source: igfitidea

Type conversion - unsigned to signed int/char

Tags: c, types, type-conversion, integer-promotion, signedness

Asked by user2522685

I tried to execute the program below:

#include <stdio.h>

int main() {
    signed char a = -5;
    unsigned char b = -5;
    int c = -5;
    unsigned int d = -5;

    if (a == b)
        printf("\r\n char is SAME!!!");
    else
        printf("\r\n char is DIFF!!!");

    if (c == d)
        printf("\r\n int is SAME!!!");
    else
        printf("\r\n int is DIFF!!!");

    return 0;
}

For this program, I am getting the output:

char is DIFF!!! int is SAME!!!

Why do the two comparisons give different results?
Shouldn't the output be as below?

char is SAME!!! int is SAME!!!

A codepad link.

Answer by Lundin

This is because of the various implicit type conversion rules in C. There are two of them that a C programmer must know: the usual arithmetic conversions and the integer promotions (the latter are part of the former).

In the char case you have the types (signed char) == (unsigned char). These are both small integer types. Other such small integer types are bool and short. The integer promotion rules state that whenever a small integer type is an operand of an operation, its type will get promoted to int, which is signed. This will happen no matter if the type was signed or unsigned.

In the case of the signed char, the sign will be preserved and it will be promoted to an int containing the value -5. In the case of the unsigned char, it contains a value which is 251 (0xFB). It will be promoted to an int containing that same value. You end up with

if( (int)-5 == (int)251 )


In the integer case you have the types (signed int) == (unsigned int). They are not small integer types, so the integer promotions do not apply. Instead, they are balanced by the usual arithmetic conversions, which state that if two operands have the same "rank" (size) but different signedness, the signed operand is converted to the same type as the unsigned one. You end up with

if( (unsigned int)-5 == (unsigned int)-5)

Answer by zmbq

Cool question!

The int comparison works, because both ints contain exactly the same bits, so they are essentially the same. But what about the chars?

Ah, C implicitly promotes chars to ints on various occasions. This is one of them. Your code says if(a==b), but what the compiler actually turns that into is:

if((int)a==(int)b) 

(int)a is -5, but (int)b is 251. Those are definitely not the same.

EDIT: As @Carbonic-Acid pointed out, (int)b is 251 only if a char is 8 bits long. In general it is UCHAR_MAX - 4 (for example 65531 with a 16-bit char); the width of int does not change that value, as long as int can represent every unsigned char value.

RE-EDIT: There's a whole bunch of comments discussing the nature of the answer if a byte is not 8 bits long. The only difference in this case is that (int)b is not 251 but a different positive number, which isn't -5. This is not really relevant to the question, which is still very cool.

Answer by Nobilis

Welcome to integer promotion. If I may quote from the website:

If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

C can be really confusing when you do comparisons such as these. I recently puzzled some of my non-C programming friends with the following teaser:

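#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *string = "One looooooooooong string";

    /* strlen returns size_t, which is printed with %zu */
    printf("%zu\n", strlen(string));

    /* -1 is converted to size_t here, so this comparison is true */
    if (strlen(string) < -1) printf("This cannot be happening :(");

    return 0;
}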

Which indeed does print This cannot be happening :( and seemingly demonstrates that 25 is smaller than -1!

What happens underneath, however, is that -1 is converted to the unsigned type size_t, which yields 4294967295 when size_t is 32 bits wide. And naturally 25 is smaller than 4294967295.

If we however explicitly cast the size_t value returned by strlen to a signed integer:

if ((int)(strlen(string)) < -1)

Then it will compare 25 against -1 and all will be well with the world.

A good compiler should warn you about the comparison between an unsigned and signed integer and yet it is still so easy to miss (especially if you don't enable warnings).

This is especially confusing for Java programmers, as all primitive integer types there are signed (char being the one exception). Here's what James Gosling (one of the creators of Java) had to say on the subject:

Gosling: For me as a language designer, which I don't really count myself as these days, what "simple" really ended up meaning was could I expect J. Random Developer to hold the spec in his head. That definition says that, for instance, Java isn't -- and in fact a lot of these languages end up with a lot of corner cases, things that nobody really understands. Quiz any C developer about unsigned, and pretty soon you discover that almost no C developers actually understand what goes on with unsigned, what unsigned arithmetic is. Things like that made C complex. The language part of Java is, I think, pretty simple. The libraries you have to look up.

Answer by ams

The hex representation of -5 is:

  • 8-bit, two's complement signed char: 0xfb
  • 32-bit, two's complement signed int: 0xfffffffb

When you convert a signed number to an unsigned number of the same width, or vice versa, the compiler does ... precisely nothing to the bits on a two's complement machine. What is there to do? Converting to an unsigned type is always defined: the value is reduced modulo one more than the type's maximum. Converting an out-of-range value to a signed type gives an implementation-defined result, and the most efficient implementation-defined behaviour is to do nothing.

So, the hex representation of (unsigned <type>)-5 is:

  • 8-bit, unsigned char: 0xfb
  • 32-bit, unsigned int: 0xfffffffb

Look familiar? They're bit-for-bit the same as the signed versions.

When you write if (a == b), where a and b are of a char type, what the compiler is actually required to read is if ((int)a == (int)b). (This is that "integer promotion" that everyone else is banging on about.)

So, what happens when we convert char to int?

  • 8-bit signed char to 32-bit signed int: 0xfb -> 0xfffffffb
    • Well, that makes sense because it matches the representations of -5 above!
    • It's called a "sign-extend", because it copies the top bit of the byte, the "sign-bit", leftwards into the new, wider value.
  • 8-bit unsigned char to 32-bit signed int: 0xfb -> 0x000000fb
    • This time it does a "zero-extend" because the source type is unsigned, so there is no sign-bit to copy.

So, a == b really does 0xfffffffb == 0x000000fb => no match!

And, c == d really does 0xfffffffb == 0xfffffffb => match!

Answer by Antonio

My point is: didn't you get a warning at compile time "comparing signed and unsigned expression"?

The compiler is trying to inform you that it is entitled to do crazy stuff! :) I would add, crazy stuff will happen using big values, close to the capacity of the primitive type. And

 unsigned int d = -5;

is definitely assigning a big value to d. It is equivalent to the following (and this is in fact guaranteed, because conversion to an unsigned type wraps modulo UINT_MAX + 1):

 unsigned int d = UINT_MAX - 4; // since -1 converts to UINT_MAX

Edit:

However, it is interesting to notice that only the second comparison gives a warning (check the code). So it means that the compiler applying the conversion rules is confident that there won't be errors in the comparison between unsigned char and char (during comparison they will be converted to a type that can safely represent all their possible values). And it is right on this point. Then, it informs you that this won't be the case for unsigned int and int: during the comparison one of the two will be converted to a type that cannot fully represent it.

For completeness, I also checked it for short: the compiler behaves in the same way as for chars, and, as expected, there are no errors at runtime.

Related to this topic, I recently asked this question (C++ oriented, though).