C++: What is an unsigned char?

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/75191/

Date: 2020-08-27 12:40:02  Source: igfitidea

What is an unsigned char?

Tags: c++, c, char

Asked by Landon Kuhn

In C/C++, what is an unsigned char used for? How is it different from a regular char?

Answer by Fruny

In C++, there are three distinct character types:

  • char
  • signed char
  • unsigned char
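To make "three distinct types" concrete, here is a minimal sketch. The names `char_is_its_own_type` and `which` are made up for illustration. It shows that `char` is never the same type as `signed char` or `unsigned char`, so all three can be overloaded separately:

```cpp
#include <type_traits>

// True on every conforming compiler: plain char is a type of its own,
// even though it behaves like one of the other two.
constexpr bool char_is_its_own_type =
    !std::is_same<char, signed char>::value &&
    !std::is_same<char, unsigned char>::value;

// Hypothetical overload set: legal only because the three types are distinct.
constexpr int which(char)          { return 0; }
constexpr int which(signed char)   { return 1; }
constexpr int which(unsigned char) { return 2; }
```

Calling `which('a')` picks the plain `char` overload, because a character literal has type `char` in C++.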

If you are using character types for text, use the unqualified char:

  • it is the type of character literals like 'a' or '0'.
  • it is the type that makes up C strings like "abcde".

It also works as a numeric value, but whether that value is treated as signed or unsigned is left to the implementation. Beware of character comparisons through inequalities, although if you limit yourself to ASCII (0-127) you're just about safe.
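A small sketch of why inequality comparisons on plain `char` are risky. The helper name `is_non_ascii` is an assumption for illustration. Going through `unsigned char` before comparing gives the same answer whether the platform's `char` is signed or not:

```cpp
#include <limits>

// Reports what this particular compiler chose for plain char.
constexpr bool plain_char_is_signed = std::numeric_limits<char>::is_signed;

// Hypothetical helper: a test like `c > 127` on a plain char can never be
// true where char is signed and 8 bits wide, so convert to unsigned char first.
constexpr bool is_non_ascii(char c) {
    return static_cast<unsigned char>(c) > 127;
}
```

With this, `is_non_ascii(static_cast<char>(0xE9))` holds on both kinds of platform, while a direct `c > 127` would silently fail where `char` is signed.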

If you are using character types as numbers, use:

  • signed char, which gives you at least the -127 to 127 range. (-128 to 127 is common)
  • unsigned char, which gives you at least the 0 to 255 range.

"At least", because the C++ standard only gives the minimum range of values that each numeric type is required to cover. sizeof (char) is required to be 1 (i.e. one byte), but a byte could in theory be, for example, 32 bits. sizeof would still report its size as 1, meaning that you could have sizeof (char) == sizeof (long) == 1.
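The guarantees above can be written down as compile-time checks. This is only a sketch, and the constant names are made up for illustration, but both conditions should hold on any conforming implementation:

```cpp
#include <climits>

// sizeof counts in units of char, so all three character types have size 1.
constexpr bool char_types_have_size_one =
    sizeof(char) == 1 && sizeof(signed char) == 1 && sizeof(unsigned char) == 1;

// The minimum ranges: an implementation may offer more, never less.
constexpr bool ranges_meet_minimum =
    SCHAR_MIN <= -127 && SCHAR_MAX >= 127 && UCHAR_MAX >= 255;
```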

Answer by Todd Gamblin

This is implementation dependent, as the C standard does NOT define the signedness of char. Depending on the platform, char may be signed or unsigned, so you need to explicitly ask for signed char or unsigned char if your implementation depends on it. Just use char if you intend to represent characters from strings, as this will match what your platform puts in the string.

The difference between signed char and unsigned char is as you'd expect. On most platforms, signed char will be an 8-bit two's complement number ranging from -128 to 127, and unsigned char will be an 8-bit unsigned integer (0 to 255). Note the standard does NOT require that char types have 8 bits, only that sizeof(char) returns 1. You can get the number of bits in a char with CHAR_BIT in limits.h. There are few if any platforms today where this will be something other than 8, though.
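As a rough illustration of `CHAR_BIT`, the following sketch computes how many bits of storage a type occupies. The helper name `storage_bits` is an assumption, not a standard facility. For types wider than `char` the result may include padding bits:

```cpp
#include <climits>
#include <cstddef>

// Hypothetical helper: sizeof counts chars, CHAR_BIT says how many bits each has.
template <typename T>
constexpr std::size_t storage_bits() {
    return sizeof(T) * CHAR_BIT;
}
```

On the common platforms `storage_bits<char>()` is 8, but the only portable promise is that it equals `CHAR_BIT` and is at least 8.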

There is a nice summary of this issue here.

As others have mentioned since I posted this, you're better off using int8_t and uint8_t if you really want to represent small integers.
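A short sketch of using the fixed-width types as small integers. The helper `wrapping_add` is made up for illustration. Keep in mind that `int8_t` and `uint8_t` are optional and exist only where an exactly 8-bit type is available:

```cpp
#include <cstdint>
#include <limits>

// uint8_t arithmetic is done in int after promotion; the cast back wraps modulo 256.
constexpr std::uint8_t wrapping_add(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}

// Where it exists, int8_t has exactly the range -128 to 127.
constexpr int int8_lowest  = std::numeric_limits<std::int8_t>::min();
constexpr int int8_highest = std::numeric_limits<std::int8_t>::max();
```

One practical caveat: these are usually typedefs for character types, so streaming a `uint8_t` to `std::cout` tends to print a character. Converting to `int` first prints the number.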

Answer by Johannes Schaub - litb

Because I feel it's really called for, I just want to state some rules of C and C++ (they are the same in this regard). First, all bits of unsigned char participate in determining the value of any unsigned char object. Second, unsigned char is explicitly stated to be unsigned.

Now, I had a discussion with someone about what happens when you convert the value -1 of type int to unsigned char. He rejected the idea that the resulting unsigned char has all its bits set to 1, because he was worried about sign representation. But he doesn't have to be. It follows immediately from this rule that the conversion does what is intended:

If the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type. (6.3.1.3p2 in a C99 draft)
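The quoted rule can be spelled out literally. This is a sketch only: the function name `convert_by_rule` is made up, and real compilers do not loop like this. It just shows that following the wording step by step lands on the same value as the built-in conversion:

```cpp
#include <climits>

// Hypothetical, literal reading of the rule: add or subtract UCHAR_MAX + 1
// until the value lies in [0, UCHAR_MAX]. Recursion keeps it valid in C++11.
constexpr long convert_by_rule(long value) {
    return value < 0         ? convert_by_rule(value + (UCHAR_MAX + 1L))
         : value > UCHAR_MAX ? convert_by_rule(value - (UCHAR_MAX + 1L))
                             : value;
}

// Same result as the built-in conversion, independent of sign representation.
static_assert(convert_by_rule(-1) == static_cast<unsigned char>(-1),
              "rule and built-in conversion agree for -1");
```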

That's a mathematical description. C++ describes it in terms of modular arithmetic, which yields the same rule. Anyway, what is not guaranteed is that all bits in the integer -1 are one before the conversion. So, what do we have so we can claim that the resulting unsigned char has all its CHAR_BIT bits turned to 1?

  1. All bits participate in determining its value - that is, no padding bits occur in the object.
  2. Adding UCHAR_MAX + 1 to -1 just once yields a value in range, namely UCHAR_MAX.

That's enough, actually! So whenever you want to have an unsigned char with all its bits set to one, you do

unsigned char c = (unsigned char)-1;

It also follows that a conversion is not just truncating higher-order bits. Fortunately, for two's complement it is just a truncation, but the same isn't necessarily true for other sign representations.
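The "all bits one" claim can be checked at compile time. This sketch uses a made-up helper, `count_set_bits`, and counts the set bits of the converted value:

```cpp
#include <climits>

// Hypothetical helper: counts the 1 bits of a value, one bit per recursion step.
constexpr int count_set_bits(unsigned value) {
    return value == 0 ? 0
                      : static_cast<int>(value & 1u) + count_set_bits(value >> 1);
}

// Converting -1 yields UCHAR_MAX whatever the sign representation of int is.
constexpr unsigned char all_ones = static_cast<unsigned char>(-1);

static_assert(all_ones == UCHAR_MAX, "-1 converts to the maximum value");
static_assert(count_set_bits(all_ones) == CHAR_BIT, "every value bit is set");
```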

Answer by Zachary Garrett

As for example uses of unsigned char:

unsigned char is often used in computer graphics, which very often (though not always) assigns a single byte to each colour component. It is common to see an RGB (or RGBA) colour represented as 24 (or 32) bits, with each component an unsigned char. Since unsigned char values fall in the range [0,255], the values are typically interpreted as:

  • 0 meaning a total lack of a given colour component.
  • 255 meaning 100% of a given colour pigment.

So you would end up with RGB red as (255,0,0) -> (100% red, 0% green, 0% blue).

Why not use a signed char? Arithmetic and bit shifting become problematic. As explained already, a signed char's range is essentially shifted by -128. A very simple and naive (mostly unused) method for converting RGB to grayscale is to average all three colour components, but this runs into problems when the values of the colour components are negative. Red (255, 0, 0) averages to (85, 85, 85) when using unsigned char arithmetic. However, if the values were signed chars (127,-128,-128), we would end up with (-99, -99, -99), which would be (29, 29, 29) in our unsigned char space, which is incorrect.
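A minimal sketch of the naive averaging on unsigned components only. The `Rgb` struct and `naive_gray` function are made up for illustration and not taken from any graphics library:

```cpp
// Hypothetical pixel type: one unsigned char per colour component.
struct Rgb {
    unsigned char r;
    unsigned char g;
    unsigned char b;
};

// Naive grayscale: the operands are promoted to int before the addition,
// so the sum (at most 3 * 255 with 8-bit bytes) cannot overflow.
constexpr unsigned char naive_gray(Rgb c) {
    return static_cast<unsigned char>((c.r + c.g + c.b) / 3);
}
```

For pure red, `naive_gray(Rgb{255, 0, 0})` gives 85, matching the (85, 85, 85) result mentioned above.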

Answer by jbleners

If you want to use a character as a small integer, the safest way to do it is with the int8_t and uint8_t types.

Answer by munna

unsigned char takes only non-negative values, like 0 to 255, whereas signed char takes both positive and negative values, like -128 to +127.

Answer by bk1e

char and unsigned char aren't guaranteed to be 8-bit types on all platforms; they are guaranteed to be 8-bit or larger. Some platforms have 9-bit, 32-bit, or 64-bit bytes. However, the most common platforms today (Windows, Mac, Linux x86, etc.) have 8-bit bytes.

Answer by James Hopkin

signed char has range -128 to 127; unsigned char has range 0 to 255.

char will be equivalent to either signed char or unsigned char, depending on the compiler, but is a distinct type.

If you're using C-style strings, just use char. If you need to use chars for arithmetic (pretty rare), specify signed or unsigned explicitly for portability.

Answer by Zac Gochenour

An unsigned char is an unsigned byte value (0 to 255). You may be thinking of char in terms of being a "character", but it is really a numerical value. A regular char is signed on many platforms, which leaves 128 non-negative values (0 to 127), and these values map to characters using the ASCII encoding. But in either case, what you are storing in memory is a byte value.

Answer by Julienne Walker

In terms of direct values, a regular char is used when the values are known to be between CHAR_MIN and CHAR_MAX, while an unsigned char provides double the range on the positive end. For example, if CHAR_BIT is 8, the range of regular char is only guaranteed to be [0, 127] (because it can be signed or unsigned), while unsigned char will be [0, 255] and signed char will be [-127, 127].

In terms of what it's used for, the standards allow objects of POD (plain old data) types to be directly converted to an array of unsigned char. This allows you to examine the representation and bit patterns of the object. The same guarantee of safe type punning doesn't exist for char or signed char.
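One way to look at an object's representation is to copy it into an array of unsigned char. The sketch below uses a made-up helper, `object_bytes`, and goes through `std::memcpy` rather than a pointer cast. It assumes a trivially copyable type, which covers POD types:

```cpp
#include <array>
#include <cstring>
#include <type_traits>

// Hypothetical helper: returns a copy of the bytes that make up `value`.
template <typename T>
std::array<unsigned char, sizeof(T)> object_bytes(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "only trivially copyable types can be inspected this way");
    std::array<unsigned char, sizeof(T)> bytes{};
    std::memcpy(bytes.data(), &value, sizeof(T));  // byte-for-byte copy
    return bytes;
}
```

The order of the bytes you get back depends on the platform's endianness, so `object_bytes(1)` may have its 1 in the first or the last element.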