C++: What is an unsigned char?
Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and authors, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/75191/
What is an unsigned char?
Asked by Landon Kuhn
In C/C++, what is an unsigned char used for? How is it different from a regular char?
Answer by Fruny
In C++, there are three distinct character types:
- char
- signed char
- unsigned char
If you are using character types for text, use the unqualified char:
- it is the type of character literals like 'a' or '0'.
- it is the type that makes up C strings like "abcde".
It also works as a numeric value, but whether that value is treated as signed or unsigned is implementation-defined. Beware of comparing characters with inequalities (< or >), although if you limit yourself to ASCII (0-127) you're just about safe.
If you are using character types as numbers, use:
- signed char, which gives you at least the -127 to 127 range (-128 to 127 is common).
- unsigned char, which gives you at least the 0 to 255 range.
"At least", because the C++ standard only gives the minimum range of values that each numeric type is required to cover. sizeof (char) is required to be 1 (i.e. one byte), but a byte could in theory be, for example, 32 bits. sizeof would still report its size as 1, meaning that you could have sizeof (char) == sizeof (long) == 1.
Answer by Todd Gamblin
This is implementation dependent, as the C standard does NOT define the signedness of char. Depending on the platform, char may be signed or unsigned, so you need to explicitly ask for signed char or unsigned char if your implementation depends on it. Just use char if you intend to represent characters from strings, as this will match what your platform puts in the string.
The difference between signed char and unsigned char is as you'd expect. On most platforms, signed char will be an 8-bit two's complement number ranging from -128 to 127, and unsigned char will be an 8-bit unsigned integer (0 to 255). Note the standard does NOT require that char types have exactly 8 bits, only that sizeof(char) is 1. You can get the number of bits in a char from CHAR_BIT in limits.h. There are few if any platforms today where this will be something other than 8, though.
There is a nice summary of this issue here.
As others have mentioned since I posted this, you're better off using int8_t and uint8_t if you really want to represent small integers.
Answer by Johannes Schaub - litb
Because I feel it's really called for, I just want to state some rules of C and C++ (they are the same in this regard). First, all bits of unsigned char participate in determining the value of any unsigned char object. Second, unsigned char is explicitly stated to be unsigned.
Now, I had a discussion with someone about what happens when you convert the value -1 of type int to unsigned char. He rejected the idea that the resulting unsigned char has all its bits set to 1, because he was worried about sign representation. But he doesn't have to be. It follows immediately from this rule that the conversion does what is intended:
If the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type. (6.3.1.3p2 in a C99 draft)
That's a mathematical description. C++ describes it in terms of modulo arithmetic, which yields the same rule. Anyway, what is not guaranteed is that all bits in the integer -1 are one before the conversion. So, what do we have so that we can claim that the resulting unsigned char has all its CHAR_BIT bits set to 1?
- All bits participate in determining its value, that is, no padding bits occur in the object.
- Adding UCHAR_MAX+1 to -1 only once will yield a value in range, namely UCHAR_MAX.
That's enough, actually! So whenever you want to have an unsigned char with all its bits set to one, you do:
unsigned char c = (unsigned char)-1;
It also follows that a conversion is not just truncating higher-order bits. The fortunate thing about two's complement is that the conversion is just a truncation there, but the same isn't necessarily true for other sign representations.
Answer by Zachary Garrett
As for example usages of unsigned char:
unsigned char is often used in computer graphics, which very often (though not always) assigns a single byte to each colour component. It is common to see an RGB (or RGBA) colour represented as 24 (or 32) bits, with each component an unsigned char. Since unsigned char values fall in the range [0,255], the values are typically interpreted as:
- 0 meaning a total lack of a given colour component.
- 255 meaning 100% of a given colour component.
So you would end up with RGB red as (255,0,0) -> (100% red, 0% green, 0% blue).
Why not use a signed char? Arithmetic and bit shifting become problematic. As explained already, a signed char's range is essentially shifted by -128. A very simple and naive (mostly unused) method for converting RGB to grayscale is to average all three colour components, but this runs into problems when the values of the colour components are negative. Red (255, 0, 0) averages to (85, 85, 85) when using unsigned char arithmetic. However, if the same bytes were read as signed chars, 255 would come back as -1 on a typical 8-bit two's complement platform, so red would be (-1, 0, 0) and we would end up with (0, 0, 0), i.e. black, which is incorrect.
Answer by jbleners
If you want to use a character as a small integer, the safest way to do it is with the int8_t and uint8_t types.
Answer by munna
unsigned char takes only non-negative values, typically 0 to 255, whereas signed char takes both positive and negative values, typically -128 to +127.
Answer by bk1e
char and unsigned char aren't guaranteed to be 8-bit types on all platforms; they are guaranteed to be 8-bit or larger. Some platforms have 9-bit, 32-bit, or 64-bit bytes. However, the most common platforms today (Windows, Mac, Linux x86, etc.) have 8-bit bytes.
Answer by James Hopkin
With 8-bit bytes, signed char typically has range -128 to 127; unsigned char has range 0 to 255.
char will have the same range as either signed char or unsigned char, depending on the compiler, but is a distinct type.
If you're using C-style strings, just use char. If you need to use chars for arithmetic (pretty rare), specify signed or unsigned explicitly for portability.
Answer by Zac Gochenour
An unsigned char is an unsigned byte value (0 to 255). You may be thinking of char in terms of being a "character", but it is really a numerical value. The regular char is signed on many platforms, which leaves 128 non-negative values (0 to 127), and these values map to characters using ASCII encoding. But in either case, what you are storing in memory is a byte value.
Answer by Julienne Walker
In terms of direct values, a regular char is used when the values are known to be between CHAR_MIN and CHAR_MAX, while an unsigned char provides double the range on the positive end. For example, if CHAR_BIT is 8, the range of regular char is only guaranteed to be [0, 127] (because it can be signed or unsigned), while unsigned char will be [0, 255] and signed char will be at least [-127, 127].
In terms of what it's used for, the standards allow objects of POD (plain old data) type to be examined directly as an array of unsigned char. This allows you to examine the representation and bit patterns of the object. The same guarantee of safe type punning doesn't exist for signed char in C++; plain char is also permitted there, but unsigned char is the usual choice because its values are never negative and it has no padding bits.