C++: converting from signed char to unsigned char and back again?

Question

Asked by rbcc

I'm working with JNI and have an array of type jbyte, where jbyte is represented as a signed char, i.e. ranging from -128 to 127. The jbytes represent image pixels. For image processing, we usually want pixel components to range from 0 to 255. I therefore want to convert each jbyte value to the range 0 to 255 (i.e. the same range as unsigned char), do some calculations on the value, and then store the result as a jbyte again.

How can I do these conversions safely?

I managed to get this code to work, where a pixel value is incremented by 30 but clamped to the value 255, but I don't understand if it's safe or portable:

 #define CLAMP255(v) (v > 255 ? 255 : (v < 0 ? 0 : v))

 jbyte pixel = ...
 pixel = CLAMP255((unsigned char)pixel + 30);

I'm interested to know how to do this in both C and C++.

Answer 1

Answered by wich

This is one of the reasons why C++ introduced the new cast style, which includes static_cast and reinterpret_cast.

Converting from signed to unsigned can mean two different things. You might mean that the unsigned variable should contain the value of the signed variable modulo the maximum value of your unsigned type + 1. That is, if your signed char has a value of -128, then UCHAR_MAX+1 (256) is added for a value of 128, and if it has a value of -1, then UCHAR_MAX+1 is added for a value of 255. This is what static_cast does. On the other hand, you might mean that the bits in the memory referenced by some variable should be interpreted as an unsigned byte, regardless of the signed integer representation used on the system: bit value 0b10000000 should evaluate to 128, and bit value 0b11111111 to 255. This is accomplished with reinterpret_cast.

Now, for the two's complement representation these happen to be exactly the same thing, since -128 is represented as 0b10000000, -1 is represented as 0b11111111, and likewise for all values in between. However, other computers (usually older architectures) may use a different signed representation, such as sign-and-magnitude or ones' complement. In ones' complement the bit value 0b10000000 would not be -128 but -127, so a static_cast to unsigned char would make this 129, while a reinterpret_cast would make it 128. Additionally, in ones' complement the bit value 0b11111111 would not be -1 but -0 (yes, this value exists in ones' complement), and would be converted to 0 with a static_cast but to 255 with a reinterpret_cast. Note that in ones' complement the unsigned value 128 cannot actually be represented in a signed char, since the range is -127 to 127 because of the -0 value.

I have to say that the vast majority of computers use two's complement, which makes the whole issue moot for just about anywhere your code will ever run. You will likely only ever see anything other than two's complement on very old architectures; think 1960s timeframe.

The syntax boils down to the following:

signed char x = -100;
unsigned char y;

y = (unsigned char)x;                    // C static
y = *(unsigned char*)(&x);               // C reinterpret
y = static_cast<unsigned char>(x);       // C++ static
y = reinterpret_cast<unsigned char&>(x); // C++ reinterpret
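
As a rough sketch of the difference between the two kinds of cast, the helpers below (byValue, byBits and checkCastsAgree are hypothetical names, not part of any library) apply the value conversion and the bit reinterpretation to every signed char value and check that they agree. The check is only expected to pass on a two's complement platform, which is the assumption here.

```cpp
#include <cassert>
#include <climits>

// Value conversion: the result is x modulo UCHAR_MAX + 1.
unsigned char byValue(signed char x) {
    return static_cast<unsigned char>(x);
}

// Bit reinterpretation: read the same byte as an unsigned char.
unsigned char byBits(signed char x) {
    return reinterpret_cast<unsigned char&>(x);
}

// Assumption: two's complement, where both routes give the same result.
void checkCastsAgree() {
    for (int v = SCHAR_MIN; v <= SCHAR_MAX; ++v) {
        const signed char x = static_cast<signed char>(v);
        assert(byValue(x) == byBits(x));
    }
}
```

On a ones' complement or sign-and-magnitude machine the assertion inside checkCastsAgree would be expected to fail for negative values, which is exactly the distinction described above.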

To do this in a nice C++ way with arrays:

jbyte memory_buffer[nr_pixels];
unsigned char* pixels = reinterpret_cast<unsigned char*>(memory_buffer);

or the C way:

unsigned char* pixels = (unsigned char*)memory_buffer;
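
To show how the pointer cast might be used on a whole buffer, here is a minimal sketch. The function name brightenInPlace and its parameters are made up for illustration, and jbyte is declared locally as a stand-in for the JNI typedef so that the sketch builds without <jni.h>.

```cpp
#include <cassert>
#include <cstddef>

using jbyte = signed char;  // assumption: stand-in for the typedef from <jni.h>

// Adds delta to every pixel, clamping the result to 0..255.
void brightenInPlace(jbyte* buffer, std::size_t count, int delta) {
    assert(buffer != nullptr || count == 0);
    // View the same memory as unsigned bytes (0..255).
    unsigned char* pixels = reinterpret_cast<unsigned char*>(buffer);
    for (std::size_t i = 0; i < count; ++i) {
        int v = pixels[i] + delta;  // computed as int, so it cannot wrap at 8 bits
        if (v > 255) v = 255;
        if (v < 0) v = 0;
        pixels[i] = static_cast<unsigned char>(v);
    }
}
```

Writing through the unsigned char pointer avoids converting back to a signed type at all: the bytes in the jbyte array are simply overwritten in place.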

Answer 2

Answered by qbert220

Yes this is safe.

The C language uses a feature called integer promotion to increase the number of bits in a value before performing calculations. Therefore your CLAMP255 macro will operate at int (probably 32-bit) precision. The result is assigned to a jbyte, which reduces the integer precision back to the 8 bits that fit into the jbyte.
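
The promotion can be checked at compile time. The following is only a sketch, with addThirty as a hypothetical helper name; it assumes a typical platform where int is wider than unsigned char.

```cpp
#include <cassert>
#include <type_traits>

// unsigned char + int is computed as int after integer promotion.
static_assert(
    std::is_same<decltype(static_cast<unsigned char>(0) + 30), int>::value,
    "the sum is computed as int");

// Returns the unclamped sum, which may exceed 255.
int addThirty(signed char pixel) {
    const int sum = static_cast<unsigned char>(pixel) + 30;
    assert(sum >= 30 && sum <= 285);  // no 8-bit wrap-around happened
    return sum;
}
```

Because the sum is an int, a value such as 255 + 30 is still 285 when it reaches the clamp, rather than having wrapped around to 29.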

Answer 3

Answered by Daniel Hilgarth

Do you realize that CLAMP255 returns 0 for v < 0 and 255 for v >= 0?
IMHO, CLAMP255 should be defined as:

#define CLAMP255(v) (v > 255 ? 255 : (v < 0 ? 0 : v))

Difference: If v is not greater than 255 and not less than 0: return v instead of 255
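
One possible alternative to the macro, sketched here under the assumption that C++11 or later is available, is a small function. The names clamp255 and clampNext are hypothetical. A function evaluates its argument exactly once, whereas the macro may evaluate v up to three times and has no parentheses around it.

```cpp
#include <cassert>

// Same logic as the macro, as a function.
constexpr int clamp255(int v) {
    return v > 255 ? 255 : (v < 0 ? 0 : v);
}

// Increments counter once and returns the clamped new value.
// With the macro, ++counter could be evaluated more than once.
int clampNext(int& counter) {
    const int result = clamp255(++counter);
    assert(result >= 0 && result <= 255);
    return result;
}
```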

Answer 4

Answered by Simon Richter

There are two ways to interpret the input data: either -128 is the lowest value and 127 is the highest (i.e. true signed data), or 0 is the lowest value, 127 is somewhere in the middle, and the next "higher" number is -128, with -1 being the "highest" value (that is, the most significant bit already got misinterpreted as a sign bit in two's complement notation).

Assuming you mean the latter, the formally correct way is

signed char in = ...
unsigned char out = (in < 0)?(in + 256):in;

which at least gcc properly recognizes as a no-op.
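
As a small sketch of that expression wrapped in a function (toUnsigned and checkMatchesStaticCast are hypothetical names), the check below compares it with a plain static_cast for every signed char value. It assumes 8-bit chars, which the constant 256 already implies.

```cpp
#include <cassert>
#include <climits>

// Maps -128..-1 to 128..255 and leaves 0..127 unchanged.
unsigned char toUnsigned(signed char in) {
    return static_cast<unsigned char>((in < 0) ? (in + 256) : in);
}

// The explicit arithmetic and the value conversion agree for every input.
void checkMatchesStaticCast() {
    for (int v = SCHAR_MIN; v <= SCHAR_MAX; ++v) {
        const signed char in = static_cast<signed char>(v);
        assert(toUnsigned(in) == static_cast<unsigned char>(in));
    }
}
```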

Answer 5

Answered by ZeRemz

I'm not 100% sure that I understand your question, so tell me if I'm wrong.

If I got it right, you are reading jbytes that are technically signed chars, but really pixel values ranging from 0 to 255, and you're wondering how you should handle them without corrupting the values in the process.

Then, you should do the following:

Convert jbytes to unsigned char before doing anything else; this will definitely restore the pixel values you are trying to manipulate.
Use a larger signed integer type, such as int, for intermediate calculations, to make sure that overflows and underflows can be detected and dealt with (in particular, not casting to a signed type could force the compiler to promote every type to an unsigned type, in which case you wouldn't be able to detect underflows later on).
When assigning back to a jbyte, clamp your value to the 0-255 range, convert to unsigned char and then convert again to signed char. I'm not certain the first conversion is strictly necessary, but you just can't be wrong if you do both.

For example:

inline int fromJByte(jbyte pixel) {
    // cast to unsigned char re-interprets values as 0-255
    // cast to int will make intermediate calculations safer
    return static_cast<int>(static_cast<unsigned char>(pixel));
}

inline jbyte fromInt(int pixel) {
    if(pixel < 0)
        pixel = 0;

    if(pixel > 255)
        pixel = 255;

    return static_cast<jbyte>(static_cast<unsigned char>(pixel));
}

jbyte in = ...
int intermediate = fromJByte(in) + 30;
jbyte out = fromInt(intermediate);
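
A quick way to gain confidence in these two helpers is a round-trip check over all 256 byte values. The sketch below repeats the helpers so that it stands alone; jbyte is declared locally as a stand-in for the JNI typedef, and checkRoundTrip is a hypothetical name. Before C++20, converting a value above 127 back to a signed char is implementation-defined, so the check assumes the usual two's complement wrap-around.

```cpp
#include <cassert>
#include <climits>

using jbyte = signed char;  // assumption: stand-in for the typedef from <jni.h>

inline int fromJByte(jbyte pixel) {
    return static_cast<int>(static_cast<unsigned char>(pixel));
}

inline jbyte fromInt(int pixel) {
    if (pixel < 0) pixel = 0;
    if (pixel > 255) pixel = 255;
    return static_cast<jbyte>(static_cast<unsigned char>(pixel));
}

// Every jbyte should survive the trip to 0..255 and back unchanged.
void checkRoundTrip() {
    for (int v = SCHAR_MIN; v <= SCHAR_MAX; ++v) {
        const jbyte b = static_cast<jbyte>(v);
        assert(fromInt(fromJByte(b)) == b);
    }
}
```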
