Notice: this page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/300808/

Time: 2020-08-27 14:25:49  Source: igfitidea

C++: how to cast 2 bytes in an array to an unsigned short

Tags: c++, pointers, casting

Asked by user38784

I have been working on a legacy C++ application and am definitely outside of my comfort-zone (a good thing). I was wondering if anyone out there would be so kind as to give me a few pointers (pun intended).

I need to cast 2 bytes in an unsigned char array to an unsigned short. The bytes are consecutive.

For an example of what I am trying to do:

I receive a string from a socket and place it in an unsigned char array. I can ignore the first byte, and then the next 2 bytes should be converted to an unsigned short. This will run on Windows only, so there are no big/little-endian issues (that I am aware of).

Here is what I have now (not working obviously):

//packetBuffer is an unsigned char array containing the string "123456789" for testing
//I need to convert bytes 2 and 3 into the short, 2 being the most significant byte
//so I would expect to get 515 (2*256 + 3) instead all the code I have tried gives me
//either errors or 2 (only converting one byte)
unsigned short myShort;
myShort = static_cast<unsigned_short>(packetBuffer[1])

Answered by Johannes Schaub - litb

Well, you are widening the char into a short value. What you want is to interpret two bytes as a short. static_cast cannot cast from unsigned char* to unsigned short*. You have to cast to void*, then to unsigned short*:

unsigned short *p = static_cast<unsigned short*>(static_cast<void*>(&packetBuffer[1]));

Now you can dereference p and get the short value. But the problem with this approach is that you cast from unsigned char* to void* and then to some different type. The Standard doesn't guarantee that the address remains the same (and in addition, dereferencing that pointer would be undefined behavior). A better approach is to use bit-shifting, which will always work:

unsigned short p = (packetBuffer[1] << 8) | packetBuffer[2];
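
As a minimal sketch, the shift approach can be wrapped in a small function. The name read_be16 and its offset parameter are assumptions made here for illustration, not part of the original answer, and the sketch assumes 8-bit bytes:

```cpp
#include <cstddef>

// Hypothetical helper (not from the original answer): reads two consecutive
// bytes as a 16-bit value, treating buf[offset] as the most significant byte.
inline unsigned short read_be16(const unsigned char* buf, std::size_t offset)
{
    // Each unsigned char operand is promoted to int before the shift,
    // so the high byte is not lost.
    return static_cast<unsigned short>((buf[offset] << 8) | buf[offset + 1]);
}
```

With the question's layout this would be called as read_be16(packetBuffer, 1). For the bytes 0x02, 0x03 it yields 515. For the test string "123456789" on an ASCII system the same call yields 0x3233, because the characters '2' and '3' are stored as 0x32 and 0x33.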

Answered by ctacke

This is probably well below what you care about, but keep in mind that you could easily get an unaligned access doing this. x86 is forgiving: the abort that the unaligned access causes is caught internally and ends up as a copy and return of the value, so your app won't know any different (though it's significantly slower than an aligned access). If, however, this code will run on a non-x86 platform (you don't mention the target platform, so I'm assuming x86 desktop Windows), then doing this can cause a processor data abort, and you'll have to manually copy the data to an aligned address before trying to cast it.

In short, if you're going to be doing this access a lot, you might look at adjusting the code so that it has no unaligned reads, and you'll see a performance benefit.
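
One way to find out whether a given read would be unaligned is to test the address before casting. A rough sketch, assuming C++11; is_aligned_for is a hypothetical helper, and the pointer-to-integer conversion it uses is implementation-defined (on mainstream platforms it yields the numeric address):

```cpp
#include <cstdint>

// Hypothetical helper: true if p is suitably aligned to be read as a T.
// Assumes the std::uintptr_t value of a pointer is its numeric address,
// which holds on mainstream platforms but is implementation-defined.
template <typename T>
bool is_aligned_for(const void* p)
{
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}
```

For the question's buffer, is_aligned_for<unsigned short>(&packetBuffer[1]) reports whether the two bytes start at a suitably aligned address; if they do not, copying them out is the safer route than casting.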

Answered by old_timer

The bit shift above has a bug:

unsigned short p = (packetBuffer[1] << 8) | packetBuffer[2];

If packetBuffer holds bytes (8 bits wide) and the shift is carried out at that 8-bit width, then the shift above turns packetBuffer[1] into zero, leaving you with only packetBuffer[2]. (A conforming C++ compiler promotes the unsigned char to int before shifting, so this only bites where that promotion does not happen.)

Despite that, shifting is still preferable to pointers. To avoid the above problem, I spend a few extra lines of code; at anything other than literally zero optimization, this results in the same machine code:

unsigned short p;
p = packetBuffer[1]; p <<= 8; p |= packetBuffer[2];

Or to save some clock cycles and not shift the bits off the end:

unsigned short p;
p = (((unsigned short)packetBuffer[1])<<8) | packetBuffer[2];
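
For reference, on a conforming C++ compiler an unsigned char operand is promoted to int before the shift, so the plain and the explicitly cast forms should produce the same value. A small sketch to confirm that on a given toolchain, assuming C++11 and an int wider than 16 bits; combine_plain and combine_cast are hypothetical names:

```cpp
#include <type_traits>

// Compile-time check: shifting an unsigned char yields an int,
// not an 8-bit result.
static_assert(std::is_same<decltype(static_cast<unsigned char>(0) << 8), int>::value,
              "unsigned char operands are promoted to int before shifting");

// The single-expression form, without a cast on the high byte.
inline unsigned short combine_plain(unsigned char hi, unsigned char lo)
{
    return static_cast<unsigned short>((hi << 8) | lo);
}

// The form with an explicit cast on the high byte.
inline unsigned short combine_cast(unsigned char hi, unsigned char lo)
{
    return static_cast<unsigned short>((static_cast<unsigned short>(hi) << 8) | lo);
}
```

If the static_assert fails to compile, or the two functions disagree, the compiler is not applying the usual integer promotions and the explicit cast is the safer spelling.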

You have to be careful with pointers: the optimizer will bite you, as will memory alignment and a long list of other problems. Yes, done right it is faster; done wrong, the bug can linger for a long time and strike when least desired.

Say you were lazy and wanted to do some 16-bit math on an 8-bit array (little endian):

unsigned short *s;
unsigned char b[10];

s=(unsigned short *)&b[0];

if(b[0]&7)
{
   *s = *s+8;
   *s &= ~7;
}

do_something_With(b);

*s=*s+8;

do_something_With(b);

*s=*s+8;

do_something_With(b);

There is no guarantee that a perfectly bug-free compiler will create the code you expect. The byte array b sent to the do_something_With() function may never get modified by the *s operations. Nothing in the code above says that it should. If you don't optimize your code, you may never see this problem (until someone does optimize, or changes compilers or compiler versions). If you use a debugger, you may never see this problem (until it is too late).

The compiler doesn't see the connection between s and b; to it they are two completely separate items. The optimizer may choose not to write *s back to memory because it sees that *s has a number of operations, so it can keep that value in a register and only save it to memory at the end (if ever).

There are three basic ways to fix the pointer problem above:

  1. Declare s as volatile.
  2. Use a union.
  3. Use a function or functions whenever changing types.
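
A rough sketch of option 3, assuming the 16-bit value is stored little endian in the first two bytes of b, as in the example above. load_le16, store_le16 and add_le16 are hypothetical helper names; they touch only the unsigned char objects themselves, so no unsigned short pointer ever aliases the array:

```cpp
// Hypothetical helper: reads b[0] (low byte) and b[1] (high byte)
// as one 16-bit little-endian value.
inline unsigned short load_le16(const unsigned char* b)
{
    return static_cast<unsigned short>(b[0] | (b[1] << 8));
}

// Hypothetical helper: writes v back as two little-endian bytes.
inline void store_le16(unsigned char* b, unsigned short v)
{
    b[0] = static_cast<unsigned char>(v & 0xFF);
    b[1] = static_cast<unsigned char>((v >> 8) & 0xFF);
}

// Same arithmetic as "*s = *s + 8", done through the byte array itself.
inline void add_le16(unsigned char* b, unsigned short delta)
{
    store_le16(b, static_cast<unsigned short>(load_le16(b) + delta));
}
```

Each *s = *s + 8 step then becomes add_le16(b, 8), followed by do_something_With(b).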

Answered by ilkayaktas

Maybe this is a very late solution, but I just want to share it with you. When you want to convert primitives or other types, you can use a union. See below:

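union CharToStruct {
    char charArray[2];
    unsigned short value;
};

// Note: reading a union member other than the one last written is
// guaranteed only in C; in C++ it is formally undefined, though common
// compilers support it. The byte swap below assumes a little-endian host.
unsigned short toShort(const char* value){
    CharToStruct cs;
    cs.charArray[0] = value[1]; // least significant byte of the short comes first in memory
    cs.charArray[1] = value[0]; // most significant byte
    return cs.value;
}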

When you create an array with the hex values below and call the toShort function, you will get a short with the value 3.

char array[2];
array[0] = 0x00;
array[1] = 0x03;
short i = toShort(array);
std::cout << i << std::endl; // or printf("%hd", i);

Answered by sep

You should not cast an unsigned char pointer to an unsigned short pointer (or, for that matter, cast from a pointer to a smaller data type to a pointer to a larger one). This is because the address is then assumed to be correctly aligned for the larger type. A better approach is to shift the bytes into a real unsigned short object, or to memcpy them into an unsigned short array.

No doubt, you can adjust the compiler settings to get around this limitation, but this is a very subtle thing that will break in the future if the code gets passed around and reused.
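
A minimal sketch of the memcpy variant; load_native16 is a hypothetical name, and the result is in the host's byte order:

```cpp
#include <cstring>

// Hypothetical helper: copies sizeof(unsigned short) bytes from src into a
// properly aligned unsigned short object and returns it. The bytes are
// interpreted in the host's native byte order.
inline unsigned short load_native16(const unsigned char* src)
{
    unsigned short value;
    std::memcpy(&value, src, sizeof value);
    return value;
}
```

On little-endian Windows, load_native16(&packetBuffer[1]) treats packetBuffer[1] as the low byte, so the bytes 0x02, 0x03 give 0x0302. If packetBuffer[1] must be the most significant byte, as in the question, the shift approach gives that order directly.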

Answered by PiNoYBoY82

unsigned short myShort = *(unsigned short *)&packetBuffer[1];

Answered by arul

static_cast has a different syntax, plus you need to work with pointers. Since static_cast cannot convert unsigned char* to unsigned short* directly, the pointer conversion needs reinterpret_cast:

unsigned short *myShort = reinterpret_cast<unsigned short*>(&packetBuffer[1]);

Answered by Martin York

Did nobody see that the input was a string?!

/* If it is a string as explicitly stated in the question.
 */
int byte1 = packetBuffer[1] - '0'; // convert 1st byte from char to number.
int byte2 = packetBuffer[2] - '0';

unsigned short result = (byte1 * 256) + byte2;

/* Alternatively, if it is an array of bytes.
 */
int byte1 = packetBuffer[1];
int byte2 = packetBuffer[2];

unsigned short result = (byte1 * 256) + byte2;

This also avoids the alignment problems that most of the other solutions may have on certain platforms. Note: a short is at least two bytes. Many systems will give you a memory error if you try to dereference a short pointer that is not 2-byte aligned (or aligned to whatever sizeof(short) is on your system)!

Answered by Martin York

char packetBuffer[] = {1, 2, 3};
unsigned short myShort = * reinterpret_cast<unsigned short*>(&packetBuffer[1]);

I (had to) do this all the time. Big endian is an obvious problem. What will really get you is incorrect data when the machine dislikes misaligned reads (and writes)!

You may want to write a test cast and an assert to see whether it reads properly. Then, when run on a big-endian machine or, more importantly, a machine that dislikes misaligned reads, an assert error will occur instead of a weird, hard-to-trace 'bug' ;)
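
The byte-order half of such a check can be sketched without any pointer cast. host_is_little_endian is a hypothetical helper; note that it only detects byte order and says nothing about misaligned reads:

```cpp
#include <cstring>

// Hypothetical startup check: returns true when the host stores the
// least significant byte of an unsigned short first in memory.
inline bool host_is_little_endian()
{
    const unsigned short probe = 0x0102;
    unsigned char bytes[sizeof probe];
    std::memcpy(bytes, &probe, sizeof probe);
    return bytes[0] == 0x02;
}
```

Calling assert(host_is_little_endian()); once at startup would make a big-endian port fail loudly instead of silently producing swapped values.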

Answered by Richard

On windows you can use:

unsigned short i = MAKEWORD(lowbyte,hibyte);
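
For the question's layout, where packetBuffer[1] is the high byte, the call would be MAKEWORD(packetBuffer[2], packetBuffer[1]), since the macro takes the low byte first. Outside Windows, the same combination can be sketched as below; make_word is a hypothetical stand-in written for illustration, not a Windows API:

```cpp
// Hypothetical portable stand-in for the Windows MAKEWORD macro:
// the first argument becomes the low byte, the second the high byte.
inline unsigned short make_word(unsigned char low, unsigned char high)
{
    return static_cast<unsigned short>(low | (high << 8));
}
```

With the bytes 0x02 and 0x03 from the question, make_word(0x03, 0x02) gives 515.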