C++: converting a byte array (char array) to an integer type (short, int, long)

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/13678166/

Time: 2020-08-27 17:37:37  Source: igfitidea

Converting byte array (char array) to an integer type (short, int, long)

c++

Asked by Mike

I was wondering if system endianness matters when converting a byte array to a short / int / long. Would this be incorrect to do if the code runs on both big-endian and little-endian machines?

short s = (b[0] << 8) | (b[1]);
int i = (b[0] << 24) | (b[1] << 16) | (b[2] << 8) | (b[3]);

Answered by Eli Iser

Yes, endianness matters. In little endian you have the most significant byte in the upper part of the short or int - i.e. bits 8-15 for short and 24-31 for int. For big endian the byte order would need to be reversed:

short s = ((b[1] << 8) | b[0]);
int i = (b[3] << 24) | (b[2] << 16) | (b[1] << 8) | (b[0]);

Note that this assumes that the byte array is in little endian order. The conversion between a byte array and integer types depends not only on the endianness of the CPU but also on the endianness of the byte array data.

It is recommended to wrap these conversions in functions that will know (either via compilation flags or at run time) the endianness of the system and perform the conversion correctly.

In addition, creating a standard for the byte array data (always big endian, for example) and then using the socket functions ntohs and ntohl will offload the decision regarding endianness to the OS socket implementation, which is aware of such things. Note that the default network order is big endian (the n in ntohs and ntohl stands for network), so having the byte array data as big endian would be the most straightforward way to do this.

As pointed out by the OP (@Mike), Boost also provides endianness conversion functions.

Answered by igntec

// on little endian:
unsigned char c[] = { 1, 0 };       // "one" in little endian order { LSB, MSB }
int a = (c[1] << 8) | c[0];         // a = 1
//----------------------------------------------------------------------------
// on big endian:
unsigned char c[] = { 0, 1 };       // "one" in big endian order { MSB, LSB }
int a = (c[0] << 8) | c[1];         // a = 1
//----------------------------------------------------------------------------
// on little endian:
unsigned char c[] = { 0, 1 };       // "one" in big endian order { MSB, LSB }
int a = (c[0] << 8) | c[1];         // a = 1 (reverse byte order)
//----------------------------------------------------------------------------
// on big endian:
unsigned char c[] = { 1, 0 };       // "one" in little endian order { LSB, MSB }
int a = (c[1] << 8) | c[0];         // a = 1 (reverse byte order)

Answered by KOLANICH

You can use unions for this. Endianness matters; to change it you can use the x86 BSWAP instruction (or its analogues on other platforms), which most C compilers provide as an intrinsic.

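#include <stdio.h>

/* Assumes 1-byte char, 2-byte short, 4-byte int and 8-byte long long.
   Reading a union member other than the one last written is allowed in C;
   in C++ it is formally undefined behavior, although many compilers accept it. */
typedef union{
  unsigned char bytes[8];
  unsigned short int words[4];
  unsigned int dwords[2];
  unsigned long long int qword;
} test;

int main(void){
  /* sizeof yields a size_t, which is printed with %zu */
  printf("%zu %zu %zu %zu %zu\n", sizeof(char), sizeof(short), sizeof(int), sizeof(long), sizeof(long long));
  test t;
  t.qword=0x0001020304050607ull;
  /* The same 8 bytes viewed as bytes, 16-bit words, 32-bit dwords and one 64-bit qword */
  printf("%02hhX|%02hhX|%02hhX|%02hhX|%02hhX|%02hhX|%02hhX|%02hhX\n",t.bytes[0],t.bytes[1] ,t.bytes[2],t.bytes[3],t.bytes[4],t.bytes[5],t.bytes[6],t.bytes[7]);
  printf("%04hX|%04hX|%04hX|%04hX\n" ,t.words[0] ,t.words[1] ,t.words[2] ,t.words[3]);
  printf("%08X|%08X\n" ,t.dwords[0] ,t.dwords[1]);   /* unsigned int takes %X, not %lX */
  printf("%016llX\n" ,t.qword);                       /* %llX is the standard form of %qX */
  return 0;
}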

Answered by melpomene

No, that's fine as far as endianness is concerned, but you may have problems if your ints are only 16 bits wide.

Answered by xaxxon

The problem as you've specified, where you are using an existing byte array, will work fine across all machines. You will end up with the same answer.

However, depending on how you are creating that stream, it may be affected by endianness and you may not end up with the number you think you will.
