C++: converting a byte array (char array) to integer types (short, int, long)

Question

Asked by Mike

I was wondering whether system endianness matters when converting a byte array to a short / int / long. Would the following be incorrect if the code runs on both big-endian and little-endian machines?

short s = (b[0] << 8) | (b[1]);
int i = (b[0] << 24) | (b[1] << 16) | (b[2] << 8) | (b[3]);

Answer 1

Answered by Eli Iser

Yes, endianness matters. The most significant byte has to end up in the upper part of the short or int, i.e. bits 8-15 for a short and bits 24-31 for an int. Your code does that for a byte array in big-endian order; for a byte array in little-endian order the indices would need to be reversed:

short s = ((b[1] << 8) | b[0]);
int i = (b[3] << 24) | (b[2] << 16) | (b[1] << 8) | (b[0]);

Note that this assumes the byte array is in little-endian order. Converting between a byte array and an integer type depends not only on the endianness of the CPU but also on the endianness of the byte array data.
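To make the dependence on the array's byte order concrete, here is a minimal sketch with one reader per byte order. The names read_le16 and read_be16 are made up for illustration and do not come from the answer; the cast to unsigned before shifting is an extra precaution I added.

```cpp
#include <cstdint>

// Hypothetical helper: interpret two bytes stored least significant byte first.
inline std::uint16_t read_le16(const unsigned char* b) {
    return static_cast<std::uint16_t>(
        (static_cast<unsigned>(b[1]) << 8) | b[0]);
}

// Hypothetical helper: interpret two bytes stored most significant byte first.
inline std::uint16_t read_be16(const unsigned char* b) {
    return static_cast<std::uint16_t>(
        (static_cast<unsigned>(b[0]) << 8) | b[1]);
}
```

Given the array { 0x01, 0x02 }, read_be16 yields 0x0102 and read_le16 yields 0x0201. Both results come from arithmetic on values, so they should not change with the CPU the code runs on.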

It is recommended to wrap these conversions in functions that know the endianness of the system (either via compilation flags or at run time) and perform the conversion correctly.
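One possible shape for such a wrapper is sketched below, using a run-time check. It assumes the data is a 32-bit big-endian value that is copied into an integer with memcpy and then swapped if the host turns out to be little-endian. The names host_is_little_endian, swap32 and load_be32 are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Run-time check: true when the host stores the least significant byte first.
inline bool host_is_little_endian() {
    const std::uint16_t probe = 1;
    unsigned char first;
    std::memcpy(&first, &probe, 1);   // look at the first byte in memory
    return first == 1;
}

// Reverse the four bytes of a 32-bit value.
inline std::uint32_t swap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Hypothetical wrapper: load a big-endian 32-bit value from a byte array.
inline std::uint32_t load_be32(const unsigned char* b) {
    std::uint32_t v;
    std::memcpy(&v, b, sizeof v);     // bytes land in host order
    return host_is_little_endian() ? swap32(v) : v;
}
```

A compile-time variant could replace the run-time check with a compiler-provided macro or, from C++20 on, std::endian from the <bit> header.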

In addition, creating a standard for the byte array data (always big endian, for example) and then using the socket functions ntohs and ntohl will offload the decision regarding endianness to the OS socket implementation, which is aware of such things. Note that the default network order is big endian (the n in ntohs and ntohl stands for network), so keeping the byte array data big endian would be the most straightforward way to do this.
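A minimal sketch of that approach follows, assuming a POSIX system where ntohs and ntohl are declared in <arpa/inet.h> (on Windows they are declared in <winsock2.h>). The wrapper names from_network16 and from_network32 are hypothetical.

```cpp
#include <arpa/inet.h>  // ntohs, ntohl (POSIX)
#include <cstdint>
#include <cstring>

// Hypothetical wrapper: the array holds a 16-bit value in network (big-endian) order.
inline std::uint16_t from_network16(const unsigned char* b) {
    std::uint16_t v;
    std::memcpy(&v, b, sizeof v);   // raw bytes, still in network order
    return ntohs(v);                // converted to host order by the platform
}

// Hypothetical wrapper: same idea for a 32-bit value.
inline std::uint32_t from_network32(const unsigned char* b) {
    std::uint32_t v;
    std::memcpy(&v, b, sizeof v);
    return ntohl(v);
}
```

With the bytes { 0x12, 0x34, 0x56, 0x78 }, from_network32 should return 0x12345678 on either kind of host, because ntohl is a no-op on big-endian machines and a byte swap on little-endian ones.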

As pointed out by the OP (@Mike), Boost also provides endianness conversion functions.

Answer 2

Answered by igntec

// on little endian:
unsigned char c[] = { 1, 0 };       // "one" in little endian order { LSB, MSB }
int a = (c[1] << 8) | c[0];         // a = 1

//----------------------------------------------------------------------------

// on big endian:
unsigned char c[] = { 0, 1 };       // "one" in big endian order { MSB, LSB }
int a = (c[0] << 8) | c[1];         // a = 1

//----------------------------------------------------------------------------

// on little endian:
unsigned char c[] = { 0, 1 };       // "one" in big endian order { MSB, LSB }
int a = (c[0] << 8) | c[1];         // a = 1 (reverse byte order)

//----------------------------------------------------------------------------

// on big endian:
unsigned char c[] = { 1, 0 };       // "one" in little endian order { LSB, MSB }
int a = (c[1] << 8) | c[0];         // a = 1 (reverse byte order)

Answer 3

Answered by KOLANICH

You can use unions for this. Endianness matters; to change it you can use the x86 BSWAP instruction (or its analogues on other platforms), which most C compilers provide as an intrinsic.

#include <stdio.h>
typedef union{
  unsigned char bytes[8];
  unsigned short int words[4];
  unsigned int dwords[2];
  unsigned long long int qword;
} test;
int main(){
  /* sizeof yields size_t, which is printed with %zu */
  printf("%zu %zu %zu %zu %zu\n", sizeof(char), sizeof(short), sizeof(int), sizeof(long), sizeof(long long));
  test t;
  t.qword=0x0001020304050607ull;
  /* the order of the bytes printed here shows the host byte order */
  printf("%02hhX|%02hhX|%02hhX|%02hhX|%02hhX|%02hhX|%02hhX|%02hhX\n",t.bytes[0],t.bytes[1] ,t.bytes[2],t.bytes[3],t.bytes[4],t.bytes[5],t.bytes[6],t.bytes[7]);
  printf("%04hX|%04hX|%04hX|%04hX\n" ,t.words[0] ,t.words[1] ,t.words[2] ,t.words[3]);
  printf("%08X|%08X\n" ,t.dwords[0] ,t.dwords[1]);
  printf("%016llX\n" ,t.qword);
  return 0;
}
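Reading a union member other than the one last written is well defined in C but is undefined behavior in C++, so for C++ code a memcpy-based variant may be the safer choice. The sketch below is one way to do that. It assumes GCC or Clang for __builtin_bswap64 and MSVC for _byteswap_uint64, with a plain loop as fallback; the names byteswap64 and bytes_of are hypothetical.

```cpp
#include <cstdint>
#include <cstring>
#if defined(_MSC_VER)
#include <stdlib.h>   // _byteswap_uint64
#endif

// Reverse the byte order of a 64-bit value, using a compiler intrinsic if available.
inline std::uint64_t byteswap64(std::uint64_t v) {
#if defined(_MSC_VER)
    return _byteswap_uint64(v);
#elif defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap64(v);
#else
    // Portable fallback: move the bytes one at a time.
    std::uint64_t r = 0;
    for (int i = 0; i < 8; ++i) {
        r = (r << 8) | (v & 0xFFu);
        v >>= 8;
    }
    return r;
#endif
}

// Copy the in-memory bytes of a value out, instead of reading an inactive union member.
inline void bytes_of(std::uint64_t v, unsigned char (&out)[8]) {
    std::memcpy(out, &v, sizeof v);
}
```

Calling bytes_of on a value and on its byteswap64 result should give the same eight bytes in opposite order, whatever the host byte order is.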

Answer 4

Answered by melpomene

No, that's fine as far as endianness is concerned, but you may have problems if int is only 16 bits wide on your platform.
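The width problem comes from integer promotion: each b[n] is promoted to int before the shift, so b[0] << 24 cannot work with a 16-bit int, and even with a 32-bit int a byte value of 0x80 or more is shifted into the sign bit. A sketch that sidesteps both by casting to a fixed-width unsigned type first (the names read_be32 and read_be64 are hypothetical, and big-endian data is assumed as in the question):

```cpp
#include <cstdint>

// Hypothetical helper: big-endian bytes to a 32-bit value, widening before each shift.
inline std::uint32_t read_be32(const unsigned char* b) {
    return (static_cast<std::uint32_t>(b[0]) << 24) |
           (static_cast<std::uint32_t>(b[1]) << 16) |
           (static_cast<std::uint32_t>(b[2]) <<  8) |
            static_cast<std::uint32_t>(b[3]);
}

// Hypothetical helper: same idea for eight bytes, accumulating in a 64-bit value.
inline std::uint64_t read_be64(const unsigned char* b) {
    std::uint64_t v = 0;
    for (int i = 0; i < 8; ++i) {
        v = (v << 8) | b[i];   // the shift happens in 64-bit unsigned arithmetic
    }
    return v;
}
```

Declaring the array as unsigned char (or std::uint8_t) also matters here, since a plain char may be signed and would then sign-extend values above 0x7F.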

Answer 5

Answered by xaxxon

The code as you've specified it, which reads from an existing byte array, will work the same way on all machines. You will end up with the same answer.

However, depending on how you create that byte stream, it may be affected by endianness, and you may not end up with the number you expect.
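To illustrate the difference on the writing side, here is a small sketch with two hypothetical writers. write_native32 copies the integer's in-memory bytes, so its output depends on the host; write_be32 emits the bytes by shifting, so its output is always big-endian and matches the expression in the question.

```cpp
#include <cstdint>
#include <cstring>

// Host-dependent: the byte order in out follows the CPU's own layout.
inline void write_native32(std::uint32_t v, unsigned char* out) {
    std::memcpy(out, &v, sizeof v);
}

// Host-independent: most significant byte first, on every machine.
inline void write_be32(std::uint32_t v, unsigned char* out) {
    out[0] = static_cast<unsigned char>(v >> 24);
    out[1] = static_cast<unsigned char>(v >> 16);
    out[2] = static_cast<unsigned char>(v >> 8);
    out[3] = static_cast<unsigned char>(v);
}
```

For the value 0x12345678, write_be32 always produces 12 34 56 78, while write_native32 produces 78 56 34 12 on a little-endian host and 12 34 56 78 on a big-endian one.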
