Detecting endianness programmatically in a C++ program

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/1001307/

Time: 2020-08-27 18:21:19  Source: igfitidea

Tags: c++, algorithm, endianness

Asked by Jay T

Is there a programmatic way to detect whether or not you are on a big-endian or little-endian architecture? I need to be able to write code that will execute on an Intel or PPC system and use exactly the same code (i.e. no conditional compilation).

Answer by David Cournapeau

I don't like the method based on type punning - the compiler will often warn against it. That's exactly what unions are for!

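#include <stdint.h>

bool is_big_endian(void)
{
    union {
        uint32_t i;
        char c[4];
    } bint = {0x01020304};

    /* On a big-endian machine the most significant byte (0x01) is stored first. */
    return bint.c[0] == 1;
}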

The principle is equivalent to the type cast suggested by others, but this is clearer - and according to C99, it is guaranteed to be correct. gcc prefers this to the direct pointer cast.

This is also much better than fixing the endianness at compile time - for operating systems that support multiple architectures (fat binaries on Mac OS X, for example), this will work for both ppc/i386, whereas it is very easy to mess things up otherwise.
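
The C99 guarantee mentioned above does not carry over to C++, where reading a union member other than the one last written is not defined by the standard (GCC documents it as an extension). A commonly used alternative is to copy the bytes out with std::memcpy. What follows is only a minimal sketch under that assumption; the name is_big_endian_memcpy is made up for illustration and is not part of the answer.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not from the original answer): returns true when the
// most significant byte of a 32-bit integer is stored at the lowest address.
bool is_big_endian_memcpy() {
    const std::uint32_t value = 0x01020304u;
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value);  // plain byte copy, no type punning
    return bytes[0] == 0x01;
}
```

Like the union version, this inspects the bytes of a known value, so a compiler is free to fold the result to a constant for the target it is building for.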

Answer by Eric Petroelje

You can do it by setting an int and masking off bits, but probably the easiest way is just to use the built-in network byte conversion ops (since network byte order is always big endian).

if ( htonl(47) == 47 ) {
  // Big endian
} else {
  // Little endian.
}

Bit fiddling could be faster, but this way is simple, straightforward and nearly impossible to mess up.

Answer by Andrew Hare

Please see this article:

Here is some code to determine the endianness of your machine:

int num = 1;
if(*(char *)&num == 1)
{
    printf("\nLittle-Endian\n");
}
else
{
    printf("Big-Endian\n");
}

Answer by Lyberta

You can use std::endian if you have access to a C++20 compiler such as GCC 8+ or Clang 7+.

Note: std::endian began in <type_traits> but was moved to <bit> at the 2019 Cologne meeting. GCC 8, Clang 7, 8 and 9 have it in <type_traits> while GCC 9+ and Clang 10+ have it in <bit>.

#include <bit>

if constexpr (std::endian::native == std::endian::big)
{
    // Big endian system
}
else if constexpr (std::endian::native == std::endian::little)
{
    // Little endian system
}
else
{
    // Something else
}
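
The snippet above has to live inside a function body. A self-contained sketch of how that might look is given below; it uses the standard feature-test macro __cpp_lib_endian to decide whether <bit> provides std::endian, and otherwise falls back to a run-time byte inspection. The function name native_byte_order and its 'L' / 'B' / '?' return codes are assumptions made for this example only.

```cpp
#include <cstdint>
#include <cstring>

#if defined(__has_include)
#  if __has_include(<version>)
#    include <version>  // defines __cpp_lib_endian when std::endian is available
#  endif
#endif
#if defined(__cpp_lib_endian)
#  include <bit>
#endif

// Hypothetical helper: 'L' for little-endian, 'B' for big-endian, '?' otherwise.
char native_byte_order() {
#if defined(__cpp_lib_endian)
    if constexpr (std::endian::native == std::endian::little) {
        return 'L';
    } else if constexpr (std::endian::native == std::endian::big) {
        return 'B';
    } else {
        return '?';  // mixed-endian platform
    }
#else
    // Fallback for toolchains without std::endian: look at the bytes at run time.
    const std::uint32_t value = 0x01020304u;
    unsigned char bytes[sizeof value];
    std::memcpy(bytes, &value, sizeof value);
    if (bytes[0] == 0x04 && bytes[3] == 0x01) return 'L';
    if (bytes[0] == 0x01 && bytes[3] == 0x04) return 'B';
    return '?';
#endif
}
```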

Answer by bill

This is normally done at compile time (especially for performance reasons) by using the header files available from the compiler or by creating your own. On Linux you have the header file "/usr/include/endian.h".

Answer by Coriiander

Ehm... It surprises me that no one has realized that the compiler will simply optimize the test out and put in a fixed result as the return value. This renders all the code examples above effectively useless. The only thing that would be returned is the endianness at compile time! And yes, I tested all of the above examples. Here's an example with MSVC 9.0 (Visual Studio 2008).

Pure C code

int32 DNA_GetEndianness(void)
{
    union 
    {
        uint8  c[4];
        uint32 i;
    } u;

    u.i = 0x01020304;

    if (0x04 == u.c[0])
        return DNA_ENDIAN_LITTLE;
    else if (0x01 == u.c[0])
        return DNA_ENDIAN_BIG;
    else
        return DNA_ENDIAN_UNKNOWN;
}

Disassembly

PUBLIC  _DNA_GetEndianness
; Function compile flags: /Ogtpy
; File c:\development\dna\source\libraries\dna\endian.c
;   COMDAT _DNA_GetEndianness
_TEXT   SEGMENT
_DNA_GetEndianness PROC                 ; COMDAT

; 11   :     union 
; 12   :     {
; 13   :         uint8  c[4];
; 14   :         uint32 i;
; 15   :     } u;
; 16   : 
; 17   :     u.i = 1;
; 18   : 
; 19   :     if (1 == u.c[0])
; 20   :         return DNA_ENDIAN_LITTLE;

    mov eax, 1

; 21   :     else if (1 == u.c[3])
; 22   :         return DNA_ENDIAN_BIG;
; 23   :     else
; 24   :        return DNA_ENDIAN_UNKNOWN;
; 25   : }

    ret
_DNA_GetEndianness ENDP
END

Perhaps it is possible to turn off ANY compile-time optimization for just this function, but I don't know. Otherwise it may be possible to hardcode it in assembly, although that's not portable. And even then, that might get optimized out. It makes me think I need some really crappy assembler, implement the same code for all existing CPUs/instruction sets, and well.... never mind.

Also, someone here said that endianness does not change during run-time. WRONG. There are bi-endian machines out there. Their endianness can vary during execution. ALSO, there's not only Little Endian and Big Endian, but also other endiannesses (what a word).

I hate and love coding at the same time...

Answer by DaveR

I'm surprised no one has mentioned the macros which the pre-processor defines by default. While these will vary depending on your platform, they are much cleaner than having to write your own endian check.

For example, if we look at the built-in macros which GCC defines (on an x86-64 machine):

:| gcc -dM -E -x c - |grep -i endian
#define __LITTLE_ENDIAN__ 1

On a PPC machine I get:

:| gcc -dM -E -x c - |grep -i endian
#define __BIG_ENDIAN__ 1
#define _BIG_ENDIAN 1

(The :| gcc -dM -E -x c - magic prints out all built-in macros.)

Answer by sharptooth

Declare an int variable:

int variable = 0xFF;

Now use char* pointers to various parts of it and check what is in those parts.

char* startPart = reinterpret_cast<char*>( &variable );
char* endPart = reinterpret_cast<char*>( &variable ) + sizeof( int ) - 1;

Depending on which one points to the 0xFF byte, you can now detect the endianness. This requires sizeof( int ) > sizeof( char ), but it's definitely true for the discussed platforms.

Answer by none

For further details, you may want to check out this codeproject article Basic concepts on Endianness:

How to dynamically test for the Endian type at run time?

As explained in Computer Animation FAQ, you can use the following function to see if your code is running on a Little- or Big-Endian system:

#define BIG_ENDIAN      0
#define LITTLE_ENDIAN   1
int TestByteOrder()
{
   short int word = 0x0001;
   char *byte = (char *) &word;
   return(byte[0] ? LITTLE_ENDIAN : BIG_ENDIAN);
}

This code assigns the value 0x0001 to a 16-bit integer. A char pointer is then assigned to point at the first (lowest-addressed) byte of the integer value. If the first byte of the integer is 0x01, then the system is Little-Endian (the least-significant byte, 0x01, is at the lowest address). If it is 0x00, then the system is Big-Endian.

Answer by fuzzyTew

The C++ way has been to use boost, where preprocessor checks and casts are compartmentalized away inside very thoroughly-tested libraries.

The Predef Library (boost/predef.h) recognizes four different kinds of endianness.

The Endian Library was planned to be submitted to the C++ standard, and supports a wide variety of operations on endian-sensitive data.

As stated in the answers above, endianness support will be a part of C++20.
