C++: When is it worthwhile to use bit fields?

Notice: this page reproduces a popular StackOverflow question and its answers (originally published as a Chinese-English parallel translation), provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/4240974/

Time: 2020-08-28 14:55:37  Source: igfitidea

When is it worthwhile to use bit fields?

Tags: c++, c, bit-fields

Asked by Russel

Is it worthwhile using C's bit-field implementation? If so, when is it ever used?

I was looking through some emulator code and it looks like the registers for the chips are not being implemented using bit fields.

Is this something that is avoided for performance reasons (or some other reason)?

Are there still times when bit-fields are used (e.g. in firmware that runs on actual chips)?

Answered by Oliver Charlesworth

Bit-fields are typically only used when there's a need to map structure fields to specific bit slices, where some hardware will be interpreting the raw bits. An example might be assembling an IP packet header. I can't see a compelling reason for an emulator to model a register using bit-fields, as it's never going to touch real hardware!

Whilst bit-fields can lead to neat syntax, they're pretty platform-dependent, and therefore non-portable. A more portable, though more verbose, approach is direct bitwise manipulation using shifts and bit-masks.
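
As a rough illustration of the shift-and-mask approach, here is a minimal sketch. The register layout, the constant names and the get_field/set_field helpers are all invented for this example; they do not come from any real chip or library.

```cpp
#include <cstdint>

// Hypothetical 16-bit register layout, invented for illustration only:
//   bit 0      : enable flag
//   bits 1-5   : channel (0-31)
//   bits 8-15  : level   (0-255)
constexpr std::uint16_t ENABLE_MASK   = 0x0001;
constexpr unsigned      ENABLE_SHIFT  = 0;
constexpr std::uint16_t CHANNEL_MASK  = 0x003E;
constexpr unsigned      CHANNEL_SHIFT = 1;
constexpr std::uint16_t LEVEL_MASK    = 0xFF00;
constexpr unsigned      LEVEL_SHIFT   = 8;

// Extract the field selected by mask and move it down to bit 0.
constexpr std::uint16_t get_field(std::uint16_t reg, std::uint16_t mask,
                                  unsigned shift) {
    return static_cast<std::uint16_t>((reg & mask) >> shift);
}

// Return reg with the selected field replaced by value.
// Bits of value that do not fit in the field are dropped by the mask.
constexpr std::uint16_t set_field(std::uint16_t reg, std::uint16_t mask,
                                  unsigned shift, std::uint16_t value) {
    return static_cast<std::uint16_t>((reg & ~mask) | ((value << shift) & mask));
}
```

Because the bit positions are spelled out in the masks, the layout does not depend on how a particular compiler orders bit-fields, which is the portability point made above. The price is the extra verbosity at every access.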

If you use bit-fields for anything other than assembling (or disassembling) structures at some physical interface, performance may suffer. This is because every time you read or write from a bit-field, the compiler will have to generate code to do the masking and shifting, which will burn cycles.

Answered by caf

One use for bitfields which hasn't yet been mentioned is that unsigned bitfields provide arithmetic modulo a power-of-two "for free". For example, given:

struct { unsigned x:10; } foo;

arithmetic on foo.x will be performed modulo 2^10 = 1024.

(The same can be achieved directly by using bitwise & operations, of course - but sometimes it might lead to clearer code to have the compiler do it for you).
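
A small sketch of that wrap-around, next to the equivalent explicit mask. The Counter struct and both helper functions are names made up for this example.

```cpp
struct Counter {
    unsigned x : 10;  // can hold 0..1023
};

// Add delta to a 10-bit counter. The sum is computed at full width and
// reduced modulo 1024 when it is stored back into the bit-field.
inline unsigned add_wrapped(unsigned start, unsigned delta) {
    Counter c;
    c.x = start;   // also reduced modulo 1024 on store
    c.x += delta;
    return c.x;
}

// The same arithmetic written with an explicit bitwise AND.
inline unsigned add_masked(unsigned start, unsigned delta) {
    return (start + delta) & 0x3FFu;  // 0x3FF == 1023
}
```

Under these definitions add_wrapped(1023, 1) gives 0 and add_wrapped(1000, 100) gives 76, matching add_masked for the same arguments.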

Answered by Tony Delroy

FWIW, and looking only at the relative performance question - a bodgy benchmark:

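#include <time.h>
#include <cstdlib>   // exit, EXIT_FAILURE
#include <iostream>
#include <vector>    // only needed by the vector-based variant of f() further down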

struct A
{
    void a(unsigned n) { a_ = n; }
    void b(unsigned n) { b_ = n; }
    void c(unsigned n) { c_ = n; }
    void d(unsigned n) { d_ = n; }
    unsigned a() { return a_; }
    unsigned b() { return b_; }
    unsigned c() { return c_; }
    unsigned d() { return d_; }
    volatile unsigned a_:1,
                      b_:5,
                      c_:2,
                      d_:8;
};

struct B
{
    void a(unsigned n) { a_ = n; }
    void b(unsigned n) { b_ = n; }
    void c(unsigned n) { c_ = n; }
    void d(unsigned n) { d_ = n; }
    unsigned a() { return a_; }
    unsigned b() { return b_; }
    unsigned c() { return c_; }
    unsigned d() { return d_; }
    volatile unsigned a_, b_, c_, d_;
};

struct C
{
    void a(unsigned n) { x_ &= ~0x01; x_ |= n; }
    void b(unsigned n) { x_ &= ~0x3E; x_ |= n << 1; }
    void c(unsigned n) { x_ &= ~0xC0; x_ |= n << 6; }
    void d(unsigned n) { x_ &= ~0xFF00; x_ |= n << 8; }
    unsigned a() const { return x_ & 0x01; }
    unsigned b() const { return (x_ & 0x3E) >> 1; }
    unsigned c() const { return (x_ & 0xC0) >> 6; }
    unsigned d() const { return (x_ & 0xFF00) >> 8; }
    volatile unsigned x_;
};

struct Timer
{
    Timer() { get(&start_tp); }
    double elapsed() const {
        struct timespec end_tp;
        get(&end_tp);
        return (end_tp.tv_sec - start_tp.tv_sec) +
               (1E-9 * end_tp.tv_nsec - 1E-9 * start_tp.tv_nsec);
    }
  private:
    static void get(struct timespec* p_tp) {
        if (clock_gettime(CLOCK_REALTIME, p_tp) != 0)
        {
            std::cerr << "clock_gettime() error\n";
            exit(EXIT_FAILURE);
        }
    }
    struct timespec start_tp;
};

template <typename T>
unsigned f()
{
    int n = 0;
    Timer timer;
    T t;
    for (int i = 0; i < 10000000; ++i)
    {
        t.a(i & 0x01);
        t.b(i & 0x1F);
        t.c(i & 0x03);
        t.d(i & 0xFF);
        n += t.a() + t.b() + t.c() + t.d();
    }
    std::cout << timer.elapsed() << '\n';
    return n;
}

int main()
{
    std::cout << "bitfields: " << f<A>() << '\n';
    std::cout << "separate ints: " << f<B>() << '\n';
    std::cout << "explicit and/or/shift: " << f<C>() << '\n';
}

Output on my test machine (numbers vary by ~20% run to run):

bitfields: 0.140586
1449991808
separate ints: 0.039374
1449991808
explicit and/or/shift: 0.252723
1449991808

This suggests that with g++ -O3 on a pretty recent Athlon, bitfields are a few times slower than separate ints, and this particular and/or/bitshift implementation is about twice as slow again. The real gaps are probably larger than the figures show: the volatile qualifiers above emphasise other operations such as memory reads and writes, and there's loop overhead etc, so the differences are understated in the results.

If you're dealing in hundreds of megabytes of structs that can be mainly bitfields or mainly distinct ints, the caching issues may become dominant - so benchmark in your system.



UPDATE: user2188211 attempted an edit which was rejected but usefully illustrated how bitfields become faster as the amount of data increases: "when iterating over a vector of a few million elements in [a modified version of] the above code, such that the variables do not reside in cache or registers, the bitfield code may be the fastest."

template <typename T>
unsigned f()
{
    int n = 0;
    Timer timer;
    std::vector<T> ts(1024 * 1024 * 16);
    for (size_t i = 0, idx = 0; i < 10000000; ++i)
    {
        T& t = ts[idx];
        t.a(i & 0x01);
        t.b(i & 0x1F);
        t.c(i & 0x03);
        t.d(i & 0xFF);
        n += t.a() + t.b() + t.c() + t.d();
        idx++;
        if (idx >= ts.size()) {
            idx = 0;
        }
    }
    std::cout << timer.elapsed() << '\n';
    return n;
}

Results from an example run (g++ -O3, Core2Duo):

 0.19016
 bitfields: 1449991808
 0.342756
 separate ints: 1449991808
 0.215243
 explicit and/or/shift: 1449991808


Of course, timing's all relative and which way you implement these fields may not matter at all in the context of your system.

Answered by uesp

I've seen/used bit fields in two situations: computer games and hardware interfaces. The hardware use is pretty straightforward: the hardware expects data in a certain bit format, which you can either define manually or get from pre-defined library structures. Whether those structures use bit fields or plain bit manipulation depends on the specific library.

In the "old days" computer games frequently used bit fields to make the most of computer/disk memory. For example, for an NPC definition in an RPG you might find (made-up example):

struct charinfo_t
{
     unsigned int Strength : 7;  // 0-100
     unsigned int Agility : 7;  
     unsigned int Endurance: 7;  
     unsigned int Speed : 7;  
     unsigned int Charisma : 7;  
     unsigned int HitPoints : 10;    //0-1000
     unsigned int MaxHitPoints : 10;  
     //etc...
};

You don't see it so much in more modern games/software, as the space savings have become proportionally less significant as computers get more memory. Saving 1MB of memory when your computer only has 16MB is a big deal, but not so much when you have 4GB.
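
To get a feel for the saving, here is a sketch that compares the made-up struct above with a plain version using one unsigned int per field. PackedStats, PlainStats and bytes_saved are names invented for this example, and the exact sizes are implementation-defined, so treat any concrete numbers as typical rather than guaranteed.

```cpp
#include <cstddef>

// The bit-field layout from the example above: 55 bits of payload.
struct PackedStats {
    unsigned int Strength     : 7;   // 0-100
    unsigned int Agility      : 7;
    unsigned int Endurance    : 7;
    unsigned int Speed        : 7;
    unsigned int Charisma     : 7;
    unsigned int HitPoints    : 10;  // 0-1000
    unsigned int MaxHitPoints : 10;
};

// The same fields, each stored in a full unsigned int.
struct PlainStats {
    unsigned int Strength, Agility, Endurance, Speed, Charisma;
    unsigned int HitPoints, MaxHitPoints;
};

// Bytes saved when storing count characters in the packed form.
constexpr std::size_t bytes_saved(std::size_t count) {
    return count * (sizeof(PlainStats) - sizeof(PackedStats));
}
```

On a typical compiler with a 32-bit unsigned int, PackedStats comes out at 8 bytes and PlainStats at 28, so a table of 10,000 NPCs would shrink by roughly 200,000 bytes. That mattered on a 16MB machine and barely registers today.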

Answered by AnT

The primary purpose of bit-fields is to provide a way to save memory in massively instantiated aggregate data structures by achieving tighter packing of data.

The whole idea is to take advantage of situations where you have several fields in some struct type which don't need the entire width (and range) of some standard data type. This gives you the opportunity to pack several such fields into one allocation unit, thus reducing the overall size of the struct type. An extreme example would be boolean fields, which can be represented by individual bits (with, say, 32 of them being packable into a single unsigned int allocation unit).

Obviously, this only makes sense in situations where the pros of the reduced memory consumption outweigh the cons of slower access to values stored in bit-fields. However, such situations arise quite often, which makes bit-fields an absolutely indispensable language feature. This should answer your question about the modern use of bit-fields: not only are they used, they are essentially mandatory in any practically meaningful code oriented on processing large amounts of homogeneous data (like large graphs, for one example), because their memory-saving benefits greatly outweigh any individual-access performance penalties.

In a way, bit-fields in their purpose are very similar to such things as "small" arithmetic types: signed/unsigned char, short, float. In the actual data-processing code one would not normally use any types smaller than int or double (with few exceptions). Arithmetic types like signed/unsigned char, short, float exist just to serve as "storage" types: as memory-saving compact members of struct types in situations where their range (or precision) is known to be sufficient. Bit-fields are just another step in the same direction, one that trades a bit more performance for much greater memory-saving benefits.

So, that gives us a rather clear set of conditions under which it is worthwhile to employ bit-fields:

  1. Struct type contains multiple fields that can be packed into a smaller number of bits.
  2. The program instantiates a large number of objects of that struct type.

If the conditions are met, you declare all bit-packable fields contiguously (typically at the end of the struct type), assign them their appropriate bit-widths (and, usually, take some steps to ensure that the bit-widths are appropriate). In most cases it makes sense to play around with ordering of these fields to achieve the best packing and/or performance.
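
One way to read that recipe, as a sketch only: the Edge record, its field names and widths, and the set_weight helper are all hypothetical, chosen to show bit-fields grouped at the end of a struct plus a range check that keeps a value from being silently truncated.

```cpp
// Width and maximum of the hypothetical weight field, kept in one place
// so that the declaration and the range check cannot drift apart.
constexpr unsigned kWeightBits = 10;
constexpr unsigned kMaxWeight  = (1u << kWeightBits) - 1;  // 1023

// Hypothetical graph-edge record: the full-width member comes first,
// the bit-packable members are declared contiguously at the end.
struct Edge {
    unsigned target;                 // index of the destination node
    unsigned weight  : kWeightBits;  // 0..1023
    unsigned color   : 3;            // 0..7
    unsigned visited : 1;            // flag
};

// Store w only if it fits in the field; report failure instead of
// letting the assignment drop the high bits.
inline bool set_weight(Edge& e, unsigned w) {
    if (w > kMaxWeight) {
        return false;
    }
    e.weight = w;
    return true;
}
```

With these definitions, set_weight accepts 1023 and rejects 1024, leaving the previously stored weight untouched in the second case.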



There's also a weird secondary use of bit-fields: using them for mapping bit groups in various externally-specified representations, like hardware registers, floating-point formats, file formats etc. This has never been intended as a proper use of bit-fields, even though for some unexplained reason this kind of bit-field abuse continues to pop up in real-life code. Just don't do this.

Answered by nate c

Bit fields were used in the olden days to save program memory.

They degrade performance because registers cannot work with them directly, so they have to be converted to integers before you can do anything with them. They tend to lead to more complex code that is non-portable and harder to understand (since values have to be masked and unmasked all the time to actually use them).

Check out the source for http://www.nethack.org/ to see pre-ANSI C in all its bitfield glory!

Answered by zwol

In modern code, there's really only one reason to use bitfields: to control the space requirements of a bool or an enum type, within a struct/class. For instance (C++):

enum token_code { TK_a, TK_b, TK_c, ... /* less than 255 codes */ };
struct token {
    token_code code      : 8;
    bool number_unsigned : 1;
    bool is_keyword      : 1;
    /* etc */
};

IMO there's basically no reason not to use :1 bitfields for bool, as modern compilers will generate very efficient code for it. In C, though, make sure your bool typedef is either the C99 _Bool or, failing that, an unsigned int, because a signed 1-bit field can hold only the values 0 and -1 (unless you somehow have a non-twos-complement machine).
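
The signed 1-bit pitfall is easy to see in a sketch like the following. The three structs and the store_in_* helpers are invented for this example. Before C++20 the result of storing 1 into the signed field is implementation-defined, so the -1 mentioned below is what GCC and Clang produce on two's-complement targets, not a guarantee for every compiler.

```cpp
struct SignedFlag   { signed int   flag : 1; };  // representable values: -1 and 0
struct UnsignedFlag { unsigned int flag : 1; };  // representable values: 0 and 1
struct BoolFlag     { bool         flag : 1; };  // false and true

// Store v in a signed 1-bit field and read it back.
inline int store_in_signed(int v) {
    SignedFlag s;
    s.flag = v;   // 1 does not fit; it typically comes back as -1
    return s.flag;
}

// Store v in an unsigned 1-bit field and read it back.
inline int store_in_unsigned(int v) {
    UnsignedFlag u;
    u.flag = v;   // reduced modulo 2
    return u.flag;
}

// Store v in a bool 1-bit field and read it back as an int.
inline int store_in_bool(bool v) {
    BoolFlag b;
    b.flag = v;
    return b.flag;
}
```

So a test such as `if (s.flag == 1)` can never be true for the signed version, while the unsigned and bool versions behave the way a flag is expected to.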

With enumeration types, always use a size that corresponds to the size of one of the primitive integer types (8/16/32/64 bits, on normal CPUs) to avoid inefficient code generation (repeated read-modify-write cycles, usually).

Using bitfields to line up a structure with some externally-defined data format (packet headers, memory-mapped I/O registers) is commonly suggested, but I actually consider it a bad practice, because C doesn't give you enough control over endianness, padding, and (for I/O regs) exactly what assembly sequences get emitted. Have a look at Ada's representation clauses sometime if you want to see how much C is missing in this area.
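
For comparison, here is a sketch of the alternative for an externally-defined format: decoding the first few bytes of an IPv4 header with explicit shifts and masks instead of a bit-field overlay. The Ipv4Summary struct and parse_ipv4_prefix function are names made up for this example; the field positions follow the IPv4 header layout (version in the high nibble of byte 0, header length in the low nibble, total length in bytes 2-3 in network byte order).

```cpp
#include <cstdint>

// The few header fields this sketch decodes.
struct Ipv4Summary {
    unsigned version;       // 4 for IPv4
    unsigned header_bytes;  // header length in bytes
    unsigned total_length;  // whole datagram length in bytes
};

// Decode from raw bytes. buf must point to at least 4 readable bytes.
// Nothing here depends on the host's endianness or bit-field ordering.
inline Ipv4Summary parse_ipv4_prefix(const std::uint8_t* buf) {
    Ipv4Summary s;
    s.version      = buf[0] >> 4;             // high nibble of byte 0
    s.header_bytes = (buf[0] & 0x0Fu) * 4u;   // IHL counts 32-bit words
    s.total_length = (static_cast<unsigned>(buf[2]) << 8) | buf[3];  // big-endian
    return s;
}
```

For a header starting with the bytes 0x45 0x00 0x00 0x54 this yields version 4, a 20-byte header and a total length of 84, on any host.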

Answered by sizzzzlerz

One use for bit fields used to be to mirror hardware registers when writing embedded code. However, since the bit order is platform-dependent, they don't work if the hardware orders its bits differently from the processor. That said, I can't think of a use for bit fields any more. You're better off implementing a bit manipulation library that can be ported across platforms.

Answered by EvilTeach

In the 70s I used bit fields to control hardware on a TRS-80. The display/keyboard/cassette/disks were all memory-mapped devices. Individual bits controlled various things.

  1. One bit selected 32-column vs 64-column display.
  2. Bit 0 in that same memory cell was the cassette serial data in/out.

As I recall, the disk drive control had a number of them. There were 4 bytes in total. I think there was a 2-bit drive select. But it was a long time ago. It was kind of impressive back then in that there were at least two different C compilers for the platform.

The other observation is that bit fields really are platform specific. There is no expectation that a program with bit fields should port to another platform.

Answered by Steve Townsend

Boost.Thread uses bitfields in its shared_mutex, on Windows at least:

    struct state_data
    {
        unsigned shared_count:11,
        shared_waiting:11,
        exclusive:1,
        upgrade:1,
        exclusive_waiting:7,
        exclusive_waiting_blocked:1;
    };