C++中的位数组

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/3806469/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-28 13:47:54  来源:igfitidea点击:

Bit Array in C++

c++arraysbit

提问by Seth

When working with Project Euler problems I often need large (> 10**7) bit array's.

在处理 Project Euler 问题时,我经常需要大 (> 10**7) 位数组。

My normal approach is one of:

我的正常方法是以下之一:

bool* sieve = new bool[N];

bool sieve[N];

When N = 1,000,000 my program uses 1 MegaByte (8 * 1,000,000 bits).

当 N = 1,000,000 时,我的程序使用 1 兆字节(8 * 1,000,000 位)。

Is there a more efficient way to use store bit arrays than bool in c++?

有没有比 c++ 中的 bool 更有效的方法来使用存储位数组?

回答by Prasoon Saurav

Use std::bitset(if Nis a constant) otherwise use std::vector<bool>as others have mentioned (but dont forget reading this excellent articleby Herb Sutter)

使用std::bitset(如果N是常数)否则使用std::vector<bool>其他人提到的(但不要忘记阅读Herb Sutter 的这篇优秀文章

A bitset is a special container class that is designed to store bits (elements with only two possible values: 0 or 1, true or false, ...).

The class is very similar to a regular array, but optimizing for space allocation: each element occupies only one bit (which is eight times less than the smallest elemental type in C++: char).

bitset 是一个特殊的容器类,旨在存储位(只有两个可能值的元素:0 或 1,真或假,...)。

该类与常规数组非常相似,但针对空间分配进行了优化:每个元素仅占用一位(比 C++ 中最小的元素类型:char 少八倍)。

EDIT:

编辑

Herb Sutter (in that article) mentions that

Herb Sutter(在那篇文章中)提到

The reason std::vector< bool > is nonconforming is that it pulls tricks under the covers in an attempt to optimize for space: Instead of storing a full char or int for every bool[1] (taking up at least 8 times the space, on platforms with 8-bit chars), it packs the bools and stores them as individual bits(inside, say, chars) in its internal representation.

std::vector < bool > forcesa specific optimization on all users by enshrining it in the standard. That's not a good idea; different users have different requirements, and now all users of vector must pay the performance penalty even if they don't want or need the space savings.

std::vector< bool > 不符合标准的原因是它隐藏了一些技巧以尝试优化空间:而不是为每个 bool[1] 存储一个完整的 char 或 int(占用至少 8 倍的空间) ,在具有 8 位字符的平台上),它将 bool 打包并在其内部表示中将它们存储为单独的位(内部,例如字符)。

std::vector < bool >通过将其纳入标准来强制对所有用户进行特定优化。这不是一个好主意。不同的用户有不同的要求,现在所有的vector用户即使不想或不需要节省空间,也必须付出性能代价。

EDIT 2:

编辑 2

And if you have used Boost you can use boost::dynamic_bitset(if Nis known at runtime)

如果您使用过 Boost,则可以使用boost::dynamic_bitset(如果N在运行时已知)

回答by GManNickG

For better or for worse, std::vector<bool>will use bits instead of bool's, to save space. So just use std::vectorlike you should have been in the first place.

无论好坏,std::vector<bool>都将使用位而不是布尔值,以节省空间。因此,只需std::vector像您一开始就应该使用的那样使用即可。

If Nis a constant, you can use std::bitset.

如果N是常量,则可以使用std::bitset.

回答by Jerry Coffin

You could look up std::bitsetand std::vector<bool>. The latter is often recommended against, because despite the vectorin the name, it doesn't really act like a vector of any other kind of object, and in fact doesn't meet the requirements for a container in general. Nonetheless, it can be pretty useful.

你可以查找std::bitsetstd::vector<bool>。后者经常被反对,因为尽管vector在名称中,它实际上并不像任何其他类型对象的向量,并且实际上不满足一般容器的要求。尽管如此,它可能非常有用。

OTOH, nothing is going to (at least dependably) store 1 million bool values in less than 1 million bits. It simply can't be done with any certainty. If your bit sets contain a degree of redundancy, there are various compression schemes that might be effective (e.g., LZ*, Huffman, arithmetic) but without some knowledge of the contents, it's impossible to say they would be for certain. Either of these will, however, normally store each bool/bit in only one bit of storage (plus a littleoverhead for bookkeeping -- but that's usually a constant, and on the order of bytes to tens of bytes at most).

OTOH,没有什么可以(至少可靠地)在不到 100 万位中存储 100 万个布尔值。它根本无法确定。如果您的位集包含一定程度的冗余,则有多种可能有效的压缩方案(例如,LZ*、Huffman、算术),但如果不了解内容,就无法确定它们会是有效的。然而,这些中的任何一个通常只会将每个 bool/bit 存储在一个存储位中(加上一些簿记开销——但这通常是一个常数,最多为字节到数十字节)。

回答by whooops

A 'bool' type isn't stored using only 1 bit. From your comment about the size, it seems to use 1 entire byte for each bool.

'bool' 类型不是仅使用 1 位存储的。从您对大小的评论来看,每个布尔值似乎使用 1 个完整字节。

A C like way of doing this would be:

AC像这样做的方式是:

uint8_t sieve[N/8]; //array of N/8 bytes

and then logical OR bytes together to get all your bits:

然后将逻辑 OR 字节放在一起以获得所有位:

sieve[0] = 0x01 | 0x02; //this would turn on the first two bits

In that example, 0x01 and 0x02 are hexadecimal numbers that represent bytes.

在该示例中,0x01 和 0x02 是表示字节的十六进制数。

回答by Niki Yoshiuchi

Yes, you can use a bitset.

是的,您可以使用bitset

回答by chesslover

You might be interested in trying the BITSCANlibrary as an alternative. Recently an extension has been proposed for sparseness, which I am not sure is your case, but might be.

您可能有兴趣尝试使用BITSCAN库作为替代方案。最近有人提出了稀疏的扩展,我不确定你的情况,但可能是。

回答by Vladimir

A 'bool' type isn't stored using only 1 bit. From your comment about the size, it seems to use 1 entire byte for each bool.

'bool' 类型不是仅使用 1 位存储的。从您对大小的评论来看,每个布尔值似乎使用 1 个完整字节。

A C like way of doing this would be:

AC像这样做的方式是:

uint8_t sieve[N/8]; //array of N/8 bytes

element of array is:

数组元素为:

result = sieve[index / 8] || (1 << (index % 8)); 

or

或者

result = sieve[index >> 3] || (1 << (index & 7));

set 1 in array:

在数组中设置 1:

sieve[index >> 3] |= 1 << (index & 7);

回答by Dima

回答by driis

You can use a byte array and index into that. Index nwould be in byte index n/8, bit # n%8. (In case std::bitset is not available for some reason).

您可以使用字节数组并对其进行索引。索引n将在字节索引中n/8,位# n%8。(如果 std::bitset 由于某种原因不可用)。

回答by Eugen Constantin Dinca

If N is known at compile time, use std::bitset, otherwise use boost::dynamic_bitset.

如果 N 在编译时已知,则使用std::bitset,否则使用boost::dynamic_bitset