How to define and work with an array of bits in C?
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original source: http://stackoverflow.com/questions/2525310/
Asked by Eddy
I want to create a very large array on which I write '0's and '1's. I'm trying to simulate a physical process called random sequential adsorption, where units of length 2, dimers, are deposited onto an n-dimensional lattice at a random location, without overlapping each other. The process stops when there is no more room left on the lattice for depositing more dimers (lattice is jammed).
Initially I start with a lattice of zeroes, and the dimers are represented by a pair of '1's. As each dimer is deposited, the site on the left of the dimer is blocked, due to the fact that the dimers cannot overlap. So I simulate this process by depositing a triple of '1's on the lattice. I need to repeat the entire simulation a large number of times and then work out the average coverage %.
I've already done this using an array of chars for 1D and 2D lattices. At the moment I'm trying to make the code as efficient as possible, before working on the 3D problem and more complicated generalisations.
This is basically what the code looks like in 1D, simplified:
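#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define N 1000    /* lattice size; value chosen here for illustration */
#define RUNS 1000 /* number of repetitions of the simulation */

static char *array;
static double total_c;

static void rand_seq_ads(void);

int main(void)
{
    /* Define lattice */
    array = malloc(N * sizeof(char));
    if (array == NULL)
        return 1;
    total_c = 0;
    /* Carry out RSA multiple times */
    for (int i = 0; i < RUNS; i++)
        rand_seq_ads();
    /* Calculate average coverage efficiency at jamming */
    printf("coverage efficiency = %f\n", total_c / RUNS);
    free(array);
    return 0;
}

static void rand_seq_ads(void)
{
    /* Initialise array, initial conditions */
    memset(array, 0, N * sizeof(char));
    array[N - 1] = 1; /* a dimer cannot start on the last site */
    int available_sites = N - 1;
    int count = 0;
    /* While the lattice still has enough room... */
    while (available_sites != 0)
    {
        /* Generate random site location */
        int x = rand() % N;
        /* Deposit dimer (if site is available) */
        if (array[x] == 0)
        {
            array[x] = 1;
            available_sites -= 1;
            /* The right-hand site may already be marked as blocked */
            if (array[x + 1] == 0)
            {
                array[x + 1] = 1;
                available_sites -= 1;
            }
            count += 1;
            /* Mark site left of dimer as unavailable (if it's empty) */
            if (x > 0 && array[x - 1] == 0)
            {
                array[x - 1] = 1;
                available_sites -= 1;
            }
        }
    }
    /* Calculate coverage %, and add to total */
    total_c += (double)count / N;
}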
For the actual project I'm doing, it involves not just dimers but trimers, quadrimers, and all sorts of shapes and sizes (for 2D and 3D).
I was hoping to work with individual bits instead of bytes, but from what I've read, as far as I can tell you can only change 1 byte at a time. So do I need to do some complicated indexing, or is there a simpler way to do it?
Thanks for your answers
Answered by aniliitb10
If I am not too late, this page gives an awesome explanation with examples.
An array of int can be used to deal with an array of bits. Assuming the size of int to be 4 bytes, when we talk about an int, we are dealing with 32 bits. Say we have int A[10]: this means we are working on 10*4*8 = 320 bits. In the original figure, each element of the array has 4 big blocks, each of which represents a byte, and each of the smaller blocks represents a bit.


So, to set the kth bit in array A:
void SetBit( int A[], int k )
{
    int i = k/32; //gives the corresponding index in the array A
    int pos = k%32; //gives the corresponding bit position in A[i]
    unsigned int flag = 1; // flag = 0000.....00001
    flag = flag << pos; // flag = 0000...010...000 (shifted pos positions)
    A[i] = A[i] | flag; // Set the bit at position pos in A[i]
}
or in the shortened version
void SetBit( int A[], int k )
{
    A[k/32] |= 1u << (k%32); // Set bit k%32 of A[k/32]; 1u keeps the shift unsigned
}
Similarly, to clear the kth bit:
void ClearBit( int A[], int k )
{
    A[k/32] &= ~(1u << (k%32));
}
and to test whether the kth bit is set:
int TestBit( int A[], int k )
{
    return ( (A[k/32] & (1u << (k%32) )) != 0 ) ;
}
These manipulations can be written as macros too:
#define SetBit(A,k) ( (A)[(k)/32] |= (1u << ((k)%32)) )
#define ClearBit(A,k) ( (A)[(k)/32] &= ~(1u << ((k)%32)) )
#define TestBit(A,k) ( (A)[(k)/32] & (1u << ((k)%32)) )
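The helpers above hard-code a 32-bit int. A minimal sketch of the same idea follows, assuming unsigned int words and CHAR_BIT from limits.h so that the word width is derived rather than assumed. The names WORD_BITS, bits_alloc, bit_set, bit_clear and bit_test are illustrative and do not come from the answer.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Bits per word, derived instead of hard-coding 32 (assumes no padding bits). */
#define WORD_BITS (sizeof(unsigned int) * CHAR_BIT)

/* Allocate a zeroed bit array able to hold nbits bits; the caller frees it. */
static unsigned int *bits_alloc(size_t nbits)
{
    size_t nwords = (nbits + WORD_BITS - 1) / WORD_BITS; /* round up */
    return calloc(nwords ? nwords : 1, sizeof(unsigned int));
}

/* Set bit k to 1. */
static void bit_set(unsigned int *a, size_t k)
{
    assert(a != NULL);
    a[k / WORD_BITS] |= 1u << (k % WORD_BITS);
}

/* Clear bit k to 0. */
static void bit_clear(unsigned int *a, size_t k)
{
    assert(a != NULL);
    a[k / WORD_BITS] &= ~(1u << (k % WORD_BITS));
}

/* Return 1 if bit k is set, 0 otherwise. */
static int bit_test(const unsigned int *a, size_t k)
{
    assert(a != NULL);
    return (int)((a[k / WORD_BITS] >> (k % WORD_BITS)) & 1u);
}
```

Using an unsigned word type sidesteps the sign-bit problems that shifting a signed int into bit 31 can cause, which is the main reason for the change of type in this sketch.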
Answered by nategoose
typedef unsigned long bfield_t[ size_needed/sizeof(long) ];
// long because that's probably what your cpu is best at
// The size_needed (in bytes) should be evenly divisible by sizeof(long), or
// you could use (sizeof(long)-1+size_needed)/sizeof(long) to force it to round up
Now, each long in a bfield_t can hold sizeof(long)*8 bits.
You can calculate the index of the long that holds a needed bit by:
bindex = index / (8 * sizeof(long) );
and your bit number by
b = index % (8 * sizeof(long) );
You can then look up the long you need and then mask out the bit you need from it.
result = my_field[bindex] & (1UL<<b); // 1UL so the shift is done in unsigned long
or
result = 1 & (my_field[bindex]>>b); // if you prefer them to be in bit0
The first one may be faster on some CPUs, or may save you shifting back up if you need to perform operations between the same bit in multiple bit arrays. It also mirrors the setting and clearing of a bit in the field more closely than the second implementation.
set:
my_field[bindex] |= 1UL<<b;
clear:
my_field[bindex] &= ~(1UL<<b);
You should remember that you can use bitwise operations on the longs that hold the fields and that's the same as the operations on the individual bits.
You'll probably also want to look into the ffs, fls, ffc, and flc functions if available. On POSIX systems, ffs is declared in strings.h. It's there just for this purpose -- a string of bits.
Anyway, it is "find first set" and is essentially:
int ffs(int x) {
    unsigned int u = (unsigned int)x;
    int c = 1; // ffs counts bit positions from 1
    if (u == 0)
        return 0; // no bit is set
    while (!(u & 1)) {
        c++;
        u >>= 1;
    }
    return c;
}
This is a common operation for processors to have an instruction for, and your compiler will probably generate that instruction rather than calling a function like the one I wrote. x86 has an instruction for this, by the way. Oh, and ffsl and ffsll are the same function except that they take long and long long, respectively.
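For the lattice problem in the question, the useful variant is probably "find first clear", i.e. the first free site. The following is a rough, portable sketch that does not rely on ffs or ffsl being present. The names LONG_BITS, WORDS_FOR and first_clear_bit are made up for this illustration, and the inner loop is the part a hardware-backed ffsl could replace where it is available.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define LONG_BITS (sizeof(unsigned long) * CHAR_BIT)
/* Number of unsigned longs needed for nbits bits, rounded up. */
#define WORDS_FOR(nbits) (((nbits) + LONG_BITS - 1) / LONG_BITS)

/* Index of the first 0 bit among the first nbits bits of field,
   or nbits if every one of those bits is set. */
static size_t first_clear_bit(const unsigned long *field, size_t nbits)
{
    assert(field != NULL);
    for (size_t w = 0; w < WORDS_FOR(nbits); w++) {
        unsigned long inverted = ~field[w]; /* free sites become 1 bits */
        if (inverted == 0)
            continue; /* whole word is occupied: skip it in one step */
        size_t b = 0;
        while (!(inverted & 1UL)) { /* scan for the lowest 1 bit */
            inverted >>= 1;
            b++;
        }
        size_t index = w * LONG_BITS + b;
        /* A hit beyond nbits lies in the unused tail of the last word. */
        return index < nbits ? index : nbits;
    }
    return nbits;
}
```

Skipping fully occupied words with a single comparison is what makes the packed layout attractive near jamming, when most of the lattice is already filled.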
Answered by David
You can use | (bitwise or), & (bitwise and) and << (left shift).
For example, (1 << 3) results in "00001000" in binary. So your code could look like:
unsigned char eightBits = 0;
//Set the 5th and 6th bits from the right to 1
eightBits |= (1 << 4);
eightBits |= (1 << 5);
//eightBits now looks like "00110000".
Then just scale it up with an array of chars and figure out the appropriate byte to modify first.
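One way to "scale it up" might look like the sketch below, which packs 8 lattice sites into each unsigned char (assuming 8-bit bytes) and picks the byte with i >> 3 and the bit with i & 7. The helper names site_get, site_set and deposit_dimer are hypothetical. Unlike the blocking trick in the question, this version simply checks both sites of the dimer before placing it.

```c
#include <assert.h>
#include <stddef.h>

/* Read site i of a lattice packed 8 sites per byte. */
static int site_get(const unsigned char *lattice, size_t i)
{
    assert(lattice != NULL);
    return (lattice[i >> 3] >> (i & 7)) & 1;
}

/* Mark site i as occupied. */
static void site_set(unsigned char *lattice, size_t i)
{
    assert(lattice != NULL);
    lattice[i >> 3] |= (unsigned char)(1u << (i & 7));
}

/* Try to place a dimer on sites x and x+1 (x+1 must lie inside the lattice).
   Returns 1 on success, 0 if either site is already occupied. */
static int deposit_dimer(unsigned char *lattice, size_t x)
{
    if (site_get(lattice, x) || site_get(lattice, x + 1))
        return 0;
    site_set(lattice, x);
    site_set(lattice, x + 1);
    return 1;
}
```

Because each site is addressed independently, a dimer that straddles a byte boundary (for example sites 7 and 8) needs no special handling.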
Alternatively, you could define a list of bit masks in advance and put them in an array:
#define BIT8 0x01
#define BIT7 0x02
#define BIT6 0x04
#define BIT5 0x08
#define BIT4 0x10
#define BIT3 0x20
#define BIT2 0x40
#define BIT1 0x80
unsigned char bits[8] = {BIT1, BIT2, BIT3, BIT4, BIT5, BIT6, BIT7, BIT8}; // unsigned so that 0x80 fits
Then you replace the bit shifting with a table lookup and you can index your bits, turning the previous code into:
eightBits |= (bits[2] | bits[3]); // BIT3 | BIT4 = 0x20 | 0x10
Alternatively, if you can use C++, you could just use a std::vector<bool>, which is typically stored as a packed vector of bits, complete with direct indexing.
Answered by 18446744073709551615
bitarray.h:
#include <inttypes.h> // includes <stdint.h>, which defines uint32_t
//typedef unsigned int bitarray_t; // if you know that int is 32 bits
typedef uint32_t bitarray_t;
#define RESERVE_BITS(n) (((n)+0x1f)>>5)
#define DW_INDEX(x) ((x)>>5)
#define BIT_INDEX(x) ((x)&0x1f)
#define getbit(array,index) (((array)[DW_INDEX(index)]>>BIT_INDEX(index))&1)
// (bitarray_t)1 keeps the shift unsigned, so bit 31 is handled without overflow
#define putbit(array, index, bit) \
    ((bit)&1 ? ((array)[DW_INDEX(index)] |= (bitarray_t)1<<BIT_INDEX(index)) \
             : ((array)[DW_INDEX(index)] &= ~((bitarray_t)1<<BIT_INDEX(index))) \
             , 0 \
    )
Use:
bitarray_t arr[RESERVE_BITS(130)] = {0, 0x12345678,0xabcdef0,0xffff0000,0};
int i = getbit(arr,5);
putbit(arr,6,1);
int x=2; // the least significant bit is 0
putbit(arr,6,x); // sets bit 6 to 0 because 2&1 is 0
putbit(arr,6,!!x); // sets bit 6 to 1 because !!2 is 1
EDIT: the docs:
"dword" = "double word" = 32-bit value (unsigned, but that's not really important)
RESERVE_BITS: number_of_bits --> number_of_dwords
RESERVE_BITS(n) is the number of 32-bit integers enough to store n bits
DW_INDEX: bit_index_in_array --> dword_index_in_array
DW_INDEX(i) is the index of dword where the i-th bit is stored.
Both bit and dword indexes start from 0.
BIT_INDEX: bit_index_in_array --> bit_index_in_dword
If i is the number of some bit in the array, BIT_INDEX(i) is the number
of that bit in the dword where the bit is stored.
And the dword is known via DW_INDEX().
getbit: bit_array, bit_index_in_array --> bit_value
putbit: bit_array, bit_index_in_array, bit_value --> 0
getbit(array,i) fetches the dword containing the bit i and shifts the dword right, so that the bit i becomes the least significant bit. Then, a bitwise and with 1 clears all other bits.
putbit(array, i, v) first of all checks the least significant bit of v; if it is 0, we have to clear the bit, and if it is 1, we have to set it.
To set the bit, we do a bitwise or of the dword that contains the bit and the value of 1 shifted left by bit_index_in_dword: that bit is set, and other bits do not change.
To clear the bit, we do a bitwise and of the dword that contains the bit and the bitwise complement of 1 shifted left by bit_index_in_dword: that value has all bits set to one except the only zero bit in the position that we want to clear.
The macro ends with , 0 because otherwise it would return the value of the dword where the bit i is stored, and that value is not meaningful. One could also use ((void)0).
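One caveat with macros of this shape is that the index argument is expanded more than once, so a call such as getbit(arr, i++) would increment i twice. A possible alternative, sketched here with the hypothetical names bit_get and bit_put and assuming a C99 compiler, is to use static inline functions, which evaluate each argument exactly once:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Function counterpart of getbit: returns the value (0 or 1) of bit index. */
static inline int bit_get(const uint32_t *array, size_t index)
{
    assert(array != NULL);
    return (int)((array[index >> 5] >> (index & 0x1f)) & 1u);
}

/* Function counterpart of putbit: stores the least significant bit of 'bit'. */
static inline void bit_put(uint32_t *array, size_t index, int bit)
{
    assert(array != NULL);
    uint32_t mask = (uint32_t)1 << (index & 0x1f);
    if (bit & 1)
        array[index >> 5] |= mask;  /* set the bit, leave the others alone */
    else
        array[index >> 5] &= ~mask; /* clear the bit, leave the others alone */
}
```

With optimisation enabled, a compiler will usually inline these calls, so the generated code tends to be comparable to the macro version.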
Answered by Paul R
It's a trade-off:
(1) use 1 byte for each 2 bit value - simple, fast, but uses 4x memory
(2) pack bits into bytes - more complex, some performance overhead, uses minimum memory
If you have enough memory available then go for (1), otherwise consider (2).
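To put rough numbers on the trade-off for the one-bit sites described in the question, the memory footprint of both layouts can be estimated as below. This is only a back-of-the-envelope sketch, and the function names bytes_unpacked and bytes_packed are invented for the example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Bytes needed when every site gets its own char (option 1). */
static size_t bytes_unpacked(size_t nsites)
{
    return nsites * sizeof(char);
}

/* Bytes needed when sites are packed CHAR_BIT per byte, rounded up (option 2). */
static size_t bytes_packed(size_t nsites)
{
    assert(nsites <= (size_t)-1 - (CHAR_BIT - 1)); /* guard the round-up against overflow */
    return (nsites + CHAR_BIT - 1) / CHAR_BIT;
}
```

For a 1000 x 1000 x 1000 lattice of one-bit sites on a platform with 8-bit bytes, this works out to 1,000,000,000 bytes unpacked against 125,000,000 bytes packed.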

