C语言 计算快速对数基数 2 上限
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/3272424/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Compute fast log base 2 ceiling
提问by kevinlawler
What is a fast way to compute the (long int) ceiling(log_2(i)), where the input and output are 64-bit integers? Solutions for signed or unsigned integers are acceptable. I suspect the best way will be a bit-twiddling method similar to those found here, but rather than attempt my own I would like to use something that is already well tested. A general solution will work for all positive values.
什么是计算 的快速方法(long int) ceiling(log_2(i)),其中输入和输出是 64 位整数?有符号或无符号整数的解是可以接受的。我怀疑最好的方法是使用类似于此处找到的方法的有点麻烦的方法,但与其尝试我自己的方法,不如使用已经经过充分测试的方法。通用解决方案适用于所有正值。
For instance, the values for 2,3,4,5,6,7,8 are 1,2,2,3,3,3,3
例如,2,3,4,5,6,7,8 的值为 1,2,2,3,3,3,3
Edit: So far the best route seems to be to compute the integer/floor log base 2 (the position of the MSB) using any number of fast existing bithacks or register methods, and then to add one if the input is not a power of two. The fast bitwise check for powers of two is (n&(n-1)).
编辑:到目前为止,最好的方法似乎是使用任意数量的快速现有 bithacks 或寄存器方法计算整数/地板对数基数 2(MSB 的位置),然后如果输入不是二。对 2 的幂的快速按位检查是(n&(n-1))。
Edit 2: A good source on integer logarithms and leading zeroes methods is Sections 5-3 and 11-4 in Hacker's Delightby Henry S. Warren. This is the most complete treatment I've found.
编辑 2:关于整数对数和前导零方法的一个很好的来源是Henry S. Warren在Hacker's Delight 中的第 5-3 节和第 11-4 节。这是我找到的最完整的治疗方法。
Edit 3: This technique looks promising: https://stackoverflow.com/a/51351885/365478
编辑 3:这种技术看起来很有希望:https: //stackoverflow.com/a/51351885/365478
采纳答案by dgobbi
This algorithm has already been posted, but the following implementation is very compact and should optimize into branch-free code.
这个算法已经贴出来了,但是下面的实现非常紧凑,应该优化成无分支代码。
int ceil_log2(unsigned long long x)
{
static const unsigned long long t[6] = {
0xFFFFFFFF00000000ull,
0x00000000FFFF0000ull,
0x000000000000FF00ull,
0x00000000000000F0ull,
0x000000000000000Cull,
0x0000000000000002ull
};
int y = (((x & (x - 1)) == 0) ? 0 : 1);
int j = 32;
int i;
for (i = 0; i < 6; i++) {
int k = (((x & t[i]) == 0) ? 0 : j);
y += k;
x >>= k;
j >>= 1;
}
return y;
}
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char *argv[])
{
printf("%d\n", ceil_log2(atol(argv[1])));
return 0;
}
回答by ergosys
If you can limit yourself to gcc, there are a set of builtin functions which return the number of leading zero bits and can be used to do what you want with a little work:
如果您可以限制自己使用 gcc,那么有一组内置函数可以返回前导零位的数量,并且可以通过一些工作来完成您想要的操作:
int __builtin_clz (unsigned int x)
int __builtin_clzl (unsigned long)
int __builtin_clzll (unsigned long long)
回答by Tom Sirgedas
If you're compiling for 64-bit processors on Windows, I think this should work. _BitScanReverse64 is an intrinsic function.
如果您在 Windows 上为 64 位处理器编译,我认为这应该可行。_BitScanReverse64 是一个内在函数。
#include <intrin.h>
__int64 log2ceil( __int64 x )
{
unsigned long index;
if ( !_BitScanReverse64( &index, x ) )
return -1LL; //dummy return value for x==0
// add 1 if x is NOT a power of 2 (to do the ceil)
return index + (x&(x-1)?1:0);
}
For 32-bit, you can emulate _BitScanReverse64, with 1 or 2 calls to _BitScanReverse. Check the upper 32-bits of x, ((long*)&x)[1], then the lower 32-bits if needed, ((long*)&x)[0].
对于 32 位,您可以模拟 _BitScanReverse64,调用 1 或 2 次 _BitScanReverse。检查 x 的高 32 位,((long*)&x)[1],然后根据需要检查低 32 位,((long*)&x)[0]。
回答by rwong
#include "stdafx.h"
#include "assert.h"
int getpos(unsigned __int64 value)
{
if (!value)
{
return -1; // no bits set
}
int pos = 0;
if (value & (value - 1ULL))
{
pos = 1;
}
if (value & 0xFFFFFFFF00000000ULL)
{
pos += 32;
value = value >> 32;
}
if (value & 0x00000000FFFF0000ULL)
{
pos += 16;
value = value >> 16;
}
if (value & 0x000000000000FF00ULL)
{
pos += 8;
value = value >> 8;
}
if (value & 0x00000000000000F0ULL)
{
pos += 4;
value = value >> 4;
}
if (value & 0x000000000000000CULL)
{
pos += 2;
value = value >> 2;
}
if (value & 0x0000000000000002ULL)
{
pos += 1;
value = value >> 1;
}
return pos;
}
int _tmain(int argc, _TCHAR* argv[])
{
assert(getpos(0ULL) == -1); // None bits set, return -1.
assert(getpos(1ULL) == 0);
assert(getpos(2ULL) == 1);
assert(getpos(3ULL) == 2);
assert(getpos(4ULL) == 2);
for (int k = 0; k < 64; ++k)
{
int pos = getpos(1ULL << k);
assert(pos == k);
}
for (int k = 0; k < 64; ++k)
{
int pos = getpos( (1ULL << k) - 1);
assert(pos == (k < 2 ? k - 1 : k) );
}
for (int k = 0; k < 64; ++k)
{
int pos = getpos( (1ULL << k) | 1);
assert(pos == (k < 1 ? k : k + 1) );
}
for (int k = 0; k < 64; ++k)
{
int pos = getpos( (1ULL << k) + 1);
assert(pos == k + 1);
}
return 0;
}
回答by Alexander Amelkin
Using the gcc builtins mentioned by @egosys you can build some useful macros. For a quick and rough floor(log2(x)) calculation you can use:
使用@egosys 提到的 gcc 内置函数,您可以构建一些有用的宏。对于快速粗略的地板(log2(x))计算,您可以使用:
#define FAST_LOG2(x) (sizeof(unsigned long)*8 - 1 - __builtin_clzl((unsigned long)(x)))
For a similar ceil(log2(x)), use:
对于类似的 ceil(log2(x)),使用:
#define FAST_LOG2_UP(x) (((x) - (1 << FAST_LOG2(x))) ? FAST_LOG2(x) + 1 : FAST_LOG2(x))
The latter can be further optimized using more gcc peculiarities to avoid the double call to the builtin, but I'm not sure you need it here.
后者可以使用更多 gcc 特性进一步优化以避免对内置函数的双重调用,但我不确定您在这里是否需要它。
回答by BeeOnRope
The fastest approach I'm aware of uses a fast log2that rounds down, combined unconditional adjustment of input value before and after to handle the rounding up case as in lg_down()shown below.
我所知道的最快方法是使用快速log2向下舍入,在处理舍入情况之前和之后组合无条件调整输入值,lg_down()如下所示。
/* base-2 logarithm, rounding down */
static inline uint64_t lg_down(uint64_t x) {
return 63U - __builtin_clzl(x);
}
/* base-2 logarithm, rounding up */
static inline uint64_t lg_up(uint64_t x) {
return lg_down(x - 1) + 1;
}
Basically adding 1 to the rounded-down result is already correct for all values except exact powers of two (since in that case the floorand ceilapproaches should return the same answer), so it is sufficient to subtract 1 from the input value to handle that case (it doesn't change the answer for the other cases) and add one to the result.
基本上,将 1 添加到舍入结果对于除 2 的精确幂之外的所有值已经是正确的(因为在这种情况下,floor和ceil方法应该返回相同的答案),因此从输入值中减去 1 就足以处理这种情况(它不会改变其他情况的答案)并将其添加到结果中。
This is usually slightly faster than the approaches that adjust the value by explicitly checking for exact powers of two (e.g., adding a !!(x & (x - 1))term). It avoids any comparisons and conditional operations or branches, is more likely to simply when inlining, is more amenable to vectorization, etc.
这通常比通过明确检查 2 的精确幂(例如,添加一项!!(x & (x - 1)))来调整值的方法略快。它避免了任何比较和条件操作或分支,更可能在内联时更简单,更适合矢量化等。
This relies on the "count leading bits" functionality offered by most CPUs using the clang/icc/gcc builtin __builtin_clzl, but other platforms offer something similar (e.g., the BitScanReverseintrinsic in Visual Studio).
这依赖于大多数 CPU 使用 clang/icc/gcc 内置功能提供的“计数前导位”功能__builtin_clzl,但其他平台提供了类似的功能(例如,BitScanReverseVisual Studio 中的内在功能)。
Unfortunately, this many return the wrong answer for log(1), since that leads to __builtin_clzl(0)which is undefined behavior based on the gcc documentation. Of course, the general "count leading zeros" function has perfectly defined behavior at zero, but the gcc builtin is defined in this way because prior to the BMI ISA extension on x86, it would have been using the bsrinstructionwhich itself has undefined behavior.
不幸的是,很多人返回了错误的答案log(1),因为这会导致__builtin_clzl(0)基于 gcc 文档的未定义行为。当然,一般的“计数前导零”函数在零处具有完美定义的行为,但是 gcc 内置函数以这种方式定义,因为在 x86 上的 BMI ISA 扩展之前,它会使用本身具有未定义行为的bsr指令。
You could work around this if you know you have the lzcntinstruction by using the lzcntintrinsic directly. Most platforms other than x86 never went through the bsrmistake in the first place, and probably also offer methods to access their "count leading zeros" instruction if they have one.
如果您知道直接lzcnt使用lzcnt内在指令有指令,则可以解决此问题。除了 x86 之外的大多数平台一开始都没有经历过这个bsr错误,并且可能还提供了访问“计数前导零”指令的方法(如果有的话)。
回答by kevinlawler
The following code snippet is a safe and portable way to extend plain-C methods, such as @dgobbi's, to use compiler intrinsics when compiled using supporting compilers (Clang). Placing this at the top of the method will cause the method to use the builtin when it is available. When the builtin is unavailable the method will fall back to the standard C code.
以下代码片段是一种安全且可移植的方法,用于扩展普通 C 方法(例如 @dgobbi 的方法)以在使用支持编译器 (Clang) 编译时使用编译器内部函数。将它放在方法的顶部将导致该方法在可用时使用内置函数。当内置函数不可用时,该方法将回退到标准 C 代码。
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif
#if __has_builtin(__builtin_clzll) //use compiler if possible
return ((sizeof(unsigned long long) * 8 - 1) - __builtin_clzll(x)) + (!!(x & (x - 1)));
#endif
回答by Mahmoud Al-Qudsi
The truefastest solution:
在真正的最快的解决方案:
A binary search tree of 63 entries. These are the powers of 2 from 0 to 63. One-time generation function to create the tree. The leafs represent the log base 2 of the powers (basically, the numbers 1-63).
63 个条目的二叉搜索树。这些是从 0 到 63 的 2 的幂。用于创建树的一次性生成函数。叶子代表以 2 为底的幂的对数(基本上是数字 1-63)。
To find the answer, you feed a number into the tree, and navigate to the leaf node greater than the item. If the leaf node is exactly equal, result is the leaf value. Otherwise, result is the leaf value + 1.
要找到答案,请向树中输入一个数字,然后导航到大于该项目的叶节点。如果叶节点完全相等,则结果是叶值。否则,结果是叶值 + 1。
Complexity is fixed at O(6).
复杂度固定为 O(6)。
回答by Brian R. Bondy
Finding the log base 2 of an integer (64-bit or any other bit) with integer output is equivalent to finding the most significant bit that is set. Why? Because log base 2 is how many times you can divide the number by 2 to reach 1.
使用整数输出查找整数(64 位或任何其他位)的对数基数 2 等效于查找设置的最高有效位。为什么?因为对数基数 2 是您可以将数字除以 2 达到 1 的次数。
One way to find the MSB that's set is to simply bitshift to the right by 1 each time until you have 0. Another more efficient way is to do some kind of binary search via bitmasks.
找到设置的 MSB 的一种方法是每次简单地向右移位 1 直到得到 0。另一种更有效的方法是通过位掩码进行某种二进制搜索。
The ceil part is easily worked out by checking if any other bits are set other than the MSB.
通过检查除 MSB 之外是否设置了任何其他位,可以轻松计算出 ceil 部分。
回答by R.. GitHub STOP HELPING ICE
If you have 80-bit or 128-bit floats available, cast to that type and then read off the exponent bits. This link has details (for integers up to 52 bits) and several other methods:
如果您有 80 位或 128 位浮点数可用,请转换为该类型,然后读取指数位。此链接包含详细信息(最多 52 位的整数)和其他几种方法:
http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogIEEE64Float
http://graphics.stanford.edu/~seander/bithacks.html#IntegerLogIEEE64Float
Also, check the ffmpeg source. I know they have a very fast algorithm. Even if it's not directly extensible to larger sizes, you can easily do something like if (x>INT32_MAX) return fastlog2(x>>32)+32; else return fastlog2(x);
另外,检查 ffmpeg 源。我知道他们有一个非常快的算法。即使它不能直接扩展到更大的尺寸,你也可以轻松地做类似的事情if (x>INT32_MAX) return fastlog2(x>>32)+32; else return fastlog2(x);

