C++ 查找大于或等于给定值的 2 的最小幂的算法

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/364985/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-27 14:57:55  来源:igfitidea点击:

Algorithm for finding the smallest power of two that's greater or equal to a given value

c++algorithmassembly

提问by Boyan

I need to find the smallest power of two that's greater or equal to a given value. So far, I have this:

我需要找到大于或等于给定值的 2 的最小幂。到目前为止,我有这个:

int value = 3221; // 3221 is just an example, could be any number
int result = 1;

while (result < value) result <<= 1;

It works fine, but feels kind of naive. Is there a better algorithm for that problem?

它工作正常,但感觉有点天真。有没有更好的算法来解决这个问题?

EDIT. There were some nice Assembler suggestions, so I'm adding those tags to the question.

编辑。有一些不错的汇编程序建议,所以我将这些标签添加到问题中。

回答by Larry Gritz

Here's my favorite. Other than the initial check for whether it's invalid (<0, which you could skip if you knew you'd only have >=0 numbers passed in), it has no loops or conditionals, and thus will outperform most other methods. This is similar to erickson's answer, but I think that my decrementing x at the beginning and adding 1 at the end is a little less awkward than his answer (and also avoids the conditional at the end).

这是我最喜欢的。除了对它是否无效的初始检查(<0,如果你知道你只传入了 >=0 的数字,你可以跳过它),它没有循环或条件,因此将优于大多数其他方法。这与 erickson 的答案相似,但我认为我在开始时递减 x 并在最后加 1 比他的答案更不尴尬(并且还避免了最后的条件)。

/// Round up to next higher power of 2 (return x if it's already a power
/// of 2).
inline int
pow2roundup (int x)
{
    if (x < 0)
        return 0;
    --x;
    x |= x >> 1;
    x |= x >> 2;
    x |= x >> 4;
    x |= x >> 8;
    x |= x >> 16;
    return x+1;
}

回答by jfs

ceil(log2(value))

ilog2()can be calculated in 3 asm instructions e.g., http://www.asterisk.org/doxygen/1.4/log2comp_8h-source.html

ilog2()可以在 3 个 asm 指令中计算,例如http://www.asterisk.org/doxygen/1.4/log2comp_8h-source.html

回答by Tony Lee

In the spirit of Quake II's 0x5f3759df and the Bit Twiddling Hacks' IEEE version - this solution reaches into a double to extract the exponent as a means to calculate floor(lg2(n)). It's a bit faster than the accepted solution and much faster than the Bit Twiddling IEEE version since it avoids floating point math. As coded, it assumes a double is a real*8 IEEE float on a little endian machine.

本着 Quake II 的 0x5f3759df 和 Bit Twiddling Hacks 的 IEEE 版本的精神 - 该解决方案达到了双倍以提取指数作为计算 floor(lg2(n)) 的一种手段。它比公认的解决方案快一点,比 Bit Twiddling IEEE 版本快得多,因为它避免了浮点数学。按照编码,它假设 double 是小端机器上的真实 *8 IEEE 浮点数。

int nextPow2(int n) 
{ 
    if ( n <= 1 ) return n;
    double d = n-1; 
    return 1 << ((((int*)&d)[1]>>20)-1022); 
} 

Edit: Add optimized x86 assembly version with the help of a co-worker. A 4% speed gain but still about 50% slower than a bsr version (6 sec vs 4 on my laptop for n=1..2^31-2).

编辑:在同事的帮助下添加优化的 x86 程序集版本。速度提高了 4%,但仍比 bsr 版本慢约 50%(n=1..2^31-2 时,我的笔记本电脑上为 6 秒 vs 4)。

int nextPow2(int n) 
{ 
    if ( n <= 1 ) return n;
    double d;
    n--;
    __asm {
      fild    n 
      mov     eax,4
      fstp    d 
      mov     ecx, dword ptr d[eax]
      sar     ecx,14h 
      rol     eax,cl 
  }
} 

回答by pngaz

On Intel hardware the BSR instruction is close to what you want - it finds the most-significant-set-bit. If you need to be more precise you can then wonder if the remaining bits are precisely zero or not. I tend to assume that other CPU's will have something like BSR - this is a question you want answered to normalize a number. If your number is more than 32 bits then you would do a scan from your most-significant-DWORD to find the first DWORD with ANYbits set. Edsger Dijkstra would likely remark that the above "algorithms" assume that your computer uses Binary Digits, while from his kind of lofty "algorithmic" perspective you should think about Turing machines or something - obviously I am of the more pragmatic style.

在 Intel 硬件上,BSR 指令接近您想要的 - 它找到最重要的设置位。如果您需要更精确,则可以想知道其余位是否正好为零。我倾向于假设其他 CPU 会有类似 BSR 的东西——这是一个你想要回答的问题来规范化一个数字。如果您的数字超过 32 位,那么您将从最重要的 DWORD 进行扫描,以找到设置了任何位的第一个 DWORD 。Edsger Dijkstra 可能会说,上述“算法”假设您的计算机使用二进制数字,而从他那种崇高的“算法”角度来看,您应该考虑图灵机或其他东西——显然我是更务实的风格。

回答by paxdiablo

Your implementation is not naive, it's actually the logical one, except that it's wrong - it returns a negative for numbers greater that 1/2 the maximum integer size.

您的实现并不幼稚,它实际上是合乎逻辑的,只是它是错误的 - 对于大于最大整数大小的 1/2 的数字,它返回负数。

Assuming you can restrict numbers to the range 0 through 2^30 (for 32-bit ints), it'll work just fine, and a lot faster than any mathematical functions involving logarithms.

假设您可以将数字限制在 0 到 2^30 的范围内(对于 32 位整数),它会工作得很好,并且比任何涉及对数的数学函数都要快得多。

Unsigned ints would work better but you'd end up with an infinite loop (for numbers greater than 2^31) since you can never reach 2^32 with the << operator.

无符号整数会更好地工作,但最终会出现无限循环(对于大于 2^31 的数字),因为使用 << 运算符永远无法达到 2^32。

回答by Sorana

pow ( 2 , ceil( log2(value) );

pow ( 2 , ceil( log2(value) );

log2(value) = log(value) / log(2);

log2(value) = log(value) / log(2);

回答by Zacrath

Here's a template version of the bit shifting technique.

这是位移位技术的模板版本。

template<typename T> T next_power2(T value)
{
    --value;
    for(size_t i = 1; i < sizeof(T) * CHAR_BIT; i*=2)
        value |= value >> i;
    return value+1;
}

Since the loop only uses constants it gets flattened by the compiler. (I checked) The function is also future proof.

由于循环只使用常量,它会被编译器扁平化。(我检查过)该功能也是面向未来的。

Here's one that uses __builtin_clz. (Also future proof)

这是使用 __builtin_clz 的一个。(也是未来证明)

template<typename T> T next_power2(T value)
{
    return 1 << ((sizeof(T) * CHAR_BIT) - __builtin_clz(value-1));
}

回答by Sparr

An exploration of the possible solutions to closely related problem (that is, rounding down instead of up), many of which are significantly faster than the naive approach, is available on the Bit Twiddling Hackspage, an excellent resource for doing the kinds of optimization you are looking for. The fastest solution is to use a lookup table with 256 entries, that reduces the total operation count to around 7, from an average of 62 (by a similar operation counting methodology) for the naive approach. Adapting those solutions to your problem is a matter of a single comparison and increment.

Bit Twiddling Hacks页面上提供了对密切相关问题(即向下舍入而不是向上舍入)的可能解决方案的探索,其中许多解决方案比天真的方法快得多,这是进行各种优化的绝佳资源你正在寻找。最快的解决方案是使用具有 256 个条目的查找表,这将总操作计数从朴素方法的平均 62(通过类似的操作计数方法)减少到大约 7。使这些解决方案适应您的问题是一个单一的比较和增量问题。

回答by Dipstick

You don't really say what you mean by "better algorithm" but as the one you present is perfectly clear (if somewhat flawed), I'll assume you are after a more efficient algorithm.

您并没有真正说出“更好的算法”是什么意思,但是由于您提出的算法非常清楚(如果有些缺陷),我假设您追求的是更有效的算法。

Larry Gritz has given what is probably the most efficient c/c++ algorithm without the overhead of a look up table and it would suffice in most cases (see http://www.hackersdelight.orgfor similar algorithms).

Larry Gritz 给出了可能是最有效的 c/c++ 算法,没有查找表的开销,并且在大多数情况下就足够了(有关类似算法,请参见http://www.hackersdelight.org)。

As mentioned elsewhere most CPUs these days have machine instructions to count the number of leading zeroes (or equivalently return the ms set bit) however their use is non-portable and - in most cases - not worth the effort.

正如其他地方所提到的,现在大多数 CPU 都有机器指令来计算前导零的数量(或等效地返回 ms 设置位),但是它们的使用是不可移植的,并且 - 在大多数情况下 - 不值得付出努力。

However most compilers have "intrinsic" functions that allow the use of machine instructions but in a more portable way.

然而,大多数编译器具有允许使用机器指令但以更便携的方式使用的“内在”函数。

Microsoft C++ has _BitScanReverse() and gcc provides __builtin_clz() to do the bulk of the work efficiently.

Microsoft C++ 有 _BitScanReverse() 并且 gcc 提供 __builtin_clz() 来高效地完成大部分工作。

回答by duncan.forster

How about a recursive template version to generate a compile constant:

递归模板版本如何生成编译常量:

template<uint32_t A, uint8_t B = 16>
struct Pow2RoundDown { enum{ value = Pow2RoundDown<(A | (A >> B)), B/2>::value }; };
template<uint32_t A>
struct Pow2RoundDown<A, 1> { enum{ value = (A | (A >> 1)) - ((A | (A >> 1)) >> 1) }; };

template<uint32_t A, uint8_t B = 16>
struct Pow2RoundUp { enum{ value = Pow2RoundUp<((B == 16 ? (A-1) : A) | ((B == 16 ? (A-1) : A) >> B)), B/2>::value }; };
template<uint32_t A >
struct Pow2RoundUp<A, 1> { enum{ value = ((A | (A >> 1)) + 1) }; };

Can be used like so:

可以这样使用:

Pow2RoundDown<3221>::value, Pow2RoundUp<3221>::value