Any way faster than pow() to compute an integer power of 10 in C++?

Notice: this page is a Chinese-English side-by-side translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/18581560/

Time: 2020-08-27 22:02:56  Source: igfitidea

Tags: c++, numerical

Asked by szli

I know that powers of 2 can be computed with the << operator. What about powers of 10, such as 10^5? Is there any way faster than pow(10,5) in C++? It is a pretty straightforward computation by hand, but it does not seem easy for computers because of the binary representation of numbers. Let us assume I am only interested in integer powers, 10^n, where n is an integer.

Accepted answer by Mats Petersson

Something like this:

int quick_pow10(int n)
{
    static int pow10[10] = {
        1, 10, 100, 1000, 10000, 
        100000, 1000000, 10000000, 100000000, 1000000000
    };

    return pow10[n]; 
}

Obviously, you can do the same thing for long long.

This should be several times faster than any competing method. However, it is quite limited if you have lots of bases (although the number of values goes down quite dramatically with larger bases), so if there isn't a huge number of combinations, it's still doable.
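As a rough sketch of the long long variant mentioned above: the function name quick_pow10_ll and the use of std::int64_t are assumptions made for illustration, and the digit separators require C++14 or later. The table stops at 10^18 because that is the largest power of ten that fits in a signed 64-bit integer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 64-bit counterpart of quick_pow10 (name chosen for illustration).
// Holds 10^0 .. 10^18; 10^19 no longer fits in a signed 64-bit integer.
inline std::int64_t quick_pow10_ll(unsigned n)
{
    static const std::int64_t table[19] = {
        1LL,
        10LL,
        100LL,
        1'000LL,
        10'000LL,
        100'000LL,
        1'000'000LL,
        10'000'000LL,
        100'000'000LL,
        1'000'000'000LL,
        10'000'000'000LL,
        100'000'000'000LL,
        1'000'000'000'000LL,
        10'000'000'000'000LL,
        100'000'000'000'000LL,
        1'000'000'000'000'000LL,
        10'000'000'000'000'000LL,
        100'000'000'000'000'000LL,
        1'000'000'000'000'000'000LL
    };

    // Guard against reading past the end of the table (active in debug builds).
    assert(n < 19 && "10^n does not fit in 64 bits");
    return table[n];
}
```

The assert is only a debug-time guard; release builds compiled with NDEBUG would need an explicit range check if n can come from untrusted input.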

As a comparison:

#include <iostream>
#include <cstdlib>
#include <cmath>

static int quick_pow10(int n)
{
    static int pow10[10] = {
        1, 10, 100, 1000, 10000, 
        100000, 1000000, 10000000, 100000000, 1000000000
    };

    return pow10[n]; 
}

static int integer_pow(int x, int n)
{
    int r = 1;
    while (n--)
       r *= x;

    return r; 
}

static int opt_int_pow(int n)
{
    int r = 1;
    const int x = 10;
    while (n)
    {
        if (n & 1) 
        {
           r *= x;
           n--;
        }
        else
        {
            r *= x * x;
            n -= 2;
        }
    }

    return r; 
}


int main(int argc, char **argv)
{
    long long sum = 0;
    int n = strtol(argv[1], 0, 0);
    const long outer_loops = 1000000000;

    if (argv[2][0] == 'a')
    {
        for(long i = 0; i < outer_loops / n; i++)
        {
            for(int j = 1; j < n+1; j++)
            {
                sum += quick_pow10(n);
            }
        }
    }
    if (argv[2][0] == 'b')
    {
        for(long i = 0; i < outer_loops / n; i++)
        {
            for(int j = 1; j < n+1; j++)
            {
                sum += integer_pow(10,n);
            }
        }
    }

    if (argv[2][0] == 'c')
    {
        for(long i = 0; i < outer_loops / n; i++)
        {
            for(int j = 1; j < n+1; j++)
            {
                sum += opt_int_pow(n);
            }
        }
    }

    std::cout << "sum=" << sum << std::endl;
    return 0;
}

Compiled with g++ 4.6.3, using -Wall -O2 -std=c++0x, gives the following results:

$ g++ -Wall -O2 -std=c++0x pow.cpp
$ time ./a.out 8 a
sum=100000000000000000

real    0m0.124s
user    0m0.119s
sys     0m0.004s
$ time ./a.out 8 b
sum=100000000000000000

real    0m7.502s
user    0m7.482s
sys     0m0.003s

$ time ./a.out 8 c
sum=100000000000000000

real    0m6.098s
user    0m6.077s
sys     0m0.002s

(I did have an option for using pow as well, but it took 1m22.56s when I first tried it, so I removed it when I decided to add the optimised loop variant.)

Answer by Dietmar Kühl

There are certainly ways to compute integral powers of 10 faster than using std::pow()! The first realization is that pow(x, n) can be implemented in O(log n) time. The next realization is that x * 10 is the same as (x << 3) + (x << 1). Of course, the compiler knows the latter, i.e., when you are multiplying an integer by the integer constant 10, the compiler will do whatever is fastest to multiply by 10. Based on these two rules it is easy to create fast computations, even if x is a big integer type.
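The two observations in this answer can be sketched as follows. This is only an illustration under stated assumptions: the names times10 and ipow are made up here, unsigned 64-bit arithmetic is used so that overflow wraps instead of being undefined, and in practice a compiler will normally choose its own fastest form for a plain `x * 10`.

```cpp
#include <cassert>   // only needed for checks such as the ones in the note below
#include <cstdint>

// Multiply by 10 with two shifts and an addition: 10*x == 8*x + 2*x.
inline std::uint64_t times10(std::uint64_t x)
{
    return (x << 3) + (x << 1);
}

// Exponentiation by squaring: uses O(log n) multiplications.
inline std::uint64_t ipow(std::uint64_t base, unsigned n)
{
    std::uint64_t result = 1;
    while (n != 0) {
        if (n & 1u)          // lowest bit set: this power of base contributes
            result *= base;
        n >>= 1;
        if (n != 0)          // skip the final, unneeded squaring
            base *= base;
    }
    return result;
}
```

For example, ipow(10, 5) evaluates to 100000 and times10(7) to 70.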

In case you are interested in games like this:

  1. A generic O(log n) version of power is discussed in Elements of Programming.
  2. Lots of interesting "tricks" with integers are discussed in Hacker's Delight.

Answer by Vincent

A solution for any base using template meta-programming:

template<int E, int N>
struct pow {
    enum { value = E * pow<E, N - 1>::value };
};

template <int E>
struct pow<E, 0> {
    enum { value = 1 };
};

Then it can be used to generate a lookup table that can be used at runtime:

template<int E>
long long quick_pow(unsigned int n) {
    static long long lookupTable[] = {
        pow<E, 0>::value, pow<E, 1>::value, pow<E, 2>::value,
        pow<E, 3>::value, pow<E, 4>::value, pow<E, 5>::value,
        pow<E, 6>::value, pow<E, 7>::value, pow<E, 8>::value,
        pow<E, 9>::value
    };

    return lookupTable[n];
}

This must be used with correct compiler flags in order to detect the possible overflows.
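One way to make overflow detection independent of compiler flags is to check the range inside the template itself. The sketch below is an assumption-laden illustration, not the answer's own code: checked_pow is a hypothetical name, it requires C++11, and it only handles positive bases.

```cpp
#include <cassert>   // only needed for checks such as the ones in the note below
#include <limits>

// Hypothetical overflow-checked variant of the pow<E, N> idea.
// Compilation fails with a readable message if Base^N exceeds long long.
template <long long Base, unsigned N>
struct checked_pow {
    static_assert(Base > 0, "this sketch only handles positive bases");
    static_assert(checked_pow<Base, N - 1>::value
                      <= std::numeric_limits<long long>::max() / Base,
                  "Base^N does not fit in long long");
    static constexpr long long value = Base * checked_pow<Base, N - 1>::value;
};

// Base case: anything to the power 0 is 1.
template <long long Base>
struct checked_pow<Base, 0> {
    static constexpr long long value = 1;
};
```

With a 64-bit long long, checked_pow<10, 18>::value compiles, while checked_pow<10, 19>::value is rejected at compile time by the second static_assert.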

Usage example:

for(unsigned int n = 0; n < 10; ++n) {
    std::cout << quick_pow<10>(n) << std::endl;
}

Answer by Vincent

An integer power function (which doesn't involve floating-point conversions and computations) may very well be faster than pow():

int integer_pow(int x, int n)
{
    int r = 1;
    while (n--)
        r *= x;

    return r; 
}

Edit: benchmarked - the naive integer exponentiation method seems to outperform the floating-point one by about a factor of two:

h2co3-macbook:~ h2co3$ cat quirk.c
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#include <errno.h>
#include <string.h>
#include <math.h>

int integer_pow(int x, int n)
{
    int r = 1;
    while (n--)
        r *= x;

    return r; 
}

int main(int argc, char *argv[])
{
    int x = 0;

    for (int i = 0; i < 100000000; i++) {
        x += powerfunc(i, 5);
    }

    printf("x = %d\n", x);

    return 0;
}
h2co3-macbook:~ h2co3$ clang -Wall -o quirk quirk.c -Dpowerfunc=integer_pow
h2co3-macbook:~ h2co3$ time ./quirk
x = -1945812992

real    0m1.169s
user    0m1.164s
sys     0m0.003s
h2co3-macbook:~ h2co3$ clang -Wall -o quirk quirk.c -Dpowerfunc=pow
h2co3-macbook:~ h2co3$ time ./quirk
x = -2147483648

real    0m2.898s
user    0m2.891s
sys     0m0.004s
h2co3-macbook:~ h2co3$ 

Answer by Yakk - Adam Nevraumont

Here is a stab at it:

#include <type_traits>

// specialize if you have a bignum integer-like type you want to work with:
template<typename T> struct is_integer_like:std::is_integral<T> {};
template<typename T> struct make_unsigned_like:std::make_unsigned<T> {};

template<typename T, typename U>
T powT( T base, U exponent ) {
  static_assert( is_integer_like<U>::value, "exponent must be integer-like" );
  static_assert( std::is_same< U, typename make_unsigned_like<U>::type >::value, "exponent must be unsigned" );

  T retval = 1;
  T& multiplicand = base;
  if (exponent) {
    while (true) {
      // branch prediction will be awful here, you may have to micro-optimize:
      retval *= (exponent&1)?multiplicand:1;
      // or /2, whatever -- `>>1` is probably faster, esp for bignums:
      exponent = exponent>>1;
      if (!exponent)
        break;
      multiplicand *= multiplicand;
    }
  }
  return retval;
}

A few things are going on above.

First, so that BigNum support is cheap, it is templatized. Out of the box, it supports any base type that supports *= own_type and either can be implicitly converted to int, or int can be implicitly converted to it (if both are true, problems will occur), and you need to specialize some templates to indicate that the exponent type involved is both unsigned and integer-like.

In this case, integer-like and unsigned means that it supports &1 returning bool and >>1 returning something it can be constructed from, and that it eventually (after repeated >>1s) reaches a point where evaluating it in a bool context returns false. I used traits classes to express the restriction, because naive use with a value like -1 would compile and (on some platforms) loop forever, while (on others) it would not.

Execution time for this algorithm, assuming multiplication is O(1), is O(lg(exponent)), where lg(exponent) is the number of times the exponent has to be shifted with >>1 before it evaluates as false in a boolean context. For traditional integer types, this would be the binary log of the exponent's value: so no more than 32.
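To make the iteration count concrete, here is a small helper that counts how many right shifts it takes for an exponent to reach zero, which is the number of times a square-and-multiply loop like the one above runs. The name shift_count is hypothetical and the helper is not part of the answer's code.

```cpp
#include <cassert>   // only needed for checks such as the ones in the note below

// Number of >>1 steps until the exponent becomes zero,
// i.e. the bit length of the exponent (0 for an exponent of 0).
inline unsigned shift_count(unsigned long long exponent)
{
    unsigned steps = 0;
    while (exponent) {
        exponent >>= 1;
        ++steps;
    }
    return steps;
}
```

For instance, shift_count(5) is 3 and shift_count(0xFFFFFFFFull) is 32, matching the "no more than 32" bound for a 32-bit exponent.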

I also eliminated all branches within the loop (or, made it obvious to existing compilers that no branch is needed, more precisely), with just the control branch (which is true uniformly until it is false once). Possibly eliminating even that branch might be worth it for high bases and low exponents...

Answer by acegs

No multiplication and no table version:

// N x 10^n, using 10^n = 2^n * 5^n:
// the shift gives N * 2^n, then each `N += N << 2` multiplies by 5
int Npow10(int N, int n){
  N <<= n;
  while(n--) N += N << 2;
  return N;
}
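The function above works because 10^n = 2^n * 5^n: the shift supplies the factor 2^n, and each `N += N << 2` computes N + 4N = 5N. The sketch below restates the idea with unsigned arithmetic (so overflow wraps rather than being undefined) next to a plain multiplication loop for comparison. Both function names are made up for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical unsigned variant of the shift-and-add idea.
inline std::uint32_t npow10_shift(std::uint32_t N, unsigned n)
{
    assert(n < 32);        // shifting by the full bit width or more is undefined
    N <<= n;               // N * 2^n
    while (n--)
        N += N << 2;       // N + 4N == 5N, repeated n times
    return N;
}

// Reference implementation using ordinary multiplication.
inline std::uint32_t npow10_mul(std::uint32_t N, unsigned n)
{
    while (n--)
        N *= 10;
    return N;
}
```

Both functions return 7000 for N = 7, n = 3. Whether the shift version is actually faster than multiplication on a given CPU would need to be measured.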

Answer by Marko Gregurovic

This function will calculate x ^ y much faster than pow, in the case of integer values.

int pot(int x, int y){
    int solution = 1;
    // square-and-multiply: one iteration per bit of y
    while(y){
        if(y&1)
            solution *= x;
        x *= x;
        y >>= 1;
    }
    return solution;
}

Answer by Rahul Tripathi

You can use the lookup table, which will be by far the fastest.

You can also consider using this:

template <typename T>
T expt(T p, unsigned q)
{
    T r(1);

    while (q != 0) {
        if (q % 2 == 1) {    // q is odd
            r *= p;
            q--;
        }
        p *= p;
        q /= 2;
    }

    return r;
}

Answer by Nine

Based on Mats Petersson's approach, but with compile-time generation of the cache.

#include <array>
#include <cstdint>   // uintmax_t
#include <cstdlib>   // strtol
#include <iostream>
#include <limits>

// digits

template <typename T>
constexpr T digits(T number) {    
  return number == 0 ? 0 
                     : 1 + digits<T>(number / 10);
}

// pow

// https://stackoverflow.com/questions/24656212/why-does-gcc-complain-error-type-intt-of-template-argument-0-depends-on-a
// unfortunately we can't write `template <typename T, T N>` because of partial specialization `PowerOfTen<T, 1>`

template <typename T, uintmax_t N>
struct PowerOfTen {
  enum { value = 10 * PowerOfTen<T, N - 1>::value };
};

template <typename T>
struct PowerOfTen<T, 1> {
  enum { value = 1 };
};

// sequence

template<typename T, T...>
struct pow10_sequence { };

template<typename T, T From, T N, T... Is>
struct make_pow10_sequence_from 
: make_pow10_sequence_from<T, From, N - 1, N - 1, Is...> { 
  //  
};

template<typename T, T From, T... Is>
struct make_pow10_sequence_from<T, From, From, Is...> 
: pow10_sequence<T, Is...> { 
  //
};

// base10list

template <typename T, T N, T... Is>
constexpr std::array<T, N> base10list(pow10_sequence<T, Is...>) {
  return {{ PowerOfTen<T, Is>::value... }};
}

template <typename T, T N>
constexpr std::array<T, N> base10list() {    
  return base10list<T, N>(make_pow10_sequence_from<T, 1, N+1>());
}

template <typename T>
constexpr std::array<T, digits(std::numeric_limits<T>::max())> base10list() {    
  return base10list<T, digits(std::numeric_limits<T>::max())>();    
}

// main pow function

template <typename T>
static T template_quick_pow10(T n) {

  static auto values = base10list<T>();
  return values[n]; 
}

// client code

int main(int argc, char **argv) {

  long long sum = 0;
  int n = strtol(argv[1], 0, 0);
  const long outer_loops = 1000000000;

  if (argv[2][0] == 't') {

    for(long i = 0; i < outer_loops / n; i++) {

      for(int j = 1; j < n+1; j++) {

        sum += template_quick_pow10(n);
      }
    }
  }

  std::cout << "sum=" << sum << std::endl;
  return 0;
}

The code above omits quick_pow10, integer_pow and opt_int_pow for better readability, but the tests were run with them in the code.

Compiled with gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5), using -Wall -O2 -std=c++0x, gives the following results:

$ g++ -Wall -O2 -std=c++0x main.cpp

$ time ./a.out  8 a
sum=100000000000000000

real  0m0.438s
user  0m0.432s
sys   0m0.008s

$ time ./a.out  8 b
sum=100000000000000000

real  0m8.783s
user  0m8.777s
sys   0m0.004s

$ time ./a.out  8 c
sum=100000000000000000

real  0m6.708s
user  0m6.700s
sys   0m0.004s

$ time ./a.out  8 t
sum=100000000000000000

real  0m0.439s
user  0m0.436s
sys   0m0.000s

Answer by João Paulo

Now, with constexpr, you can do it like so:

constexpr int pow10(int n) {
    int result = 1;
    for (int i = 1; i<=n; ++i)
        result *= 10;
    return result;
}

int main () {
    int i = pow10(5);
}

i will be calculated at compile time. ASM generated for x86-64 gcc 9.2:

main:
        push    rbp
        mov     rbp, rsp
        mov     DWORD PTR [rbp-4], 100000
        mov     eax, 0
        pop     rbp
        ret
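In the listing above the compiler folded the call into a constant, but a plain `int i = pow10(5);` does not by itself require compile-time evaluation. A minimal sketch of how to require it, assuming C++14 or later (loops in constexpr functions) and using the hypothetical names pow10_ce and kHundredThousand:

```cpp
#include <cassert>   // only needed for checks such as the ones in the note below

// C++14 or later: loops are allowed inside constexpr functions.
constexpr int pow10_ce(int n)
{
    int result = 1;
    for (int i = 0; i < n; ++i)
        result *= 10;
    return result;
}

// Declaring the variable constexpr requires the initializer to be
// evaluated at compile time.
constexpr int kHundredThousand = pow10_ce(5);

// static_assert checks the value during compilation, before the program runs.
static_assert(kHundredThousand == 100000, "10^5 should be 100000");
```

A useful side effect of this form is that signed overflow inside a constant expression, for example pow10_ce(10) with a 32-bit int, is diagnosed by the compiler instead of silently producing a wrong value.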