performance 获得 π 值的最快方法是什么？

Question

提问by Chris Jester-Young

I'm looking for the fastest way to obtain the value of π, as a personal challenge. More specifically, I'm using ways that don't involve using #defineconstants like M_PI, or hard-coding the number in.

我正在寻找获得 π 值的最快方法，作为个人挑战。更具体地说，我使用的方法不涉及使用#define常量（如M_PI）或对数字进行硬编码。

The program below tests the various ways I know of. The inline assembly version is, in theory, the fastest option, though clearly not portable. I've included it as a baseline to compare against the other versions. In my tests, with built-ins, the 4 * atan(1)version is fastest on GCC 4.2, because it auto-folds the atan(1)into a constant. With -fno-builtinspecified, the atan2(0, -1)version is fastest.

下面的程序测试了我所知道的各种方法。理论上，内联汇编版本是最快的选择，但显然不可移植。我已将其作为基线与其他版本进行比较。在我的测试中，使用内置插件，该4 * atan(1)版本在 GCC 4.2 上速度最快，因为它会自动将折叠atan(1)成一个常量。有-fno-builtin指定，atan2(0, -1)版本最快。

Here's the main testing program (pitimes.c):

这是主要的测试程序（pitimes.c）：

#include <math.h>
#include <stdio.h>
#include <time.h>

#define ITERS 10000000
#define TESTWITH(x) {                                                       \
    diff = 0.0;                                                             \
    time1 = clock();                                                        \
    for (i = 0; i < ITERS; ++i)                                             \
        diff += (x) - M_PI;                                                 \
    time2 = clock();                                                        \
    printf("%s\t=> %e, time => %f\n", #x, diff, diffclock(time2, time1));   \
}

static inline double
diffclock(clock_t time1, clock_t time0)
{
    return (double) (time1 - time0) / CLOCKS_PER_SEC;
}

int
main()
{
    int i;
    clock_t time1, time2;
    double diff;

    /* Warmup. The atan2 case catches GCC's atan folding (which would
     * optimise the ``4 * atan(1) - M_PI'' to a no-op), if -fno-builtin
     * is not used. */
    TESTWITH(4 * atan(1))
    TESTWITH(4 * atan2(1, 1))

#if defined(__GNUC__) && (defined(__i386__) || defined(__amd64__))
    extern double fldpi();
    TESTWITH(fldpi())
#endif

    /* Actual tests start here. */
    TESTWITH(atan2(0, -1))
    TESTWITH(acos(-1))
    TESTWITH(2 * asin(1))
    TESTWITH(4 * atan2(1, 1))
    TESTWITH(4 * atan(1))

    return 0;
}

And the inline assembly stuff (fldpi.c) that will only work for x86 and x64 systems:

以及fldpi.c仅适用于 x86 和 x64 系统的内联汇编内容 ( )：

double
fldpi()
{
    double pi;
    asm("fldpi" : "=t" (pi));
    return pi;
}

And a build script that builds all the configurations I'm testing (build.sh):

以及构建我正在测试的所有配置的构建脚本 ( build.sh)：

#!/bin/sh
gcc -O3 -Wall -c           -m32 -o fldpi-32.o fldpi.c
gcc -O3 -Wall -c           -m64 -o fldpi-64.o fldpi.c

gcc -O3 -Wall -ffast-math  -m32 -o pitimes1-32 pitimes.c fldpi-32.o
gcc -O3 -Wall              -m32 -o pitimes2-32 pitimes.c fldpi-32.o -lm
gcc -O3 -Wall -fno-builtin -m32 -o pitimes3-32 pitimes.c fldpi-32.o -lm
gcc -O3 -Wall -ffast-math  -m64 -o pitimes1-64 pitimes.c fldpi-64.o -lm
gcc -O3 -Wall              -m64 -o pitimes2-64 pitimes.c fldpi-64.o -lm
gcc -O3 -Wall -fno-builtin -m64 -o pitimes3-64 pitimes.c fldpi-64.o -lm

Apart from testing between various compiler flags (I've compared 32-bit against 64-bit too because the optimizations are different), I've also tried switching the order of the tests around. But still, the atan2(0, -1)version still comes out on top every time.

除了在各种编译器标志之间进行测试（我也比较了 32 位和 64 位，因为优化不同），我还尝试切换测试的顺序。但是，该atan2(0, -1)版本仍然每次都名列前茅。

Answer 1

采纳答案by nlucaroni

The Monte Carlo method, as mentioned, applies some great concepts but it is, clearly, not the fastest, not by a long shot, not by any reasonable measure. Also, it all depends on what kind of accuracy you are looking for. The fastest π I know of is the one with the digits hard coded. Looking at Piand Pi[PDF], there are a lot of formulae.

在蒙特卡罗方法，如前所述，适用于一些伟大的概念，但它是，显然，不是最快的，而不是由一个长镜头，没有任何合理的措施。此外，这一切都取决于您正在寻找什么样的准确性。我所知道的最快的 π 是数字硬编码的那个。看Pi和Pi[PDF]，有很多公式。

Here is a method that converges quickly — about 14 digits per iteration. PiFast, the current fastest application, uses this formula with the FFT. I'll just write the formula, since the code is straightforward. This formula was almost found by Ramanujan and discovered by Chudnovsky. It is actually how he calculated several billion digits of the number — so it isn't a method to disregard. The formula will overflow quickly and, since we are dividing factorials, it would be advantageous then to delay such calculations to remove terms.

这是一种快速收敛的方法——每次迭代大约 14 位数字。PiFast是当前最快的应用程序，使用此公式和 FFT。我只会写公式，因为代码很简单。这个公式几乎被拉马努金发现，被楚德诺夫斯基发现。这实际上是他计算出的数十亿位数字的方式——所以这不是一种可以无视的方法。该公式将很快溢出，并且由于我们正在除法阶乘，因此延迟此类计算以删除项将是有利的。

enter image description here

在此处输入图片说明

enter image description here

在此处输入图片说明

where,

在哪里，

enter image description here

在此处输入图片说明

Below is the Brent–Salamin algorithm. Wikipedia mentions that when aand bare "close enough" then (a + b)2 / 4twill be an approximation of π. I'm not sure what "close enough" means, but from my tests, one iteration got 2 digits, two got 7, and three had 15, of course this is with doubles, so it might have an error based on its representation and the truecalculation could be more accurate.

下面是Brent-Salamin 算法。维基百科提到，当a和b“足够接近”时，(a + b)2 / 4t将是 π 的近似值。我不确定“足够接近”是什么意思，但从我的测试来看，一次迭代得到 2 位数，两次得到 7，三次得到 15，当然这是双打，所以根据它的表示它可能有错误和在真正的计算可以更准确。

let pi_2 iters =
    let rec loop_ a b t p i =
        if i = 0 then a,b,t,p
        else
            let a_n = (a +. b) /. 2.0 
            and b_n = sqrt (a*.b)
            and p_n = 2.0 *. p in
            let t_n = t -. (p *. (a -. a_n) *. (a -. a_n)) in
            loop_ a_n b_n t_n p_n (i - 1)
    in 
    let a,b,t,p = loop_ (1.0) (1.0 /. (sqrt 2.0)) (1.0/.4.0) (1.0) iters in
    (a +. b) *. (a +. b) /. (4.0 *. t)

Lastly, how about some pi golf (800 digits)? 160 characters!

最后，来点圆周率高尔夫（800 位数）怎么样？160个字！

int a=10000,b,c=2800,d,e,f[2801],g;main(){for(;b-c;)f[b++]=a/5;for(;d=0,g=c*2;c-=14,printf("%.4d",e+d/a),e=d%a)for(b=c;d+=f[b]*a,f[b]=d%--g,d/=g--,--b;d*=b);}

Answer 2

回答by Pat

I really like this program, because it approximates π by looking at its own area.

我真的很喜欢这个程序，因为它通过查看自己的面积来近似 π。

IOCCC 1988 : westley.c

#define _ -F<00||--F-OO--;
int F=00,OO=00;main(){F_OO();printf("%1.3f\n",4.*-F/OO/OO);}F_OO()
{
            _-_-_-_
       _-_-_-_-_-_-_-_-_
    _-_-_-_-_-_-_-_-_-_-_-_
  _-_-_-_-_-_-_-_-_-_-_-_-_-_
 _-_-_-_-_-_-_-_-_-_-_-_-_-_-_
 _-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
 _-_-_-_-_-_-_-_-_-_-_-_-_-_-_
 _-_-_-_-_-_-_-_-_-_-_-_-_-_-_
  _-_-_-_-_-_-_-_-_-_-_-_-_-_
    _-_-_-_-_-_-_-_-_-_-_-_
        _-_-_-_-_-_-_-_
            _-_-_-_
}

#define _ -F<00||--F-OO--;
int F=00,OO=00;main(){F_OO();printf("%1.3f\n",4.*-F/OO/OO);}F_OO()
{
            _-_-_-_
       _-_-_-_-_-_-_-_-_
    _-_-_-_-_-_-_-_-_-_-_-_
  _-_-_-_-_-_-_-_-_-_-_-_-_-_
 _-_-_-_-_-_-_-_-_-_-_-_-_-_-_
 _-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_
 _-_-_-_-_-_-_-_-_-_-_-_-_-_-_
 _-_-_-_-_-_-_-_-_-_-_-_-_-_-_
  _-_-_-_-_-_-_-_-_-_-_-_-_-_
    _-_-_-_-_-_-_-_-_-_-_-_
        _-_-_-_-_-_-_-_
            _-_-_-_
}

Answer 3

回答by Leon Bambrick

Here's a general description of a technique for calculating pi that I learnt in high school.

这是我在高中学到的计算圆周率的技术的一般描述。

I only share this because I think it is simple enough that anyone can remember it, indefinitely, plus it teaches you the concept of "Monte-Carlo" methods -- which are statistical methods of arriving at answers that don't immediately appear to be deducible through random processes.

我只是分享这个，因为我认为它很简单，任何人都可以无限期地记住它，而且它教会了你“蒙特卡罗”方法的概念——这是一种统计方法，可以得出不会立即出现的答案可以通过随机过程推导出来。

Draw a square, and inscribe a quadrant (one quarter of a semi-circle) inside that square (a quadrant with radius equal to the side of the square, so it fills as much of the square as possible)

画一个正方形，并在该正方形内刻出一个象限（半圆的四分之一）（半径等于正方形边长的象限，因此它尽可能多地填充正方形）

Now throw a dart at the square, and record where it lands -- that is, choose a random point anywhere inside the square. Of course, it landed inside the square, but is it inside the semi-circle? Record this fact.

现在向方格投掷飞镖，并记录它落在哪里——也就是说，在方格内的任何地方随机选择一个点。当然，它降落在正方形内，但它是在半圆内吗？记录这个事实。

Repeat this process many times -- and you will find there is a ratio of the number of points inside the semi-circle versus the total number thrown, call this ratio x.

多次重复这个过程——你会发现半圆内的点数与抛出的总数之比，称这个比率为x。

Since the area of the square is r times r, you can deduce that the area of the semi circle is x times r times r (that is, x times r squared). Hence x times 4 will give you pi.

由于正方形的面积是r乘r，所以可以推导出半圆的面积是x乘r乘r（即x乘r的平方）。因此 x 乘以 4 会给你 pi。

This is not a quick method to use. But it's a nice example of a Monte Carlo method. And if you look around, you may find that many problems otherwise outside your computational skills can be solved by such methods.

这不是一种快速的使用方法。但它是蒙特卡罗方法的一个很好的例子。如果您环顾四周，您可能会发现许多超出您计算技能的问题都可以通过此类方法解决。

Answer 4

回答by jon-hanson

In the interests of completeness, a C++ template version, which, for an optimised build, will compute an approximation of PI at compile time, and will inline to a single value.

出于完整性考虑，C++ 模板版本，对于优化构建，将在编译时计算 PI 的近似值，并将内联到单个值。

#include <iostream>

template<int I>
struct sign
{
    enum {value = (I % 2) == 0 ? 1 : -1};
};

template<int I, int J>
struct pi_calc
{
    inline static double value ()
    {
        return (pi_calc<I-1, J>::value () + pi_calc<I-1, J+1>::value ()) / 2.0;
    }
};

template<int J>
struct pi_calc<0, J>
{
    inline static double value ()
    {
        return (sign<J>::value * 4.0) / (2.0 * J + 1.0) + pi_calc<0, J-1>::value ();
    }
};


template<>
struct pi_calc<0, 0>
{
    inline static double value ()
    {
        return 4.0;
    }
};

template<int I>
struct pi
{
    inline static double value ()
    {
        return pi_calc<I, I>::value ();
    }
};

int main ()
{
    std::cout.precision (12);

    const double pi_value = pi<10>::value ();

    std::cout << "pi ~ " << pi_value << std::endl;

    return 0;
}

Note for I > 10, optimised builds can be slow, likewise for non-optimised runs. For 12 iterations I believe there are around 80k calls to value() (in the absence of memoisation).

请注意，对于 I > 10，优化构建可能会很慢，对于非优化运行也是如此。对于 12 次迭代，我相信大约有 80k 次调用 value()（在没有记忆的情况下）。

Answer 5

回答by OysterD

There's actually a whole book dedicated (amongst other things) to fastmethods for the computation of \pi: 'Pi and the AGM', by Jonathan and Peter Borwein (available on Amazon).

实际上有一整本书专门（除其他外）用于快速计算 \pi 的方法：'Pi and the AGM'，作者 Jonathan 和 Peter Borwein（可在亚马逊上获得）。

I studied the AGM and related algorithms quite a bit: it's quite interesting (though sometimes non-trivial).

我对 AGM 和相关算法进行了相当多的研究：它非常有趣（尽管有时很重要）。

Note that to implement most modern algorithms to compute \pi, you will need a multiprecision arithmetic library (GMPis quite a good choice, though it's been a while since I last used it).

请注意，要实现大多数现代算法来计算 \pi，您将需要一个多精度算术库（GMP是一个不错的选择，尽管我上次使用它已经有一段时间了）。

The time-complexity of the best algorithms is in O(M(n)log(n)), where M(n) is the time-complexity for the multiplication of two n-bit integers (M(n)=O(n log(n) log(log(n))) using FFT-based algorithms, which are usually needed when computing digits of \pi, and such an algorithm is implemented in GMP).

最佳算法的时间复杂度为 O(M(n)log(n))，其中 M(n) 是两个 n 位整数相乘的时间复杂度 (M(n)=O(n) log(n) log(log(n))) 使用基于 FFT 的算法，这通常在计算 \pi 的数字时需要，并且这种算法在 GMP 中实现）。

Note that even though the mathematics behind the algorithms might not be trivial, the algorithms themselves are usually a few lines of pseudo-code, and their implementation is usually very straightforward (if you chose not to write your own multiprecision arithmetic :-) ).

请注意，尽管算法背后的数学可能并不简单，但算法本身通常只有几行伪代码，而且它们的实现通常非常简单（如果您选择不编写自己的多精度算术 :-) ）。

Answer 6

回答by Tilo

The following answers precisely how to do this in the fastest possible way -- with the least computing effort. Even if you don't like the answer, you have to admit that it is indeed the fastest way to get the value of PI.

下面准确地回答了如何以最快的方式执行此操作 - 使用最少的计算工作。即使你不喜欢这个答案，你也不得不承认，这确实是获得 PI 价值的最快方式。

The FASTESTway to get the value of Pi is:

该最快获得Pi值的方法是：

1) chose your favourite programming language 2) load its Math library 3) and find that Pi is already defined there -- ready for use!

1) 选择你最喜欢的编程语言 2) 加载它的数学库 3) 发现 Pi 已经在那里定义了——可以使用了！

In case you don't have a Math library at hand..

如果您手头没有数学库..

The SECOND FASTESTway (more universal solution) is:

在第二快的方式（更通用的解决方案）是：

look up Pi on the Internet, e.g. here:

在 Internet 上查找 Pi，例如：

http://www.eveandersson.com/pi/digits/1000000(1 million digits .. what's your floating point precision? )

http://www.eveandersson.com/pi/digits/1000000（100万位数字......你的浮点精度是多少？）

or here:

或在这里：

http://3.141592653589793238462643383279502884197169399375105820974944592.com/

or here:

或在这里：

http://en.wikipedia.org/wiki/Pi

It's really fast to find the digits you need for whatever precision arithmetic you would like to use, and by defining a constant, you can make sure that you don't waste precious CPU time.

为您想要使用的任何精度算术找到所需的数字真的很快，并且通过定义一个常量，您可以确保不会浪费宝贵的 CPU 时间。

Not only is this a partly humorous answer, but in reality, if anybody would go ahead and compute the value of Pi in a real application .. that would be a pretty big waste of CPU time, wouldn't it? At least I don't see a real application for trying to re-compute this.

这不仅是一个部分幽默的答案，而且在现实中，如果有人继续计算实际应用程序中 Pi 的值......那将是相当大的 CPU 时间浪费，不是吗？至少我没有看到尝试重新计算这个的真正应用程序。

Dear Moderator: please note that the OP asked: "Fastest Way to get the value of PI"

尊敬的版主：请注意OP提问：“获取PI值的最快方式”

Answer 7

回答by Tyler

The BBP formulaallows you to compute the nth digit - in base 2 (or 16) - without having to even bother with the previous n-1 digits first :)

的BBP公式允许你计算第n个数字-在基体2（或16） - ，而无需甚至与前n-1个位打扰第一:)

Answer 8

回答by Tyler

Instead of defining pi as a constant, I always use acos(-1).

我没有将 pi 定义为常量，而是始终使用acos(-1).

Answer 9

回答by Andrea Ambu

This is a "classic" method, very easy to implement. This implementation in python (not the fastest language) does it:

这是一个“经典”的方法，很容易实现。python中的这个实现（不是最快的语言）做到了：

from math import pi
from time import time


precision = 10**6 # higher value -> higher precision
                  # lower  value -> higher speed

t = time()

calc = 0
for k in xrange(0, precision):
    calc += ((-1)**k) / (2*k+1.)
calc *= 4. # this is just a little optimization

t = time()-t

print "Calculated: %.40f" % calc
print "Constant pi: %.40f" % pi
print "Difference: %.40f" % abs(calc-pi)
print "Time elapsed: %s" % repr(t)

You can find more information here.

您可以在此处找到更多信息。

Anyway, the fastest way to get a precise as-much-as-you-want value of pi in python is:

无论如何，在python中获得精确的你想要的pi值的最快方法是：

from gmpy import pi
print pi(3000) # the rule is the same as 
               # the precision on the previous code

Here is the piece of source for the gmpy pi method, I don't think the code is as useful as the comment in this case:

这是 gmpy pi 方法的源代码，在这种情况下，我认为代码不如注释有用：

static char doc_pi[]="\
pi(n): returns pi with n bits of precision in an mpf object\n\
";

/* This function was originally from netlib, package bmp, by
 * Richard P. Brent. Paulo Cesar Pereira de Andrade converted
 * it to C and used it in his LISP interpreter.
 *
 * Original comments:
 * 
 *   sets mp pi = 3.14159... to the available precision.
 *   uses the gauss-legendre algorithm.
 *   this method requires time o(ln(t)m(t)), so it is slower
 *   than mppi if m(t) = o(t**2), but would be faster for
 *   large t if a faster multiplication algorithm were used
 *   (see comments in mpmul).
 *   for a description of the method, see - multiple-precision
 *   zero-finding and the complexity of elementary function
 *   evaluation (by r. p. brent), in analytic computational
 *   complexity (edited by j. f. traub), academic press, 1976, 151-176.
 *   rounding options not implemented, no guard digits used.
*/
static PyObject *
Pygmpy_pi(PyObject *self, PyObject *args)
{
    PympfObject *pi;
    int precision;
    mpf_t r_i2, r_i3, r_i4;
    mpf_t ix;

    ONE_ARG("pi", "i", &precision);
    if(!(pi = Pympf_new(precision))) {
        return NULL;
    }

    mpf_set_si(pi->f, 1);

    mpf_init(ix);
    mpf_set_ui(ix, 1);

    mpf_init2(r_i2, precision);

    mpf_init2(r_i3, precision);
    mpf_set_d(r_i3, 0.25);

    mpf_init2(r_i4, precision);
    mpf_set_d(r_i4, 0.5);
    mpf_sqrt(r_i4, r_i4);

    for (;;) {
        mpf_set(r_i2, pi->f);
        mpf_add(pi->f, pi->f, r_i4);
        mpf_div_ui(pi->f, pi->f, 2);
        mpf_mul(r_i4, r_i2, r_i4);
        mpf_sub(r_i2, pi->f, r_i2);
        mpf_mul(r_i2, r_i2, r_i2);
        mpf_mul(r_i2, r_i2, ix);
        mpf_sub(r_i3, r_i3, r_i2);
        mpf_sqrt(r_i4, r_i4);
        mpf_mul_ui(ix, ix, 2);
        /* Check for convergence */
        if (!(mpf_cmp_si(r_i2, 0) && 
              mpf_get_prec(r_i2) >= (unsigned)precision)) {
            mpf_mul(pi->f, pi->f, r_i4);
            mpf_div(pi->f, pi->f, r_i3);
            break;
        }
    }

    mpf_clear(ix);
    mpf_clear(r_i2);
    mpf_clear(r_i3);
    mpf_clear(r_i4);

    return (PyObject*)pi;
}

EDIT:I had some problems with cut and paste and indentation, you can find the source here.

编辑：我在剪切、粘贴和缩进方面遇到了一些问题，你可以在这里找到源代码。

Answer 10

回答by Michiel de Mare

If by fastest you mean fastest to type in the code, here's the golfscriptsolution:

如果最快的意思是输入代码最快，这里是Golfscript解决方案：

;''6666,-2%{2+.2/@*\/10.3??2*+}*`1000<~\;

performance 获得 π 值的最快方法是什么？

提问by Chris Jester-Young

采纳答案by nlucaroni

回答by Pat

回答by Leon Bambrick

回答by jon-hanson

回答by OysterD

回答by Tilo

回答by Tyler

回答by Tyler

回答by Andrea Ambu

回答by Michiel de Mare

相关推荐

最近更新

标签

performance 获得 π 值的最快方法是什么？

提问by Chris Jester-Young

采纳答案by nlucaroni

回答by Pat

回答by Leon Bambrick

回答by jon-hanson

回答by OysterD

回答by Tilo

回答by Tyler

回答by Tyler

回答by Andrea Ambu

回答by Michiel de Mare

相关推荐

visual-studio VB.NET - RichTextBox - 将格式应用于所选文本

visual-studio 任何人都知道 Visual Studio 的任何（免费/开源）VI 集成？

visual-studio Visual Studio 解决方案/多项目：如何在多个 C++ 项目之间有效地传播项目属性

visual-studio 使用 Visual Studio 2008 组装、链接、调试和执行 MASM 6.11 汇编代码

相关推荐

最近更新

标签