What is the fastest way to get the value of π?

Posted by e4yzc0pl on 2022-09-21 in Unix


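I'm looking for the fastest way to obtain the value of π, as a personal challenge. More specifically, I'm using ways that don't involve #define constants like M_PI, or hard-coding the number in.

The program below tests the various ways I know of. The inline assembly version is, in theory, the fastest option, though clearly not portable. I've included it as a baseline to compare the other versions against. In my tests, with built-ins enabled, the 4 * atan(1) version is fastest on GCC 4.2, because it auto-folds atan(1) into a constant. With -fno-builtin specified, the atan2(0, -1) version is fastest.

Here's the main testing program (pitimes.c):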


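#include <math.h>
#include <stdio.h>
#include <time.h>

#define ITERS 10000000

/* Evaluate x ITERS times, accumulate its deviation from M_PI, and print
 * the accumulated deviation together with the elapsed CPU time. */
#define TESTWITH(x) {                                                       \
    diff = 0.0;                                                             \
    time1 = clock();                                                        \
    for (i = 0; i < ITERS; ++i)                                             \
        diff += (x) - M_PI;                                                 \
    time2 = clock();                                                        \
    printf("%s\t=> %e, time => %f\n", #x, diff, diffclock(time2, time1));   \
}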

static inline double
diffclock(clock_t time1, clock_t time0)
{
    return (double) (time1 - time0) / CLOCKS_PER_SEC;
}

int
main()
{
    int i;
    clock_t time1, time2;
    double diff;

    /* Warmup. The atan2 case catches GCC's atan folding (which would
     * optimise the ``4 * atan(1) - M_PI'' to a no-op), if -fno-builtin
     * is not used. */
    TESTWITH(4 * atan(1))
    TESTWITH(4 * atan2(1, 1))

#if defined(__GNUC__) && (defined(__i386__) || defined(__amd64__))
    extern double fldpi();
    TESTWITH(fldpi())
#endif

    /* Actual tests start here. */
    TESTWITH(atan2(0, -1))
    TESTWITH(acos(-1))
    TESTWITH(2 * asin(1))
    TESTWITH(4 * atan2(1, 1))
    TESTWITH(4 * atan(1))

    return 0;
}

And the inline assembly part (fldpi.c), which only works on x86 and x64 systems:

double
fldpi()
{
    double pi;
    asm("fldpi" : "=t" (pi));
    return pi;
}

And a build script that builds all the configurations I'm testing (build.sh):

#!/bin/sh

gcc -O3 -Wall -c           -m32 -o fldpi-32.o fldpi.c
gcc -O3 -Wall -c           -m64 -o fldpi-64.o fldpi.c

gcc -O3 -Wall -ffast-math  -m32 -o pitimes1-32 pitimes.c fldpi-32.o
gcc -O3 -Wall              -m32 -o pitimes2-32 pitimes.c fldpi-32.o -lm
gcc -O3 -Wall -fno-builtin -m32 -o pitimes3-32 pitimes.c fldpi-32.o -lm
gcc -O3 -Wall -ffast-math  -m64 -o pitimes1-64 pitimes.c fldpi-64.o -lm
gcc -O3 -Wall              -m64 -o pitimes2-64 pitimes.c fldpi-64.o -lm
gcc -O3 -Wall -fno-builtin -m64 -o pitimes3-64 pitimes.c fldpi-64.o -lm

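Apart from testing with different compiler flags (I compared 32-bit against 64-bit too, because the optimizations differ), I also tried changing the order of the tests. Even so, the atan2(0, -1) version comes out on top every time.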
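Before comparing speed, it can help to confirm that every expression in the benchmark really produces the same double. The following is only a minimal sketch: `pi_candidate` and its index numbering are hypothetical names introduced for illustration, and the literal used for checking is a reference value only, not one of the methods being timed.

```c
#include <assert.h>
#include <math.h>

/* Hypothetical helper: returns pi as computed by one of the expressions
 * from the benchmark above. The index order is an arbitrary choice made
 * for this sketch. */
double pi_candidate(int which)
{
    assert(which >= 0 && which < 5);
    switch (which) {
    case 0:  return atan2(0.0, -1.0);
    case 1:  return acos(-1.0);
    case 2:  return 2.0 * asin(1.0);
    case 3:  return 4.0 * atan2(1.0, 1.0);
    default: return 4.0 * atan(1.0);
    }
}
```

Calling `pi_candidate(i)` for i from 0 to 4 and printing each result with `%.17g` shows how closely the variants agree on a given libm.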


Source: https://stackoverflow.com/questions/19/what-is-the-fastest-way-to-get-the-value-of-%cf%80

15 answers


h4cxqtbf1#

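The Monte Carlo method, as mentioned, applies some great concepts, but it is clearly not the fastest, not by a long shot and not by any reasonable measure. It also all depends on what kind of accuracy you are looking for. The fastest π I know of is the one with the digits hard-coded. Looking at Pi and Pi [PDF], there are a lot of formulae.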
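To make the point about Monte Carlo concrete, here is a minimal sketch of the usual quarter-circle estimator. The function names, the xorshift64 generator, and the seed are my own choices for illustration, not anything from the question. The statistical error shrinks only like 1/sqrt(samples), so each extra correct digit costs roughly 100 times more samples.

```c
#include <assert.h>
#include <stdint.h>

/* xorshift64 step: a small deterministic generator so runs are repeatable.
 * (An illustrative choice; any decent uniform generator would do.) */
static uint64_t xorshift64(uint64_t *state)
{
    uint64_t x = *state;
    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    *state = x;
    return x;
}

/* Map the top 53 bits of the generator output to a double in [0, 1). */
static double next_unit(uint64_t *state)
{
    return (double)(xorshift64(state) >> 11) * (1.0 / 9007199254740992.0);
}

/* Estimate pi as 4 times the fraction of random points in the unit square
 * that fall inside the quarter circle of radius 1. */
double monte_carlo_pi(long samples, uint64_t seed)
{
    assert(samples > 0 && seed != 0);   /* xorshift needs a nonzero state */
    uint64_t state = seed;
    long inside = 0;
    for (long i = 0; i < samples; ++i) {
        double x = next_unit(&state);
        double y = next_unit(&state);
        if (x * x + y * y <= 1.0)
            ++inside;
    }
    return 4.0 * (double)inside / (double)samples;
}
```

A call such as `monte_carlo_pi(1000000, 88172645463325252ULL)` typically gets only the first two or three digits right after a million samples, which is why it is not competitive here.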

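Here is a method that converges quickly, about 14 digits per iteration. PiFast, the current fastest application, uses this formula with the FFT. I'll just write the formula, since the code is straightforward. This formula was almost found by Ramanujan and discovered by Chudnovsky. It is actually how he calculated several billion digits of the number, so it isn't a method to disregard. The formula will overflow quickly and, since we are dividing factorials, it would be advantageous to delay such calculations to remove terms.

$$\frac{1}{\pi} = 12 \sum_{k=0}^{\infty} \frac{(-1)^k \,(6k)!\,(A + Bk)}{(3k)!\,(k!)^3\, C^{3k + 3/2}}$$

where,

$$A = 13591409, \quad B = 545140134, \quad C = 640320$$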
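As a rough illustration of cancelling the factorials instead of evaluating them, the sketch below sums the Chudnovsky series in plain doubles, updating each term from the previous one by their ratio. The function name `chudnovsky_pi` is hypothetical, and double precision caps the result at about 16 digits, so this only shows the shape of the computation, not how PiFast does it.

```c
#include <assert.h>
#include <math.h>

/* Sum the first `terms` terms of the Chudnovsky series in double precision.
 * The factorial quotient (6k)! / ((3k)! (k!)^3) is never formed directly:
 * each term is derived from the previous one via the ratio
 *   24 (6k+1)(2k+1)(6k+5) / (k+1)^3,
 * divided by -640320^3, which avoids overflowing intermediate values. */
double chudnovsky_pi(int terms)
{
    assert(terms >= 1);
    const double c_cubed = 262537412640768000.0;  /* 640320^3, exact in a double */
    double scale = 1.0;   /* (-1)^k (6k)! / ((3k)! (k!)^3 640320^(3k)), k = 0 */
    double sum = 0.0;
    for (int k = 0; k < terms; ++k) {
        double kd = (double)k;
        sum += scale * (13591409.0 + 545140134.0 * kd);
        /* Advance scale from term k to term k + 1. */
        scale *= -24.0 * (6.0 * kd + 1.0) * (2.0 * kd + 1.0) * (6.0 * kd + 5.0)
                 / ((kd + 1.0) * (kd + 1.0) * (kd + 1.0) * c_cubed);
    }
    /* 640320^(3/2) / 12 simplifies to 426880 * sqrt(10005). */
    return 426880.0 * sqrt(10005.0) / sum;
}
```

With a single term the result already agrees with π to about 13 digits, and a second term reaches the limit of a double; going further needs arbitrary-precision arithmetic.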

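Below is the Brent–Salamin algorithm. Wikipedia mentions that when a and b are "close enough", (a + b)² / 4t will be an approximation of π. I'm not sure what "close enough" means, but in my tests one iteration gave 2 digits, two gave 7, and three gave 15. Of course this is with doubles, so there may be some error from the representation, and the true calculation could be more accurate.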

let pi_2 iters =
    let rec loop_ a b t p i =
        if i = 0 then a,b,t,p
        else
            let a_n = (a +. b) /. 2.0 
            and b_n = sqrt (a*.b)
            and p_n = 2.0 *. p in
            let t_n = t -. (p *. (a -. a_n) *. (a -. a_n)) in
            loop_ a_n b_n t_n p_n (i - 1)
    in 
    let a,b,t,p = loop_ (1.0) (1.0 /. (sqrt 2.0)) (1.0/.4.0) (1.0) iters in
    (a +. b) *. (a +. b) /. (4.0 *. t)

Lastly, how about some pi golf (800 digits)? 160 characters!

int a=10000,b,c=2800,d,e,f[2801],g;main(){for(;b-c;)f[b++]=a/5;for(;d=0,g=c*2;c-=14,printf("%.4d",e+d/a),e=d%a)for(b=c;d+=f[b]*a,f[b]=d%--g,d/=g--,--b;d*=b);}
