Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and authors, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/523531/

Time: 2020-10-29 12:44:33  Source: igfitidea

Fast transcendental / trigonometric functions for Java

Tags: java, optimization, math, trigonometry

Asked by Hans-Peter Störr

Since the trigonometric functions in java.lang.Math are quite slow: is there a library that does a quick and good approximation? It seems possible to do a calculation several times faster without losing much precision. (On my machine a multiplication takes 1.5ns, and java.lang.Math.sin 46ns to 116ns). Unfortunately there is not yet a way to use the hardware functions.

UPDATE: The functions should be accurate enough, say, for GPS calculations. That means you would need at least 7 decimal digits accuracy, which rules out simple lookup tables. And it should be much faster than java.lang.Math.sin on your basic x86 system. Otherwise there would be no point in it.

For values over pi/4, Java does some expensive computations in addition to the hardware functions. It does so for a good reason, but sometimes you care more about speed than about last-bit accuracy.

Accepted answer by Darius Bacon

Computer Approximations by Hart. It tabulates Chebyshev-economized approximation formulas for a bunch of functions at different precisions.

Edit: Getting my copy off the shelf, it turned out to be a different book that just sounds very similar. Here's a sin function using its tables. (Tested in C since that's handier for me.) I don't know if this will be faster than the Java built-in, but it's guaranteed to be less accurate, at least. :) You may need to range-reduce the argument first; see John Cook's suggestions. The book also has arcsin and arctan.

#include <math.h>
#include <stdio.h>

// Return an approx to sin(pi/2 * x) where -1 <= x <= 1.
// In that range it has a max absolute error of 5e-9
// according to Hastings, Approximations For Digital Computers.
static double xsin (double x) {
  double x2 = x * x;
  return ((((.00015148419 * x2
             - .00467376557) * x2
            + .07968967928) * x2
           - .64596371106) * x2
          + 1.57079631847) * x;
}

int main () {
  double pi = 4 * atan (1);
  printf ("%.10f\n", xsin (0.77));
  printf ("%.10f\n", sin (0.77 * (pi/2)));
  return 0;
}
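
Since the question is about Java, here is a minimal sketch of how the C function above might be ported, together with the range reduction the answer mentions. The coefficients are copied from the C listing; the class name HastingsSin and the wrapper fastSin are hypothetical names introduced only for this illustration, and the reduction shown is one simple possibility that is only reasonable for moderately sized arguments.

```java
class HastingsSin {
    // Approximates sin(pi/2 * x) for -1 <= x <= 1.
    // Coefficients copied from the C listing above.
    static double xsin(double x) {
        double x2 = x * x;
        return ((((.00015148419 * x2
                   - .00467376557) * x2
                  + .07968967928) * x2
                 - .64596371106) * x2
                + 1.57079631847) * x;
    }

    // Hypothetical wrapper (not from the answer): accepts radians and
    // folds the angle into the range that xsin expects.
    static double fastSin(double radians) {
        double t = radians * (2.0 / Math.PI);  // angle measured in quarter turns
        t -= 4.0 * Math.rint(t / 4.0);         // drop whole turns: t is now in [-2, 2]
        if (t > 1.0) {
            t = 2.0 - t;                       // sin(pi - a) == sin(a)
        } else if (t < -1.0) {
            t = -2.0 - t;                      // sin(-pi - a) == sin(a)
        }
        return xsin(t);
    }

    public static void main(String[] args) {
        // Report the largest deviation from Math.sin on a small grid.
        double maxErr = 0.0;
        for (int i = -1000; i <= 1000; i++) {
            double a = i * 0.01;
            maxErr = Math.max(maxErr, Math.abs(fastSin(a) - Math.sin(a)));
        }
        System.out.println("max abs error on [-10, 10]: " + maxErr);
    }
}
```

Whether this beats Math.sin on a given JVM would still have to be measured; the sketch only shows the structure.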

Answer by finnw

Here is a collection of low-level tricks for quickly approximating trig functions. There is example code in C which I find hard to follow, but the techniques are just as easily implemented in Java.

Here's my equivalent implementation of invsqrt and atan2 in Java.

I could have done something similar for the other trig functions, but I have not found it necessary as profiling showed that only sqrt and atan/atan2 were major bottlenecks.

public class FastTrig
{
  /** Fast approximation of 1.0 / sqrt(x).
   * See <a href="http://www.beyond3d.com/content/articles/8/">http://www.beyond3d.com/content/articles/8/</a>
   * @param x Positive value to estimate inverse of square root of
   * @return Approximately 1.0 / sqrt(x)
   **/
  public static double
  invSqrt(double x)
  {
    double xhalf = 0.5 * x; 
    long i = Double.doubleToRawLongBits(x);
    i = 0x5FE6EB50C7B537AAL - (i>>1); 
    x = Double.longBitsToDouble(i);
    x = x * (1.5 - xhalf*x*x); 
    return x; 
  }

  /** Approximation of arctangent.
   *  Slightly faster and substantially less accurate than
   *  {@link Math#atan2(double, double)}.
   **/
  public static double fast_atan2(double y, double x)
  {
    double d2 = x*x + y*y;

    // Bail out if d2 is NaN, zero or subnormal
    if (Double.isNaN(d2) ||
        (Double.doubleToRawLongBits(d2) < 0x10000000000000L))
    {
      return Double.NaN;
    }

    // Normalise such that 0.0 <= y <= x
    boolean negY = y < 0.0;
    if (negY) {y = -y;}
    boolean negX = x < 0.0;
    if (negX) {x = -x;}
    boolean steep = y > x;
    if (steep)
    {
      double t = x;
      x = y;
      y = t;
    }

    // Scale to unit circle (0.0 <= y <= x <= 1.0)
    double rinv = invSqrt(d2); // rinv ≈ 1.0 / hypot(x, y)
    x *= rinv; // x ≈ cos θ
    y *= rinv; // y ≈ sin θ, hence θ ≈ asin y

    // Hack: we want: ind = floor(y * 256)
    // We deliberately force truncation by adding floating-point numbers whose
    // exponents differ greatly.  The FPU will right-shift y to match exponents,
    // dropping all but the first 9 significant bits, which become the 9 LSBs
    // of the resulting mantissa.
    // Inspired by a similar piece of C code at
    // http://www.shellandslate.com/computermath101.html
    double yp = FRAC_BIAS + y;
    int ind = (int) Double.doubleToRawLongBits(yp);

    // Find φ (a first approximation of θ) from the LUT
    double φ = ASIN_TAB[ind];
    double cφ = COS_TAB[ind]; // cos(φ)

    // sin(φ) == ind / 256.0
    // Note that sφ is truncated, hence not identical to y.
    double sφ = yp - FRAC_BIAS;
    double sd = y * cφ - x * sφ; // sin(θ-φ) ≡ sinθ cosφ - cosθ sinφ

    // asin(sd) ≈ sd + sd³/6 (from first 2 terms of Maclaurin series)
    double d = (6.0 + sd * sd) * sd * ONE_SIXTH;
    double θ = φ + d;

    // Translate back to correct octant
    if (steep) { θ = Math.PI * 0.5 - θ; }
    if (negX) { θ = Math.PI - θ; }
    if (negY) { θ = -θ; }

    return θ;
  }

  private static final double ONE_SIXTH = 1.0 / 6.0;
  private static final int FRAC_EXP = 8; // LUT precision == 2 ** -8 == 1/256
  private static final int LUT_SIZE = (1 << FRAC_EXP) + 1;
  private static final double FRAC_BIAS =
    Double.longBitsToDouble((0x433L - FRAC_EXP) << 52);
  private static final double[] ASIN_TAB = new double[LUT_SIZE];
  private static final double[] COS_TAB = new double[LUT_SIZE];

  static
  {
    /* Populate trig tables */
    for (int ind = 0; ind < LUT_SIZE; ++ ind)
    {
      double v = ind / (double) (1 << FRAC_EXP);
      double asinv = Math.asin(v);
      COS_TAB[ind] = Math.cos(asinv);
      ASIN_TAB[ind] = asinv;
    }
  }
}
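
The table-index step in fast_atan2 may be the least obvious part, so here is a small standalone sketch of it. FRAC_EXP and FRAC_BIAS are built the same way as in the class above (the bias works out to 2^44, where one unit in the last place of a double is 1/256); the class name BiasTrickDemo and the method quantize are hypothetical names for this illustration. One detail worth noting: under Java's default round-to-nearest arithmetic the addition rounds y to the nearest multiple of 1/256 rather than truncating it, so the index can reach 256, which the 257-entry tables allow for.

```java
class BiasTrickDemo {
    static final int FRAC_EXP = 8;  // table resolution is 2^-8 == 1/256
    // Same construction as FRAC_BIAS above: the double value 2^44.
    static final double FRAC_BIAS =
        Double.longBitsToDouble((0x433L - FRAC_EXP) << 52);

    // Returns y quantized to a multiple of 1/256, as an integer index.
    // Assumes 0.0 <= y <= 1.0.
    static int quantize(double y) {
        double yp = FRAC_BIAS + y;                    // low mantissa bits now hold the index
        return (int) Double.doubleToRawLongBits(yp);  // the cast keeps the low 32 bits
    }

    public static void main(String[] args) {
        double[] samples = {0.0, 0.1, 0.5, 0.7071, 1.0};
        for (double y : samples) {
            System.out.println(y + " -> index " + quantize(y)
                + ", y * 256 = " + (y * 256));
        }
    }
}
```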

Answer by Joe

On x86, the java.lang.Math sin and cos functions do not directly call the hardware functions because Intel didn't always do such a good job implementing them. There is a nice explanation in bug #4857011.

http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4857011

You might want to think hard about accepting an inexact result. It's amusing how often I spend time finding this in others' code.

"But the comment says Sin..."

Answer by John D. Cook

I'm surprised that the built-in Java functions would be so slow. Surely the JVM is calling the native trig functions on your CPU, not implementing the algorithms in Java. Are you certain your bottleneck is calls to trig functions and not some surrounding code? Maybe some memory allocations?

Could you rewrite in C++ the part of your code that does the math? Just calling C++ code to compute trig functions probably wouldn't speed things up, but moving some context too, like an outer loop, to C++ might speed things up.

If you must roll your own trig functions, don't use Taylor series alone. The CORDIC algorithms are much faster unless your argument is very small. You could use CORDIC to get started, then polish the result with a short Taylor series. See this StackOverflow question on how to implement trig functions.
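
To make the "CORDIC first, then a short Taylor series" idea concrete, here is a rough sketch under several assumptions: it works in double precision with ordinary multiplications (classic CORDIC uses fixed-point shifts and adds), it only accepts angles in [-pi/2, pi/2], and the class name CordicSketch, the method sinCos and the iteration count of 12 are all choices made for this illustration. It shows the structure of the technique and makes no claim about speed on a JVM.

```java
class CordicSketch {
    private static final int ITERATIONS = 12;
    private static final double[] ANGLES = new double[ITERATIONS]; // atan(2^-i)
    private static final double GAIN_INV;  // 1 / product of sqrt(1 + 2^-2i)

    static {
        double gain = 1.0;
        for (int i = 0; i < ITERATIONS; i++) {
            double p = Math.scalb(1.0, -i);  // 2^-i
            ANGLES[i] = Math.atan(p);
            gain *= Math.sqrt(1.0 + p * p);
        }
        GAIN_INV = 1.0 / gain;
    }

    /** Returns {sin(theta), cos(theta)} for theta in [-pi/2, pi/2]. */
    static double[] sinCos(double theta) {
        double x = GAIN_INV;  // pre-scaled so the final vector has unit length
        double y = 0.0;
        double z = theta;     // angle still to be rotated through
        double p = 1.0;       // 2^-i
        for (int i = 0; i < ITERATIONS; i++) {
            double nx, ny;
            if (z >= 0) { nx = x - y * p; ny = y + x * p; z -= ANGLES[i]; }
            else        { nx = x + y * p; ny = y - x * p; z += ANGLES[i]; }
            x = nx;
            y = ny;
            p *= 0.5;
        }
        // Now (x, y) ~ (cos, sin) of (theta - z) with |z| small.
        // Polish with short Taylor series in the residual z.
        double sz = z - z * z * z / 6.0;  // sin z
        double cz = 1.0 - z * z / 2.0;    // cos z
        return new double[] { y * cz + x * sz, x * cz - y * sz };
    }

    public static void main(String[] args) {
        double[] sc = sinCos(0.5);
        System.out.println("sin: " + sc[0] + " vs " + Math.sin(0.5));
        System.out.println("cos: " + sc[1] + " vs " + Math.cos(0.5));
    }
}
```

The polish step uses the angle-addition identities, so a residual that would otherwise need many more CORDIC iterations is absorbed by two short polynomials.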

Answer by Pierre

You could pre-store your sin and cos in an array if you only need some approximate values. For example, if you want to store the values from 0° to 360°:

// Precompute sin for every whole degree from 0 to 359.
double[] sin = new double[360];
for (int i = 0; i < sin.length; ++i) sin[i] = Math.sin(i / 180.0 * Math.PI);

You then index this array with whole degrees (an int) instead of passing radians (a double).
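
As a usage sketch, the table can be wrapped so that negative or large degree values are folded back into 0..359 and cos is read from the same table. The class name SinTable and the methods sinDeg and cosDeg are hypothetical names for this illustration. Note that this only resolves whole degrees, so on its own it would not meet the 7-digit accuracy the question asks for.

```java
class SinTable {
    private static final double[] SIN = new double[360];

    static {
        // Precompute sin for every whole degree from 0 to 359.
        for (int i = 0; i < SIN.length; ++i) {
            SIN[i] = Math.sin(Math.toRadians(i));
        }
    }

    /** sin of a whole number of degrees; any int is wrapped into 0..359. */
    static double sinDeg(int degrees) {
        return SIN[Math.floorMod(degrees, 360)];
    }

    /** cos via the identity cos(a) == sin(a + 90 degrees). */
    static double cosDeg(int degrees) {
        return SIN[Math.floorMod(degrees + 90, 360)];
    }

    public static void main(String[] args) {
        System.out.println("sin(30 deg)  = " + sinDeg(30));
        System.out.println("sin(-90 deg) = " + sinDeg(-90));
        System.out.println("cos(60 deg)  = " + cosDeg(60));
    }
}
```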

Answer by axelclk

Trigonometric functions are the classical example for a lookup table. See the excellent

If you're looking for a library for J2ME, you can try:

  • the Fixed Point Integer Math Library MathFP

Answer by patros

I haven't heard of any libs, probably because trig-heavy Java apps are rare. It's also easy enough to roll your own with JNI (same precision, better performance), numerical methods (variable precision / performance) or a simple approximation table.

As with any optimization, best to test that these functions are actually a bottleneck before bothering to reinvent the wheel.
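
A quick way to get a first impression is a hand-rolled timing loop like the sketch below. The class name TrigTiming, the method nanosPerSin and the loop sizes are arbitrary choices for this illustration, and such loops are easily distorted by JIT warm-up and dead-code elimination, so a harness such as JMH would give more trustworthy numbers.

```java
class TrigTiming {
    /** Rough average nanoseconds per Math.sin call over n calls. */
    static double nanosPerSin(int n) {
        double sink = 0.0;  // accumulate results so the calls cannot be optimized away
        long start = System.nanoTime();
        for (int i = 0; i < n; i++) {
            sink += Math.sin(i * 0.001);
        }
        long elapsed = System.nanoTime() - start;
        if (Double.isNaN(sink)) {
            System.out.println("unexpected NaN");  // keeps 'sink' observably used
        }
        return (double) elapsed / n;
    }

    public static void main(String[] args) {
        nanosPerSin(5_000_000);  // warm-up pass so the JIT has compiled the loop
        System.out.printf("Math.sin: about %.1f ns per call%n", nanosPerSin(5_000_000));
    }
}
```

Timing the whole application with a profiler first, as the answer suggests, would show whether the trig calls matter at all.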

Answer by Peter Lawrey

The java.lang.Math functions call the hardware functions. There should be simple approximations you can make, but they won't be as accurate.

On my laptop, sin and cos take about 144 ns.

Answer by Peter Lawrey

In the sin/cos test, I evaluated the functions for the integers from zero to one million. I assume that 144 ns is not fast enough for you.

Do you have a specific requirement for the speed you need?

Can you quantify your requirement in terms of a satisfactory time per operation?