Disclaimer: this page reproduces a popular Stack Overflow question and its answers, licensed under CC BY-SA 4.0. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/27598078/

Float and double datatype in Java

Tags: java, floating-point, double, ieee-754

Asked by Leo

The float data type is a single-precision 32-bit IEEE 754 floating point and the double data type is a double-precision 64-bit IEEE 754 floating point.

What does it mean? And when should I use float instead of double or vice-versa?

Accepted answer by Makoto

The Wikipedia page on it is a good place to start.

To sum up:

  • float is represented in 32 bits, with 1 sign bit, 8 bits of exponent, and 23 bits of significand (roughly the digits that follow the leading digit in scientific notation: in 2.33728*10^12, 33728 plays that role).

  • double is represented in 64 bits, with 1 sign bit, 11 bits of exponent, and 52 bits of significand.

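As a rough illustration of what those significand widths mean in practice, the sketch below checks the largest run of consecutive integers each type can hold exactly. It relies on one detail not stated above: normalized IEEE 754 values carry an implicit leading bit, so float has 24 bits of precision and double has 53. The class and method names are made up for this example.

```java
class SignificandLimits {
    // 23 stored significand bits plus 1 implicit leading bit give 24 bits of precision.
    static final int FLOAT_EXACT_LIMIT = 1 << 24;     // 16777216
    // 52 stored significand bits plus 1 implicit leading bit give 53 bits of precision.
    static final long DOUBLE_EXACT_LIMIT = 1L << 53;  // 9007199254740992

    // True if the integer just above the limit rounds back onto the limit as a float.
    static boolean floatDropsNextInteger() {
        return (float) (FLOAT_EXACT_LIMIT + 1) == (float) FLOAT_EXACT_LIMIT;
    }

    // Same check for double, using long arithmetic to avoid int overflow.
    static boolean doubleDropsNextInteger() {
        return (double) (DOUBLE_EXACT_LIMIT + 1) == (double) DOUBLE_EXACT_LIMIT;
    }

    public static void main(String[] args) {
        System.out.println(Float.SIZE + " bits, " + Double.SIZE + " bits"); // 32 bits, 64 bits
        System.out.println(floatDropsNextInteger());   // true
        System.out.println(doubleDropsNextInteger());  // true
    }
}
```

Both checks print true: 16777217 cannot be stored in a float, and 9007199254740993 cannot be stored in a double.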

By default, Java uses double to represent its floating-point numerals (so a literal 3.14 is typed double). It is also the data type that gives you a much larger number range, so I would strongly encourage its use over float.

There may be certain libraries that actually force you to use float, but in general, unless you can guarantee that your result will be small enough to fit in float's prescribed range, it's best to opt for double.

If you require accuracy (for instance, you can't have a decimal value that is inaccurate, like 1/10 + 2/10, or you're doing anything with currency, for example representing $10.33 in the system), then use a BigDecimal, which can support an arbitrary amount of precision and handle situations like that elegantly.

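A minimal sketch of the 1/10 + 2/10 case, comparing plain double arithmetic with BigDecimal. The class and method names are hypothetical; the BigDecimal values are built from strings, because constructing them from double literals would carry the binary rounding error along.

```java
import java.math.BigDecimal;

class CurrencySum {
    // Binary floating point cannot represent 0.1 or 0.2 exactly.
    static double sumWithDouble() {
        return 0.1 + 0.2;
    }

    // BigDecimal built from strings keeps the decimal digits exact.
    static BigDecimal sumWithBigDecimal() {
        return new BigDecimal("0.1").add(new BigDecimal("0.2"));
    }

    public static void main(String[] args) {
        System.out.println(sumWithDouble());      // 0.30000000000000004
        System.out.println(sumWithBigDecimal());  // 0.3

        // Three items at $10.33 each, computed without rounding error.
        BigDecimal price = new BigDecimal("10.33");
        System.out.println(price.multiply(BigDecimal.valueOf(3))); // 30.99
    }
}
```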

Answer by Henry

A float gives you approx. 6-7 decimal digits of precision, while a double gives you approx. 15-16. The range of numbers is also larger for double.

A double needs 8 bytes of storage space while a float needs just 4 bytes.

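To see those digit counts and sizes directly, here is a small sketch that computes one third in each type. The class and method names are invented for illustration; Float.BYTES and Double.BYTES assume Java 8 or later.

```java
class PrecisionDemo {
    // One third computed entirely in single precision.
    static float thirdAsFloat() {
        return 1.0f / 3.0f;
    }

    // One third computed entirely in double precision.
    static double thirdAsDouble() {
        return 1.0 / 3.0;
    }

    public static void main(String[] args) {
        System.out.println(Float.BYTES + " vs " + Double.BYTES); // 4 vs 8
        System.out.println(thirdAsFloat());   // 0.33333334
        System.out.println(thirdAsDouble());  // 0.3333333333333333
    }
}
```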

Answer by Ye Win

Floating-point numbers, also known as real numbers, are used when evaluating expressions that require fractional precision. For example, calculations such as square root, or transcendentals such as sine and cosine, result in a value whose precision requires a floating-point type. Java implements the standard (IEEE 754) set of floating-point types and operators. There are two kinds of floating-point types, float and double, which represent single- and double-precision numbers, respectively. Their width and ranges are shown here:



    Name     Width in Bits   Approximate Range
    double   64              4.9e-324 to 1.8e+308
    float    32              1.4e-045 to 3.4e+038
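
The exact limits can be read from constants on the Float and Double wrapper classes. The sketch below prints the smallest positive and the largest finite value of each type; the class and method names are made up for this example.

```java
class FloatingRanges {
    // Smallest positive value to largest finite value of float.
    static String floatRange() {
        return Float.MIN_VALUE + " to " + Float.MAX_VALUE;
    }

    // Smallest positive value to largest finite value of double.
    static String doubleRange() {
        return Double.MIN_VALUE + " to " + Double.MAX_VALUE;
    }

    public static void main(String[] args) {
        System.out.println("float:  " + floatRange());   // float:  1.4E-45 to 3.4028235E38
        System.out.println("double: " + doubleRange());  // double: 4.9E-324 to 1.7976931348623157E308
    }
}
```

Note that MIN_VALUE is the smallest positive magnitude, not the most negative number; the most negative finite value is -MAX_VALUE.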


float


The type float specifies a single-precision value that uses 32 bits of storage. Single precision is faster on some processors and takes half as much space as double precision, but will become imprecise when the values are either very large or very small. Variables of type float are useful when you need a fractional component, but don't require a large degree of precision.

Here are some example float variable declarations:

float hightemp, lowtemp;


double


Double precision, as denoted by the double keyword, uses 64 bits to store a value. Double precision is actually faster than single precision on some modern processors that have been optimized for high-speed mathematical calculations. The standard math functions, such as sin(), cos(), and sqrt(), return double values. When you need to maintain accuracy over many iterative calculations, or are manipulating large-valued numbers, double is the best choice.

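One practical consequence of the math functions returning double: if you keep your data in float, each call needs an explicit narrowing cast on the way back. A small sketch, with a hypothetical helper name:

```java
class MathReturnsDouble {
    // Math.sqrt takes and returns double. The float argument is widened
    // automatically, but the result must be narrowed back with a cast.
    static float hypotenuse(float a, float b) {
        return (float) Math.sqrt(a * a + b * b);
    }

    public static void main(String[] args) {
        System.out.println(hypotenuse(3f, 4f)); // 5.0
    }
}
```

Without the cast, the return statement would not compile, because a double cannot be assigned to a float implicitly.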

Answer by Rubal

According to the IEEE standards, float is a 32 bit representation of a real number while double is a 64 bit representation.

In Java programs we mostly see the double data type used. This is to avoid overflows, since the range of numbers that double can accommodate is larger than the range of float.

Also, when high precision is required, the use of double is encouraged. A few library methods that were implemented a long time ago still require the float data type (only because they were implemented using float, nothing else!).

But if you are certain that your program only needs small numbers and that no overflow will occur with float, then using float will noticeably reduce memory use, since a float requires half the memory of a double.

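A short sketch of both sides of that trade-off. The class and method names are hypothetical, and the memory figure counts only the element payload of an array, ignoring the JVM's object header.

```java
class FloatOverflow {
    // Multiplies in single precision; results beyond Float.MAX_VALUE become Infinity.
    static float scaleAsFloat(float value, float factor) {
        return value * factor;
    }

    // Same multiplication in double precision, which has far more headroom.
    static double scaleAsDouble(double value, double factor) {
        return value * factor;
    }

    // Bytes needed for the elements of an array (header overhead not included).
    static long payloadBytes(int elements, int bytesPerElement) {
        return (long) elements * bytesPerElement;
    }

    public static void main(String[] args) {
        System.out.println(Float.isInfinite(scaleAsFloat(3.0e38f, 10f)));   // true
        System.out.println(Double.isInfinite(scaleAsDouble(3.0e38, 10.0))); // false
        System.out.println(payloadBytes(1_000_000, Float.BYTES));           // 4000000
        System.out.println(payloadBytes(1_000_000, Double.BYTES));          // 8000000
    }
}
```

Floating-point overflow does not throw an exception in Java; the value silently becomes Infinity, so it is worth checking with Float.isInfinite when the inputs can be large.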

Answer by Raymond Wachaga

Java seems to have a bias towards using double for computations nonetheless.

Case in point: in a program I wrote earlier today, the methods didn't work when I used float, but they work great now that I have replaced float with double (in the NetBeans IDE):

package palettedos;
import java.util.*;

class Palettedos{
    private static Scanner Z = new Scanner(System.in);
    public static final double pi = 3.142;

    public static void main(String[]args){
        Palettedos A = new Palettedos();
        System.out.println("Enter the base and height of the triangle respectively");
        int base = Z.nextInt();
        int height = Z.nextInt();
        System.out.println("Enter the radius of the circle");
        int radius = Z.nextInt();
        System.out.println("Enter the length of the square");
        long length = Z.nextInt();
        double tArea = A.calculateArea(base, height);
        double cArea = A.calculateArea(radius);
        long sqArea = A.calculateArea(length);
        System.out.println("The area of the triangle is\t" + tArea);
        System.out.println("The area of the circle is\t" + cArea);
        System.out.println("The area of the square is\t" + sqArea);
    }

    double calculateArea(int base, int height){
        double triArea = 0.5*base*height;
        return triArea;
    }

    double calculateArea(int radius){
        double circArea = pi*radius*radius;
        return circArea;
    }

    long calculateArea(long length){
        long squaArea = length*length;
        return squaArea;
    }
}
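
The answer does not say how the float version failed. One plausible cause, assumed here and not confirmed by the author, is a compile error: literals such as 0.5 and 3.142 are double, so expressions built from them are double and cannot be assigned to a float without a cast. Under that assumption, a float variant compiles once the literals carry the f suffix. The class and method names below are hypothetical.

```java
class PaletteFloat {
    // The f suffix makes this a float literal; plain 3.142 would be a double.
    static final float PI = 3.142f;

    // 0.5f keeps the whole expression in float, so no cast is needed.
    static float triangleArea(int base, int height) {
        return 0.5f * base * height;
    }

    static float circleArea(int radius) {
        return PI * radius * radius;
    }

    public static void main(String[] args) {
        System.out.println(triangleArea(3, 4)); // 6.0
        System.out.println(circleArea(1));      // 3.142
    }
}
```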

Answer by acohen

This example illustrates how to extract the sign (the leftmost bit), exponent (the 8 following bits) and mantissa (the 23 rightmost bits) from a float in Java.

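// Raw IEEE 754 bit pattern of the float, as an int
int bits = Float.floatToIntBits(-0.005f);
// Sign: the leftmost bit (1 means negative)
int sign = bits >>> 31;
// Exponent: the next 8 bits, minus the bias of 127
int exp = (bits >>> 23 & ((1 << 8) - 1)) - ((1 << 7) - 1);
// Mantissa: the rightmost 23 bits
int mantissa = bits & ((1 << 23) - 1);
// Print the fields, then reassemble them to recover the original value
System.out.println(sign + " " + exp + " " + mantissa + " " +
  Float.intBitsToFloat((sign << 31) | (exp + ((1 << 7) - 1)) << 23 | mantissa));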

The same approach can be used for doubles (11-bit exponent and 52-bit mantissa).

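// Raw IEEE 754 bit pattern of the double, as a long
long bits = Double.doubleToLongBits(-0.005);
// Sign: the leftmost bit (1 means negative)
long sign = bits >>> 63;
// Exponent: the next 11 bits, minus the bias of 1023
long exp = (bits >>> 52 & ((1 << 11) - 1)) - ((1 << 10) - 1);
// Mantissa: the rightmost 52 bits
long mantissa = bits & ((1L << 52) - 1);
// Print the fields, then reassemble them to recover the original value
System.out.println(sign + " " + exp + " " + mantissa + " " +
  Double.longBitsToDouble((sign << 63) | (exp + ((1 << 10) - 1)) << 52 | mantissa));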

Credit: http://s-j.github.io/java-float/

Answer by Himanshu Singh

This will give an error:

public class MyClass {
    public static void main(String args[]) {
        float a = 0.5;
    }
}

/MyClass.java:3: error: incompatible types: possible lossy conversion from double to float
        float a = 0.5;

This will work perfectly fine:

public class MyClass {
    public static void main(String args[]) {
        double a = 0.5;
    }
}

This will also work perfectly fine:

public class MyClass {
    public static void main(String args[]) {
        float a = (float)0.5;
    }
}

Reason: by default, Java treats a floating-point literal such as 0.5 as a double, to ensure higher precision.

A double takes more space but is more precise during computation, while a float takes less space but is less precise.

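Besides the cast, Java also lets you mark a literal as float with an f (or F) suffix, which is the more common way to write it. The sketch below uses made-up class and method names and also shows the precision difference for a value that has no exact binary representation.

```java
class FloatLiterals {
    // The f suffix makes the literal a float, so no cast is needed.
    static float half() {
        return 0.5f;
    }

    static float tenth() {
        return 0.1f;
    }

    public static void main(String[] args) {
        // 0.5 is exactly representable in binary, so float and double agree.
        System.out.println(half() == 0.5);   // true
        // 0.1 is not, and the float approximation is coarser than the double one.
        System.out.println(tenth() == 0.1);  // false
        System.out.println((double) tenth()); // 0.10000000149011612
    }
}
```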

Answer by boi yeet

You should use double instead of float for precise calculations, and float instead of double when less accuracy is acceptable. A float holds an IEEE 754 single-precision floating-point number, while a double holds a double-precision one, which makes it possible to store and compute numbers more accurately. Hope this helps.