Original question: http://stackoverflow.com/questions/2781086/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Loss of precision - int -> float or double
Asked by stan
I have an exam question I am revising for and the question is for 4 marks.
"In Java we can assign an int to a double or a float". Will this ever lose information, and why?
My answer was that ints are normally of fixed length or size, so the precision for storing data is finite, whereas storing information in floating point can be infinite, so essentially we lose information because of this.
Now I am a little sketchy as to whether or not I am hitting the right areas here. I am very sure it will lose precision, but I can't exactly put my finger on why. Can I get some help, please?
Answered by Dead account
In Java, an int uses 32 bits to represent its value.
In Java a FLOAT uses a 23 bit mantissa, so integers greater than 2^23 will have their least significant bits truncated. For example, 33554435 (or 0x2000003) will be truncated to around 33554432 +/- 4.
In Java a DOUBLE uses a 52 bit mantissa, so it will be able to represent a 32-bit integer without loss of data.
See also "Floating Point" on Wikipedia.
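As a rough illustration of the two cases above, the sketch below widens the same int to float and to double and converts each result back to long for an exact comparison. The class name WideningDemo and both helper names are made up for this example.

```java
class WideningDemo {
    // Widen to float, then convert back to long so the result can be compared exactly.
    static long viaFloat(int value) {
        float f = value;
        return (long) f;
    }

    // Widen to double, then convert back to long.
    static long viaDouble(int value) {
        double d = value;
        return (long) d;
    }

    public static void main(String[] args) {
        int value = 33554435; // 0x2000003, needs 26 significant bits
        System.out.println("int:        " + value);
        System.out.println("via float:  " + viaFloat(value));
        System.out.println("via double: " + viaDouble(value));
    }
}
```

The float route should come back as 33554436, the nearest multiple of 4, because conversion rounds to nearest rather than truncating; the double route returns the original value.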
Answered by Adam Batkin
There are two reasons that assigning an int to a double or a float might lose precision:
- There are certain numbers that just can't be represented as a double/float, so they end up approximated
- Large integer numbers may contain too much precision in the least-significant digits
Answered by Michael Borgwardt
No, float and double are fixed-length too - they just use their bits differently. Read more about how exactly they work in the Floating-Point Guide.
Basically, you cannot lose precision when assigning an int to a double, because double has a 52-bit mantissa (53 bits of precision counting the implicit leading bit), which is enough to hold all int values. But float only has a 23-bit mantissa (24 bits of precision), so it cannot exactly represent all int values that are larger than about 2^24.
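To locate the point where float stops holding every int exactly, one option is a brute-force scan. The sketch below is only an illustration (the class FirstInexactInt and its methods are hypothetical names); it compares values in double, which holds every int exactly.

```java
class FirstInexactInt {
    // True when the int survives widening to float unchanged.
    // The comparison is done in double, which can hold every int exactly.
    static boolean isExactAsFloat(int value) {
        return (double) (float) value == (double) value;
    }

    // Linear scan for the smallest positive int that float cannot hold exactly.
    static int firstInexact() {
        for (int i = 1; i < Integer.MAX_VALUE; i++) {
            if (!isExactAsFloat(i)) {
                return i;
            }
        }
        return -1; // not expected to be reached
    }

    public static void main(String[] args) {
        System.out.println("first inexact int: " + firstInexact());
        System.out.println("2^24:              " + (1 << 24));
    }
}
```

The scan stops at 2^24 + 1, which matches a 23-bit stored mantissa plus one implicit leading bit. Larger ints are not all inexact; only those needing more than 24 significant bits are.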
Answered by polygenelubricants
Here's what JLS has to say about the matter (in a non-technical discussion).
JLS 5.1.2 Widening primitive conversion
The following 19 specific conversions on primitive types are called the widening primitive conversions:
- int to long, float, or double
- (rest omitted)
Conversion of an int or a long value to float, or of a long value to double, may result in loss of precision -- that is, the result may lose some of the least significant bits of the value. In this case, the resulting floating-point value will be a correctly rounded version of the integer value, using IEEE 754 round-to-nearest mode.
Despite the fact that loss of precision may occur, widening conversions among primitive types never result in a run-time exception.
Here is an example of a widening conversion that loses precision:
class Test {
    public static void main(String[] args) {
        int big = 1234567890;
        float approx = big;
        System.out.println(big - (int)approx);
    }
}
which prints:
-46
thus indicating that information was lost during the conversion from type int to type float because values of type float are not precise to nine significant digits.
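The quoted passage also covers long to double. A comparable sketch for that case might look like the following; the class LongToDoubleDemo and the helper roundingError are names invented for this illustration, not part of the JLS text.

```java
class LongToDoubleDemo {
    // Difference between a long and the value it takes after widening to double.
    static long roundingError(long value) {
        double approx = value;
        return value - (long) approx;
    }

    public static void main(String[] args) {
        long exact = 1L << 53;         // 2^53 fits in double's 53-bit significand
        long inexact = (1L << 53) + 1; // needs 54 significant bits
        System.out.println(roundingError(exact));
        System.out.println(roundingError(inexact));
    }
}
```

The first call should report no error, while the second is off by one because 2^53 + 1 is rounded to 2^53.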
Answered by dan04
It's not necessary to know the internal layout of floating-point numbers. All you need is the pigeonhole principle and the knowledge that int and float are the same size.
- int is a 32-bit type, for which every bit pattern represents a distinct integer, so there are 2^32 int values.
- float is a 32-bit type, so it has at most 2^32 distinct values.
- Some floats represent non-integers, so there are fewer than 2^32 float values that represent integers.
- Therefore, different int values will be converted to the same float (= loss of precision).
Similar reasoning can be used with long and double.
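The pigeonhole argument can be observed directly by counting how many distinct floats a run of consecutive ints maps to. This is a small sketch under the assumption that a window of 1000 values is representative; PigeonholeDemo and distinctFloats are illustrative names.

```java
import java.util.HashSet;
import java.util.Set;

class PigeonholeDemo {
    // Number of distinct float values produced by the ints in [start, start + count).
    static int distinctFloats(int start, int count) {
        Set<Float> seen = new HashSet<>();
        for (int i = 0; i < count; i++) {
            seen.add((float) (start + i));
        }
        return seen.size();
    }

    public static void main(String[] args) {
        int count = 1000;
        // Small ints: each one gets its own float.
        System.out.println("near 0:    " + distinctFloats(0, count));
        // Large ints: many of them share the same float.
        System.out.println("near 2^30: " + distinctFloats(1 << 30, count));
    }
}
```

Near zero the count equals the number of ints tested; near 2^30 it is far smaller, which is the collision the argument predicts.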
Answered by Ian MacMillan
Possibly the clearest explanation I've seen: http://www.ibm.com/developerworks/java/library/j-math2/index.html
The ULP, or unit of least precision, defines the precision available between any two float values. As these values increase, the available precision decreases. For example: between 1.0 and 2.0 inclusive there are 8,388,609 floats, between 1,000,000 and 1,000,001 there are 17. At 10,000,000 the ULP is 1.0, so above this value you soon have multiple integral values mapping to each available float, hence the loss of precision.
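The standard library exposes the ULP through Math.ulp, so the figures above can be checked with a short sketch. The class UlpDemo and the wrapper gapAt are names chosen for this example.

```java
class UlpDemo {
    // Distance from the given float to the next float larger in magnitude.
    static float gapAt(float value) {
        return Math.ulp(value);
    }

    public static void main(String[] args) {
        float[] samples = {1.0f, 1_000_000f, 10_000_000f, 16_777_216f, 1_234_567_890f};
        for (float sample : samples) {
            System.out.println(sample + " -> ulp " + gapAt(sample));
        }
    }
}
```

A ULP of 0.0625 at 1,000,000 means 16 gaps, and therefore 17 floats, between 1,000,000 and 1,000,001 inclusive. Once the ULP exceeds 1.0 (from 2^24 upward), neighbouring ints start to share a float.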
Answered by Sinkrad
Your intuition is correct: you MAY lose precision when converting int to float. However, it is not as simple as presented in most other answers.
In Java a FLOAT uses a 23 bit mantissa, so integers greater than 2^23 will have their least significant bits truncated. (from a post on this page)
Not true.
Example: here is an integer that is greater than 2^23 that converts to a float with no loss:
int i = 33_554_430 * 64; // is greater than 2^23 (and also greater than 2^24); i = 2_147_483_520
float f = i;
System.out.println("result: " + (i - (int) f)); // Prints: result: 0
System.out.println("with i:" + i + ", f:" + f); // Prints: with i:2147483520, f:2.14748352E9 (the float's printed digits can differ between JDK versions)
Therefore, it is not true that integers greater than 2^23 will have their least significant bits truncated.
The best explanation I found is here:
A float in Java is 32-bit and is represented by:
sign * mantissa * 2^exponent
sign * (0 to 33_554_431) * 2^(-125 to +127)
Source: http://www.ibm.com/developerworks/java/library/j-math2/index.html
Why is this an issue?
It leaves the impression that you can determine whether there is a loss of precision from int to float just by looking at how large the int is.
I have especially seen Java exam questions where one is asked whether a large int would convert to a float with no loss.
Also, sometimes people tend to think that there will be loss of precision from int to float:
- when an int is larger than 1_234_567_890: not true (see counter-example above)
- when an int is larger than 2^23 (equals 8_388_608): not true
- when an int is larger than 2^24 (equals 16_777_216): not true
Conclusion
Conversions from sufficiently large ints to floats MAY lose precision.
It is not possible to determine whether there will be loss just by looking at how large the int is (i.e. without trying to go deeper into the actual float representation).
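To make the conclusion concrete, here is a small sketch that tests a handful of large ints individually. The class name MagnitudeIsNotEnough and the helper isExactAsFloat are hypothetical; the comparison goes through double, which holds every int exactly, to avoid the saturating cast back to int.

```java
class MagnitudeIsNotEnough {
    // True when widening the int to float leaves its value unchanged.
    static boolean isExactAsFloat(int value) {
        return (double) (float) value == (double) value;
    }

    public static void main(String[] args) {
        int[] samples = {16_777_217, 16_777_218, 1_234_567_890, 2_147_483_520, Integer.MAX_VALUE};
        for (int sample : samples) {
            System.out.println(sample + " exact as float? " + isExactAsFloat(sample));
        }
    }
}
```

Exact and inexact values are interleaved across the whole range above 2^24, so the outcome depends on how many significant bits the value needs rather than on its size alone.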
Answered by HesNotTheStig
For these examples, I'm using Java.
Use a function like this to check for loss of precision when casting from int to float:
static boolean checkPrecisionLossToFloat(int val)
{
    if(val < 0)
    {
        val = -val;
    }
    // 8 is the bit-width of the exponent for single-precision
    return Integer.numberOfLeadingZeros(val) + Integer.numberOfTrailingZeros(val) < 8;
}
Use a function like this to check for loss of precision when casting from long to double:
static boolean checkPrecisionLossToDouble(long val)
{
    if(val < 0)
    {
        val = -val;
    }
    // 11 is the bit-width for the exponent in double-precision
    return Long.numberOfLeadingZeros(val) + Long.numberOfTrailingZeros(val) < 11;
}
Use a function like this to check for loss of precision when casting from long to float:
static boolean checkPrecisionLossToFloat(long val)
{
    if(val < 0)
    {
        val = -val;
    }
    // 8 + 32
    return Long.numberOfLeadingZeros(val) + Long.numberOfTrailingZeros(val) < 40;
}
For each of these functions, returning true means that casting that integral value to the floating point value will result in a loss of precision.
Casting to float will lose precision if the integral value has more than 24 significant bits.
Casting to double will lose precision if the integral value has more than 53 significant bits.
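One way to gain confidence in the bit-counting rule is to compare it against an actual conversion on random inputs. The sketch below repeats the int-to-float check and adds a reference test; the class PrecisionCheckTest, the helper losesPrecision, and the seed and sample count are all assumptions made for this illustration.

```java
import java.util.Random;

class PrecisionCheckTest {
    // Same rule as the int-to-float check: more than 24 significant bits means loss.
    static boolean checkPrecisionLossToFloat(int val) {
        if (val < 0) {
            val = -val;
        }
        return Integer.numberOfLeadingZeros(val) + Integer.numberOfTrailingZeros(val) < 8;
    }

    // Reference answer: perform the conversion and compare in double,
    // which holds every int exactly.
    static boolean losesPrecision(int val) {
        return (double) (float) val != (double) val;
    }

    // Counts how often the bit-counting rule disagrees with the real conversion.
    static int countMismatches(int samples, long seed) {
        Random random = new Random(seed);
        int mismatches = 0;
        for (int i = 0; i < samples; i++) {
            int val = random.nextInt();
            if (checkPrecisionLossToFloat(val) != losesPrecision(val)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(1_000_000, 42L));
    }
}
```

Comparing in double rather than casting back to int matters near Integer.MAX_VALUE, where the cast from float to int saturates and would hide the loss.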