C++: More Precise Floating Point Data Types than double?
Notice: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not to this page): Stack Overflow
Source: http://stackoverflow.com/questions/15659668/
Asked by Cool_Coder
In my project I have to compute division, multiplication, subtraction and addition on a matrix of double elements.
The problem is that as the size of the matrix increases, the accuracy of my output degrades drastically.
Currently I am using double for each element, which I believe uses 8 bytes of memory and has an accuracy of about 16 digits irrespective of the position of the decimal point.
Even for a large matrix, the memory occupied by all the elements is in the range of a few kilobytes, so I can afford to use data types that require more memory.
So I wanted to know which data type is more precise than double.
I tried searching in some books and could find long double, but I don't know what its precision is.
And what if I want more precision than that?
Answered by Potatoswatter
According to Wikipedia, the 80-bit "Intel" IEEE 754 extended-precision long double, which is 80 bits padded to 16 bytes in memory, has a 64-bit mantissa with no implicit bit, which gets you 19.26 decimal digits. This has been the almost universal standard for long double for ages, but recently things have started to change.
The newer 128-bit quad-precision format has 112 mantissa bits plus an implicit bit, which gets you 34 decimal digits. GCC implements this as the __float128 type, and there is (if memory serves) a compiler option to set long double to it.
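To check what these types actually provide on a given compiler and platform, the figures can be queried from std::numeric_limits rather than assumed. The following is a minimal sketch; the FloatInfo struct and the describe helper are hypothetical names introduced here for illustration, not part of the original answer. Note that digits10 is a conservative count (the decimal digits guaranteed to survive a round trip through the type), so it can come out lower than the digit counts quoted above.

```cpp
#include <cstddef>
#include <limits>

// Summary of one floating-point type as seen by the current compiler.
// (Hypothetical helper type, introduced only for this example.)
struct FloatInfo {
    std::size_t bytes;   // sizeof(T), including any padding bytes
    int mantissa_bits;   // significand bits, counting an implicit bit if any
    int decimal_digits;  // decimal digits guaranteed to survive a round trip
};

// Collects the size and precision of floating-point type T.
template <typename T>
FloatInfo describe() {
    return {sizeof(T),
            std::numeric_limits<T>::digits,
            std::numeric_limits<T>::digits10};
}
```

For example, describe<long double>().mantissa_bits would report 64 where long double is the 80-bit x87 format, and 53 where the compiler treats long double the same as double.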
Answered by Telgin
Floating point data types with greater precision than double are going to depend on your compiler and architecture.
In order to get more than double precision, you may need to rely on a math library that supports arbitrary-precision calculations. These probably won't be fast, though.
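As a rough illustration of how extra precision can be carried in software, here is a minimal sketch of a "double-double" accumulator built on Knuth's TwoSum error-free transformation. This is a swapped-in, much simpler technique than a real arbitrary-precision library, and it is not what the answer specifically recommends; the names SumWithError, DoubleDouble, two_sum and add are hypothetical. It assumes IEEE 754 double arithmetic with round-to-nearest and no aggressive floating-point optimizations (for example, no -ffast-math).

```cpp
// Result of an error-free addition: the rounded sum plus the rounding error,
// so that sum + error equals a + b exactly.
struct SumWithError {
    double sum;
    double error;
};

// Knuth's TwoSum: recovers the part of a + b that rounding threw away.
SumWithError two_sum(double a, double b) {
    double s = a + b;
    double b_virtual = s - a;          // the portion of b that made it into s
    double a_virtual = s - b_virtual;  // the portion of a that made it into s
    double err = (a - a_virtual) + (b - b_virtual);
    return {s, err};
}

// A value represented as hi + lo, where lo holds digits that do not fit in hi.
struct DoubleDouble {
    double hi;
    double lo;
};

// Adds x to acc, keeping the rounding error in the low word.
DoubleDouble add(DoubleDouble acc, double x) {
    SumWithError r = two_sum(acc.hi, x);
    double lo = acc.lo + r.error;          // accumulate the lost low-order part
    SumWithError n = two_sum(r.sum, lo);   // renormalize so hi carries the bulk
    return {n.sum, n.error};
}
```

Starting from {1.0, 0.0} and adding 1e-16 leaves hi at 1.0 but keeps the 1e-16 in lo instead of discarding it, which is exactly the information a plain double sum loses. The cost is several floating-point operations per addition, in line with the answer's remark about speed.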
Answered by ogni42
You might want to consider the sequence of operations, i.e. do the additions in an ordered sequence starting with the smallest values first. This will increase overall accuracy of the results using the same precision in the mantissa:
1e00 + 1e-16 + ... + 1e-16 (1e16 times) = 1e00
1e-16 + ... + 1e-16 (1e16 times) + 1e00 = 2e00
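The point is that small numbers added one at a time to a much larger number are lost to rounding. Accumulating the small values first lets them add up to something large enough to register, so the latter approach reduces the numerical error.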
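The effect can be reproduced on a smaller scale. The sketch below sums the same values in two orders; the function names and the one_then_tiny test data are hypothetical, and it uses a configurable number of tiny terms rather than the 1e16 of the illustration above so that it finishes quickly. It assumes IEEE 754 double arithmetic with round-to-nearest.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Sums the values in the order given (naive left-to-right accumulation).
double sum_in_order(const std::vector<double>& values) {
    double total = 0.0;
    for (double v : values) {
        total += v;
    }
    return total;
}

// Sorts a copy by ascending magnitude, then sums, so that the small terms
// accumulate before they meet the large ones.
double sum_smallest_first(std::vector<double> values) {
    std::sort(values.begin(), values.end(),
              [](double a, double b) { return std::fabs(a) < std::fabs(b); });
    return sum_in_order(values);
}

// Hypothetical test data: 1.0 followed by `count` copies of 1e-16.
std::vector<double> one_then_tiny(std::size_t count) {
    std::vector<double> values(count + 1, 1e-16);
    values.front() = 1.0;
    return values;
}
```

With one_then_tiny(1000000), sum_in_order returns exactly 1.0 because each 1e-16 is below half the spacing between 1.0 and the next double, while sum_smallest_first returns approximately 1.0000000001. Sorting costs O(n log n); compensated (Kahan) summation is a common alternative when sorting is impractical.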