C++: a floating-point data type more precise than double?

Question

Asked by Cool_Coder

In my project I have to perform division, multiplication, subtraction and addition on a matrix of double elements. The problem is that as the matrix grows, the accuracy of my output degrades drastically. Currently I am using double for each element, which I believe uses 8 bytes of memory and has an accuracy of about 16 significant digits regardless of where the decimal point falls. Even for a large matrix, the memory occupied by all the elements is only a few kilobytes, so I can afford data types that require more memory. Which data type is more precise than double? Searching in some books I found long double, but I don't know what its precision is. And what if I want more precision than that?

Answer 1

Answered by Potatoswatter

According to Wikipedia, the 80-bit "Intel" IEEE 754 extended-precision long double, which is 80 bits typically padded to 16 bytes in memory, has a 64-bit mantissa with no implicit bit, which gets you 19.26 decimal digits. This has been the almost universal standard for long double for ages, but recently things have started to change.

The newer 128-bit quad-precision format has 112 mantissa bits plus an implicit bit, which gets you 34 decimal digits. GCC implements this as the __float128 type, and there is (if memory serves) a compiler option to make long double use it.

Answer 2

Answered by Telgin

Whether floating-point data types with greater precision than double are available depends on your compiler and architecture.

To get more than double precision, you may need to rely on a math library that supports arbitrary-precision calculations. These probably won't be fast, though.

Answer 3

Answered by ogni42

You might want to consider the order of operations, i.e. do the additions in sorted order, starting with the smallest values. This increases the overall accuracy of the result without needing any more precision in the mantissa:

1e00 + 1e-16 + ... + 1e-16 (1e16 times) = 1e00
1e-16 + ... + 1e-16 (1e16 times) + 1e00 = 2e00

The point is that small numbers added to a large number one at a time simply disappear, because each one is below the rounding threshold of the running sum. So the latter approach reduces the numerical error.

Answer 4

Answered by fons

On Intel architectures, long double is an 80-bit type.

What kind of values do you want to represent? Maybe you are better off using fixed precision.
