C++: double or float, which is faster?
Disclaimer: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/4584637/
double or float, which is faster?
Asked by coming out of void
I am reading "Accelerated C++". I found one sentence which states "sometimes double is faster in execution than float in C++". After reading that sentence I got confused about how float and double work. Please explain this point to me.
Answered by foo
Depends on what the native hardware does.
If the hardware implements double (like the x86 does), then float is emulated by extending it there, and the conversion will cost time. In this case, double will be faster.
If the hardware implements float only, then emulating double with it will cost even more time. In this case, float will be faster.
If the hardware implements neither, both have to be implemented in software. In this case, both will be slow, but double will be slightly slower (more load and store operations, at the least).
The quote you mention probably refers to the x86 platform, where the first case applies. But this doesn't hold true in general.
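One way to see which of these cases a given toolchain falls into is the standard `FLT_EVAL_METHOD` macro from `<cfloat>`, which reports whether float expressions are evaluated in a wider type. The following is only a minimal sketch: the helper names `describe_eval_method` and `current_eval_method` are made up for illustration, and the macro describes how the compiler evaluates expressions, not how fast the hardware is.

```cpp
#include <cassert>
#include <cfloat>
#include <string>

// Translate a FLT_EVAL_METHOD value into a readable description.
// (Hypothetical helper; the values -1, 0, 1, 2 are defined by the C standard.)
inline std::string describe_eval_method(int method) {
    switch (method) {
        case -1: return "indeterminable";
        case 0:  return "each type is evaluated in its own precision";
        case 1:  return "float and double are evaluated as double";
        case 2:  return "all types are evaluated as long double";
        default: return "implementation-defined";
    }
}

// Describe the evaluation method of the compiler building this file.
inline std::string current_eval_method() {
    const std::string text = describe_eval_method(FLT_EVAL_METHOD);
    assert(!text.empty());  // every branch above returns a description
    return text;
}
```

A value of 2 is what classic x87 code generation typically reports (floats are widened on load), while 0 is typical when the compiler uses SSE scalar instructions.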
Answered by Diego Dias
You can find a complete answer in this article:
What Every Computer Scientist Should Know About Floating-Point Arithmetic
This is a quote from a previous Stack Overflow thread about how float and double variables affect memory bandwidth:
If a double requires more storage than a float, then it will take longer to read the data. That's the naive answer. On a modern IA32, it all depends on where the data is coming from. If it's in L1 cache, the load is negligible provided the data comes from a single cache line. If it spans more than one cache line there's a small overhead. If it's from L2, it takes a while longer; if it's in RAM then it's longer still; and finally, if it's on disk it takes a huge amount of time. So the choice of float or double is less important than the way the data is used. If you want to do a small calculation on lots of sequential data, a small data type is preferable. Doing a lot of computation on a small data set would allow you to use bigger data types without any significant effect. If you're accessing the data very randomly, then the choice of data size is unimportant: data is loaded in pages / cache lines. So even if you only want a byte from RAM, you could get 32 bytes transferred (this is very dependent on the architecture of the system). On top of all of this, the CPU/FPU could be super-scalar (aka pipelined). So, even though a load may take several cycles, the CPU/FPU could be busy doing something else (a multiply, for instance) that hides the load time to a degree.
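The cache-line argument in the quote can be put into numbers with a small sketch. The function names are hypothetical, and the 64-byte default line size is an assumption (the quote mentions 32 bytes; the real value depends on the CPU), so pass the size that matches your hardware.

```cpp
#include <cassert>
#include <cstddef>

// Number of values of type T that fit in one cache line.
// The 64-byte default is an assumption, not a property of every CPU.
template <typename T>
constexpr std::size_t values_per_cache_line(std::size_t line_bytes = 64) {
    return line_bytes / sizeof(T);
}

// Number of cache lines touched when streaming through `count`
// contiguous values that start on a cache-line boundary.
template <typename T>
std::size_t cache_lines_touched(std::size_t count, std::size_t line_bytes = 64) {
    assert(line_bytes >= sizeof(T));
    const std::size_t per_line = values_per_cache_line<T>(line_bytes);
    return (count + per_line - 1) / per_line;  // round up
}
```

With 4-byte floats and 8-byte doubles, streaming over the same number of elements touches about twice as many cache lines for double, which is the bandwidth cost the quote describes for sequential access.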
Answered by watson1180
Short answer is: it depends.
A CPU with an x87 FPU will crunch floats and doubles equally fast. Vectorized code will run faster with floats, because SSE can crunch 4 floats or 2 doubles in one pass.
Another thing to consider is memory speed. Depending on your algorithm, your CPU could be idling a lot while waiting for data. Memory-intensive code will benefit from using floats, but ALU-limited code won't (unless it is vectorized).
Answered by Peter G.
I can think of two basic cases when doubles are faster than floats:
Your hardware supports double operations but not float operations, so floats will be emulated by software and therefore be slower.
You really need the precision of doubles. Now, if you use floats anyway, you will have to use two floats to reach a precision similar to double. Emulating a true double with floats will be slower than using doubles in the first place.
- You do not necessarily need doubles, but your numeric algorithm converges faster due to the enhanced precision of doubles. Also, doubles might offer enough precision to let you use a faster but numerically less stable algorithm.
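To illustrate why emulating extra precision with two floats costs time, here is a minimal sketch of Knuth's TwoSum, a standard building block of such "float-float" arithmetic. The `FloatPair` type and `two_sum` name are illustrative only, and the code assumes strict IEEE float evaluation (for example, no `-ffast-math` and no excess x87 precision).

```cpp
#include <cassert>
#include <cmath>

// A value represented as an unevaluated sum of two floats: hi + lo.
struct FloatPair {
    float hi;
    float lo;
};

// Knuth's TwoSum: hi is the rounded float sum, lo is the rounding error,
// so that hi + lo equals a + b exactly (barring overflow).
inline FloatPair two_sum(float a, float b) {
    assert(std::isfinite(a) && std::isfinite(b));
    const float sum = a + b;
    const float b_virtual = sum - a;          // the part of b that made it into sum
    const float a_virtual = sum - b_virtual;  // the part of a that made it into sum
    const float error = (a - a_virtual) + (b - b_virtual);
    return {sum, error};
}
```

A single addition already takes six float operations here, and a full float-float add or multiply needs more, whereas hardware double does the same work in one instruction.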
For completeness' sake, I also give some reasons for the opposite case of floats being faster. You can see for yourself which reasons dominate in your case:
Floats are faster than doubles when you don't need double's precision and you are memory-bandwidth bound and your hardware doesn't carry a penalty on floats.
They conserve memory-bandwidth because they occupy half the space per number.
There are also platforms that can process more floats than doubles in parallel.
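Which of these reasons dominates is best settled by measuring on the target machine. The following is a rough timing sketch, not a rigorous benchmark: `SumTiming` and `time_sum` are hypothetical names, and the results depend heavily on array size, compiler flags, and whether the loop gets vectorized.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Result of one timed pass: the computed sum and the elapsed wall time.
template <typename T>
struct SumTiming {
    T sum;
    double seconds;
};

// Sum `count` ones stored as type T and time the pass over memory.
template <typename T>
SumTiming<T> time_sum(std::size_t count) {
    assert(count > 0);
    const std::vector<T> data(count, T(1));
    const auto start = std::chrono::steady_clock::now();
    T sum = 0;
    for (const T value : data) sum += value;
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double> elapsed = stop - start;
    return {sum, elapsed.count()};
}
```

Calling `time_sum<float>(n)` and `time_sum<double>(n)` with an `n` far larger than the cache gives a first impression of the memory-bandwidth effect; a small `n` that fits in L1 mostly measures arithmetic throughput instead. Keep `n` at or below 2^24 if you want the float sum of ones to stay exact.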
Answered by Frederik Slijkerman
On Intel, the coprocessor (nowadays integrated) will handle both equally fast, but as some others have noted, doubles require more memory bandwidth, which can cause bottlenecks. If you're using scalar SSE instructions (the default for most compilers on 64-bit), the same applies. So generally, unless you're working on a large set of data, it doesn't matter much.
However, parallel SSE instructions will allow four floats to be handled in one instruction, but only two doubles, so here float can be significantly faster.
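The four-versus-two difference is visible directly in the SSE intrinsics. This is a minimal sketch assuming an x86 compiler that defines `__SSE2__` (GCC and Clang do on x86-64); on other targets it falls back to plain loops. The wrapper names `add4` and `add2` are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#if defined(__SSE2__)
#include <emmintrin.h>  // SSE2 intrinsics; also provides the SSE float intrinsics
#endif

// Add four floats; one packed instruction where SSE2 is available.
inline void add4(const float* a, const float* b, float* out) {
    assert(a && b && out);
#if defined(__SSE2__)
    _mm_storeu_ps(out, _mm_add_ps(_mm_loadu_ps(a), _mm_loadu_ps(b)));
#else
    for (std::size_t i = 0; i < 4; ++i) out[i] = a[i] + b[i];  // scalar fallback
#endif
}

// Add two doubles; the same 128-bit register holds only two of them.
inline void add2(const double* a, const double* b, double* out) {
    assert(a && b && out);
#if defined(__SSE2__)
    _mm_storeu_pd(out, _mm_add_pd(_mm_loadu_pd(a), _mm_loadu_pd(b)));
#else
    for (std::size_t i = 0; i < 2; ++i) out[i] = a[i] + b[i];  // scalar fallback
#endif
}
```

Both functions issue a single packed add on SSE2 hardware, so a loop over floats needs half as many add instructions as the same loop over doubles.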
Answered by Akash Agrawal
In an experiment that adds 3.3 two billion (2000000000) times, the results are:
Summation time in s: 2.82 summed value: 6.71089e+07 // float
Summation time in s: 2.78585 summed value: 6.6e+09 // double
Summation time in s: 2.76812 summed value: 6.6e+09 // long double
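So double is marginally faster in this test, and it is the default floating-point type in C and C++. It's more portable and the default across C and C++ library functions. Also, double has significantly higher precision than float: note that the float sum above stalled at 6.71089e+07 instead of reaching 6.6e+09.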
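The float total of 6.71089e+07 in the results is consistent with the accumulator getting stuck at 2^26 = 67108864: from that point on, adjacent floats are 8 apart, so adding 3.3 rounds back to the same value. The sketch below reproduces the effect directly; `accumulate_steps` and `float_sum_is_stuck` are illustrative names, not the code used in the experiment.

```cpp
#include <cassert>
#include <cstdint>

// Repeatedly add `step` to an accumulator of type T, `count` times.
template <typename T>
T accumulate_steps(T step, std::int64_t count) {
    assert(count >= 0);
    T sum = 0;
    for (std::int64_t i = 0; i < count; ++i) sum += step;
    return sum;
}

// True when adding `step` no longer changes a float accumulator.
inline bool float_sum_is_stuck(float sum, float step) {
    volatile float next = sum + step;  // volatile forces rounding to float
    return next == sum;
}
```

`float_sum_is_stuck(67108864.0f, 3.3f)` is true, while a double accumulator at the same value still advances, which is why only the float run produced a wrong total.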
Even Stroustrup recommends double over float:
"The exact meaning of single-, double-, and extended-precision is implementation-defined. Choosing the right precision for a problem where the choice matters requires significant understanding of floating-point computation. If you don't have that understanding, get advice, take the time to learn, or use double and hope for the best."
Perhaps the only case where you should use float instead of double is on 64-bit hardware with a modern gcc, because float is smaller: double is 8 bytes and float is 4 bytes.
Answered by Gene Bushuyev
There is only one reason 32-bit floats can be slower than 64-bit doubles (or 80-bit 80x87), and that is alignment. Other than that, floats take less memory, which generally means faster access and better cache performance. It also takes fewer cycles to process 32-bit instructions. And even when the (co)processor has no 32-bit instructions, it can perform them on 64-bit registers at the same speed. It is probably possible to create a test case where doubles are faster than floats, and vice versa, but my measurements of real statistics algorithms didn't show a noticeable difference.
Answered by P47RICK
float is usually faster. double offers greater precision. However, performance may vary in some cases if special processor extensions such as 3DNow! or SSE are used.