C++ 64-bit Applications and Inline Assembly
Original URL: http://stackoverflow.com/questions/6166437/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverflow
64bit Applications and Inline Assembly
Asked by JavaMan
I am using Visual C++ 2010 to develop 32-bit Windows applications. There is something for which I really want to use inline assembly, but I just realized that Visual C++ does not support inline assembly in 64-bit applications. So porting to 64-bit in the future is a big issue.
I have no idea how 64-bit applications differ from 32-bit applications. Is there a chance that 32-bit applications will ALL have to be upgraded to 64-bit in the future? I heard that 64-bit CPUs have more registers. Since performance is not a concern for my applications, using these extra registers is not a concern to me. Are there any other reasons that a 32-bit application would need to be upgraded to 64-bit? Would a 64-bit application process things differently from a 32-bit application, apart from possibly using registers or instructions that are unique to 64-bit CPUs?
My application needs to interact with other OS components, e.g. drivers, which I know must be 64-bit on 64-bit Windows. Would my 32-bit application be compatible with them?
Accepted answer by Billy ONeal
Visual C++ does not support inline assembly for x64 (or ARM) processors, because generally using inline assembly is a bad idea.
- Usually compilers produce better assembly than humans.
- Even if you can produce better assembly than the compiler, using inline assembly generally defeats code optimizers of any type. Sure, your bit of hand optimized code might be faster, but the fact that code around it can't be optimized will generally lead to a slower overall program.
- Compiler intrinsics are available from pretty much every major compiler. They let you access advanced CPU features (e.g. SSE) in a manner that's consistent with the C and C++ languages and does not defeat the optimizer.
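To make the intrinsics point concrete, here is a minimal sketch of adding two float arrays with SSE intrinsics instead of inline assembly. The names `add_arrays` and `HAVE_SSE` are made up for this illustration, and the preprocessor check is only one plausible way to detect SSE support; on other targets the code falls back to a plain loop.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: these macros are a reasonable way to detect SSE support on
// GCC/Clang (__SSE__) and MSVC (_M_X64, _M_IX86_FP).
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>  // SSE intrinsics: __m128, _mm_add_ps, ...
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

// Adds b[i] into a[i] for i in [0, n). Hypothetical helper for illustration.
void add_arrays(float* a, const float* b, std::size_t n) {
    assert(a != nullptr && b != nullptr);
    std::size_t i = 0;
#if HAVE_SSE
    // Four floats per iteration. The unaligned load/store forms put no
    // alignment requirement on the caller.
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(a + i, _mm_add_ps(va, vb));
    }
#endif
    // Scalar tail (and the whole loop on targets without SSE).
    for (; i < n; ++i) {
        a[i] += b[i];
    }
}
```

Because these are ordinary function calls as far as the language is concerned, the compiler can still inline `add_arrays`, allocate registers, and schedule the surrounding code, and the same source builds for both 32-bit and 64-bit targets.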
I am wondering whether there is a chance that 32-bit applications will ALL have to be upgraded to 64-bit in the future.
That depends on your target audience. If you're targeting servers, then yes: it's reasonable for a server to let users skip installing the WOW64 subsystem, because you know it'll probably not be running much 32-bit code. I believe Windows Server 2008 R2 already allows this as an option if you install it as a "server core" instance.
Since performance is not a concern for my application, using the extra 64-bit registers is not a concern to me. Are there any other reasons that a 32-bit application would have to be upgraded to 64-bit in the future?
64-bit is not primarily about registers (although x64 does add more of them). It has to do with the size of addressable virtual memory.
Would a 64-bit application process be different from a 32-bit application process, apart from the 64-bit application using some registers/instructions that are unique to 64-bit CPUs?
Most likely. 32-bit applications are constrained in that they can't map more than ~2GB into memory at once. 64-bit applications don't have that problem. Even if they're not using more than 4GB of physical memory, being able to address more than 4GB of virtual memory is helpful for mapping files on disk into memory and similar.
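As a rough illustration of what that constraint means in practice, the sketch below reports the pointer width of the current build and works out how many separate views a process would need to walk a large file when it cannot map the whole thing at once. The helper names `pointer_bits` and `views_needed` are hypothetical, and the view size is whatever the caller picks; nothing here calls a real file-mapping API.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Width of a pointer in the current build: 32 for a 32-bit target,
// 64 for a 64-bit target.
constexpr unsigned pointer_bits() {
    return static_cast<unsigned>(sizeof(void*) * CHAR_BIT);
}

// Number of views of at most view_bytes each needed to cover file_bytes.
// A 32-bit process has to walk a multi-gigabyte file window by window like
// this; a 64-bit process can usually map the file in a single view.
std::uint64_t views_needed(std::uint64_t file_bytes, std::uint64_t view_bytes) {
    assert(view_bytes > 0);
    return file_bytes / view_bytes + (file_bytes % view_bytes != 0 ? 1 : 0);
}
```

For example, a 5 GiB file walked in 1 GiB windows needs `views_needed(5ull << 30, 1ull << 30)` views, which is 5; the 32-bit build has to remap between windows, while the 64-bit build can skip the windowing entirely.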
My application needs to interact with other OS components, e.g. drivers, which I know must be 64-bit on 64-bit Windows. Would my 32-bit application be compatible with them?
That depends entirely on how you're communicating with those drivers. If it's through something like a "named file interface" then your app could stay 32-bit. If you try to do something like shared memory (Yikes! Shared memory accessible from user mode with a driver?!?) then you're going to have to build your app as 64-bit.
Answer by Necrolis
Apart from @Billy's great write-up, if you really feel the need to use 64-bit assembly, then you can use an external assembler like MASM to get that done, see this. (It's also possible to speed this up with prebuild scripts.)
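A minimal sketch of the external-assembler route, assuming the Windows x64 calling convention (first two integer arguments in RCX and RDX, result in RAX) and the 64-bit MASM assembler `ml64`. The routine name `add_u64`, the file name `add_u64.asm`, and the `USE_EXTERNAL_ASM` macro are all made up for this illustration. The assembly source is shown in the comment, and a plain C++ stand-in is provided so the snippet still links when the assembly file is not part of the build.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical add_u64.asm, assembled separately (e.g. ml64 /c add_u64.asm)
// and linked into the project:
//
//   .code
//   add_u64 PROC
//       mov rax, rcx    ; first argument
//       add rax, rdx    ; plus second argument
//       ret             ; result is returned in rax
//   add_u64 ENDP
//   END

// C linkage so the C++ name matches the unmangled symbol from the assembler.
extern "C" std::uint64_t add_u64(std::uint64_t a, std::uint64_t b);

#ifndef USE_EXTERNAL_ASM
// Portable stand-in used when the assembly file is not linked in.
extern "C" std::uint64_t add_u64(std::uint64_t a, std::uint64_t b) {
    return a + b;
}
#endif

// Ordinary C++ caller: from here the routine looks like any other function.
std::uint64_t sum_all(const std::uint64_t* values, std::size_t count) {
    assert(values != nullptr || count == 0);
    std::uint64_t total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total = add_u64(total, values[i]);
    }
    return total;
}
```

The trade-off compared with inline assembly is that the routine is an opaque external call, so the compiler cannot inline it; this approach suits whole functions rather than a few instructions in the middle of C++ code.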
Answer by Silvio
The Intel C Compiler 15 supports inline assembly in 64-bit too. You could integrate the Intel compiler into Visual Studio as a toolset; then you'd have VC++ 64-bit with inline assembly. One catch though: it's expensive. Cheers.
Answer by CodeLurker
While we're at it, MinGW also supports 64-bit inline assembly, and it's pretty fast, and free. It used to be slow on some math, so I'd start out by comparing the performance of MSVC vs. MinGW to see if it's a decent starting place for your application.
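For reference, 64-bit inline assembly under MinGW uses GCC's extended asm syntax (AT&T operand order by default). Below is a minimal sketch, assuming a GCC-compatible compiler targeting x86-64; the function name `lowest_set_bit` is made up for this example, and other compilers or targets take the plain C++ fallback.

```cpp
#include <cassert>
#include <cstdint>

// Index of the lowest set bit of x, using the x86-64 BSF instruction where
// GCC-style inline assembly is available.
unsigned lowest_set_bit(std::uint64_t x) {
    assert(x != 0);  // BSF leaves its destination undefined for a zero input
#if defined(__GNUC__) && defined(__x86_64__)
    std::uint64_t index;
    // AT&T syntax: "bsfq source, destination".
    // "=r"  : output goes to any general-purpose register
    // "rm"  : input may be a register or a memory operand
    // "cc"  : the instruction modifies the flags register
    __asm__("bsfq %1, %0" : "=r"(index) : "rm"(x) : "cc");
    return static_cast<unsigned>(index);
#else
    // Portable fallback: shift right until the low bit is set.
    unsigned index = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        ++index;
    }
    return index;
#endif
}
```

The same operation is also exposed as an intrinsic (`__builtin_ctzll` on GCC, `_BitScanForward64` on MSVC), which is the kind of replacement the accepted answer argues for.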
Also, as for inline assembly supposedly slowing down the surrounding code, it seems to me that while that might be true for many short segments:
- Actually, humans very often do write assembly that runs more efficiently than compiler output - or at least that was always the common wisdom when I was learning programming in the 70's and 80's, and it continued to be the case through ~2000.
- Depending on the time spent in the loops and the amount of code, a hand-written assembly routine could speed a routine up so much that the performance lost to reduced optimization is relatively small, or none at all, as would be the case when converting an entire function to assembly.
Assembly very much can have a place in code that needs high optimization, no matter what M$ says. You won't really know if assembly will or won't speed up code until you try it. Everything else is just pontificating.
I favor the approach of compiling C++ code into assembly, and then hand-optimizing THAT. It saves you the trouble of writing much of it; and with a little experimentation, you can utilize the compiler's best optimizations, and then begin improving on them. FWIW, I've never needed to with a modern program. Often, other things can speed it up just as much or more, e.g. multi-threading, using look-up tables, moving time-expensive operations out of loops, etc. However, for performance-critical applications, I see no reason not to try; just use it if it works. M$ is just being lazy by dropping assembly output.
As to whether 64-bit or 32-bit is faster, this is similar to the situation with 16-bit vs. 32-bit. The wider bandwidth can sling huge amounts of data faster. Yet, the CPU clock on 32-bit OSs runs faster than on 64-bit ones. Thus for the same number of threads, and for more CPU-intensive operations, a 32-bit app on a 32-bit OS will be faster. However, the difference isn't much, and 64-bit instructions can really make a difference. Also, a given user will only have one OS installed, so the 64-bit app will be either faster on that OS or the same speed. It will be a larger download, however. You might as well go for the possibly faster speed with 64-bit.
Also, note that I benchmarked a 64-bit and a 32-bit app on OSs of the respective sizes, using the respective versions of MinGW. It did a lot of 64-bit floating-point number crunching, and I was sure the 64-bit version would have the edge. It didn't!! My guess is that the floating-point registers in the built-in math coprocessor run in equal numbers of clock cycles on both OSs, and perhaps the 64-bit version ran slightly faster. My benchmarks were so close in both versions that neither was clearly faster. Perhaps long number-crunching operations were slower on 64-bit, but the 64-bit control code ran a little faster, causing nearly equal results.
Basically, the only time 32-bit makes sense, IMHO, is when you think you might have an in-house app that would run faster on it, or when you are delivering to users on 32-bit OS machines (many developers still offer both versions).