Original source: http://stackoverflow.com/questions/2839710/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverFlow
How to: Inline assembler in C++ (under Visual Studio 2010)
Asked by toxic shock
I'm writing a performance-critical, number-crunching C++ project where 70% of the time is used by the 200 line core module.
I'd like to optimize the core using inline assembly, but I'm completely new to this. I do, however, know some x86 assembly languages including the one used by GCC and NASM.
All I know:
I have to put the assembler instructions in _asm{} where I want them to be.
Problem:
- I have no clue where to start. What is in which register at the moment my inline assembly comes into play?
Answered by ThiefMaster
You can access variables by their name and copy them to registers. Here's an example from MSDN:
int power2( int num, int power )
{
    __asm
    {
        mov eax, num    ; Get first argument
        mov ecx, power  ; Get second argument
        shl eax, cl     ; EAX = EAX * ( 2 to the power of CL )
    }
    // Return with result in EAX
}
"Using C or C++ in ASM blocks" might also be interesting for you.
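As a minimal sketch of the same idea, the variant below stores the result in a named local variable instead of relying on the value left in EAX, and guards the __asm block so the function still builds where inline assembly is unavailable. The guard macros _MSC_VER and _M_IX86 are the ones MSVC predefines for 32-bit x86 builds; the fallback branch and the assumption that num is non-negative and the shift does not overflow are mine, not part of the MSDN example.

```cpp
// Returns num * 2^power.
// Assumption: num >= 0, 0 <= power < 32, and the result fits in an int.
int power2(int num, int power)
{
#if defined(_MSC_VER) && defined(_M_IX86)
    int result;
    __asm
    {
        mov eax, num      ; load the first argument by name
        mov ecx, power    ; load the shift count, SHL reads it from CL
        shl eax, cl       ; EAX = EAX * 2^CL
        mov result, eax   ; store explicitly instead of relying on EAX
    }
    return result;
#else
    // Fallback for 64-bit MSVC builds and other compilers,
    // where the __asm block above is not available.
    return num << power;
#endif
}
```

For example, power2(3, 5) evaluates to 96 on either path.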
Answered by Goz
The Microsoft compiler is very poor at optimisation when inline assembly gets involved. It has to back up registers, because if you use eax it won't move its own use of eax to another free register; it will continue using eax. GCC's inline assembler is far more advanced on this front.
To get round this, Microsoft started offering intrinsics. These are a far better way to do your optimisation, as they allow the compiler to work with you. As Chris mentioned, inline assembly doesn't work under x64 with the MS compiler either, so on that platform you REALLY are better off just using the intrinsics.
They are easy to use and give good performance. I will admit I am often able to squeeze a few more cycles out of the code by using an external assembler, but intrinsics are bloody good for the productivity improvement they provide.
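To make the intrinsics suggestion concrete, here is a rough sketch of a multiply-add loop written with SSE intrinsics from <xmmintrin.h>. The kernel itself (multiply_add and its parameters) is a hypothetical stand-in, not the asker's code; the intrinsics used (_mm_loadu_ps, _mm_mul_ps, _mm_add_ps, _mm_storeu_ps) are standard SSE ones available in MSVC, GCC and Clang. The compiler still does the register allocation and scheduling, which is the point made above.

```cpp
#include <cstddef>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>  // SSE intrinsics
#define HAVE_SSE_KERNEL 1
#endif

// Hypothetical kernel: out[i] = a[i] * b[i] + c[i] for i in [0, n).
void multiply_add(const float* a, const float* b, const float* c,
                  float* out, std::size_t n)
{
    std::size_t i = 0;
#if defined(HAVE_SSE_KERNEL)
    // Process four floats per iteration; unaligned loads keep the sketch simple.
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);
        __m128 vb = _mm_loadu_ps(b + i);
        __m128 vc = _mm_loadu_ps(c + i);
        _mm_storeu_ps(out + i, _mm_add_ps(_mm_mul_ps(va, vb), vc));
    }
#endif
    // Scalar loop handles the remaining elements (or everything without SSE).
    for (; i < n; ++i)
        out[i] = a[i] * b[i] + c[i];
}
```

Because nothing here is inline assembly, the same source builds for both 32-bit and 64-bit targets.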
Answered by Chris Becke
Nothing useful is guaranteed to be in the registers when the _asm block is executed. You need to move stuff into the registers yourself. If there is a variable 'a', then you would need to:
__asm {
    mov eax, [a]
}
It is worth pointing out that VS2010 comes with Microsoft's assembler. Right click on a project, go to build rules and turn on the assembler build rules, and the IDE will then process .asm files.
This is a somewhat better solution, as VS2010 supports 32-bit AND 64-bit projects and the __asm keyword does NOT work in 64-bit builds. You MUST use an external assembler for 64-bit code :/
Answered by Thomas Matthews
I prefer writing entire functions in assembly rather than using inline assembly. This allows you to swap out the high-level language function with the assembly one during the build process. Also, you don't have to worry about compiler optimizations getting in the way.
Before you write a single line of assembly, print out the assembly language listing for your function. This gives you a foundation to build upon or modify. Another helpful tool is the interweaving of assembly with source code. This will tell you how the compiler is coding specific statements.
If you need to insert inline assembly into a large function, make a new function for the code that you need to inline. Again, swap between the C++ version and the assembly version at build time.
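One way the build-time swap could look, as a sketch under assumptions: the macro USE_ASM_KERNEL, the function dot_kernel and the file name kernel.asm are all hypothetical names chosen for illustration. When the macro is defined, the C++ file only declares the function and the linker picks up the version assembled from the separate .asm file; otherwise the plain C++ version is compiled.

```cpp
#include <cstddef>

#if defined(USE_ASM_KERNEL)
// Implemented in a separate assembly file (for example kernel.asm),
// assembled by the external assembler and linked in.
extern "C" double dot_kernel(const double* x, const double* y, std::size_t n);
#else
// Reference C++ version: the default build, and a baseline for
// checking that the assembly version returns the same results.
extern "C" double dot_kernel(const double* x, const double* y, std::size_t n)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}
#endif
```

The extern "C" linkage avoids C++ name mangling, so the assembly file can export the symbol under a predictable name. Callers use dot_kernel the same way in both builds.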
These are my suggestions; Your Mileage May Vary (YMMV).
Answered by egrunin
I really like assembly, so I'm not going to be a nay-sayer here. It appears that you've profiled your code and found the 'hotspot', which is the correct way to start. I also assume that the 200 lines in question don't use a lot of high-level constructs like vector.
I do have to give one bit of warning: if the number-crunching involves floating-point math, you are in for a world of pain, specifically a whole set of specialized instructions, and a college term's worth of algorithmic study.
All that said: if I were you, I'd step through the code in question in the VS debugger, using the Disassembly view. If you feel comfortable reading the code as you go along, that's a good sign. After that, do a Release compile (Debug turns off optimization) and generate an ASM listing for that module. Then, if you think you see room for improvement... you have a place to start. Other people's answers have linked to the MSDN documentation, which is really pretty skimpy but still a reasonable start.
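As a rough illustration of the listing step, the snippet below is a small, hypothetical hot loop (sum_of_squares in a file assumed to be called core.cpp) with the commands that produce an assembly listing noted in the comments. The /FAs switch of the Microsoft compiler writes a listing with the source lines interleaved; the GCC line is only there for comparison.

```cpp
#include <cstddef>

// Hypothetical hot loop standing in for the real core module.
//
// Generating an optimized listing to read before hand-tuning:
//   MSVC: cl /c /O2 /FAs core.cpp   -> core.asm, assembly interleaved with source
//   GCC:  g++ -O2 -S core.cpp       -> core.s
double sum_of_squares(const double* v, std::size_t n)
{
    double total = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        total += v[i] * v[i];
    return total;
}
```

Reading the listing for a loop this small is a quick way to check how comfortable you are with the compiler's output before tackling the full 200 lines.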
Answered by Paul R
Go for the low hanging fruit first...
As others have said, the Microsoft compiler is pretty poor at optimisation. You may be able to save yourself a lot of effort just by investing in a decent compiler, such as Intel's ICC, and re-compiling the code "as is". You can get a 30-day free evaluation license from Intel and try it out.
Also, if you have the option to build a 64-bit executable, then running in 64-bit mode can yield a performance improvement of around 30%, due to the 2x increase in the number of available registers.