Original question: http://stackoverflow.com/questions/20513071/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Performance Tradeoff - When is MATLAB better/slower than C/C++
Asked by Loves Probability
I am aware that C/C++ is a lower-level language and generates relatively optimized machine code compared with any other high-level language. But I guess there is much more to it than that, which is also evident from practice.
When I do simple calculations, such as Monte Carlo averaging of a collection of Gaussian samples, I see little difference between a C++ implementation and a MATLAB implementation; sometimes MATLAB is in fact a bit faster.
When I move on to larger-scale simulations with thousands of lines of code, the real picture slowly shows up. C++ simulations show superior performance, running something like 100x faster than an equivalent MATLAB implementation.
The C++ code is, most of the time, pretty much serial, with no fancy optimization done explicitly, whereas, as far as I am aware, MATLAB inherently does a lot of optimization. This shows up, for example, when I generate a huge chunk of random samples: the equivalent in C++ using a library like IT++/GSL/Boost is relatively slower (the algorithm used is the same, namely mt19937).
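For reference, the kind of Monte Carlo averaging described above might look roughly like this in standard C++. This is only a sketch: it uses std::mt19937 from the standard library rather than IT++/GSL/Boost, and the function name and fixed seed are made up for illustration. It assumes n > 0.

```cpp
#include <cstddef>
#include <random>

// Monte Carlo estimate of the mean of n samples drawn from N(mu, sigma^2).
// The name and the default seed are illustrative, not from the question.
double monte_carlo_mean(std::size_t n, double mu, double sigma,
                        unsigned seed = 42) {
    std::mt19937 gen(seed);                            // Mersenne Twister engine
    std::normal_distribution<double> dist(mu, sigma);  // Gaussian samples
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += dist(gen);                              // draw and accumulate
    }
    return sum / static_cast<double>(n);
}
```

Calling monte_carlo_mean(1000000, 1.0, 2.0) should return a value close to 1.0. How this compares in speed to a MATLAB randn call depends on compiler flags and library versions, so it is worth timing both on your own machine.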
My question is simply whether there is a simple performance tradeoff between MATLAB and C++. Is it just as people say, "Whenever you can, C/C++ is better" (the frequent experience)? From a different perspective, "What is MATLAB good for, other than comfort?"
By the way, I don't see coding efficiency as a significant parameter here, since I am thinking of the same programmer in both cases. I also think alternatives like Python or R are not relevant here. But the dependence on the specific libraries we use should be interesting.
[I am a PhD student in Coding Theory in communication systems. I do simulations using MATLAB/C++ all the time, and have reasonable experience coding a few tens of thousands of lines in both.]
Answered by PhD AP EcE
I have been using Matlab and C++ for about 10 years. For every numerical algorithm implemented for my research, I always start by prototyping in Matlab and then translate the project to C++ to gain a 10x to 100x (I am not kidding) performance improvement. Of course, I am comparing optimized C++ code to fully vectorized Matlab code. On average, the improvement is about 50x.
There are a lot of subtleties behind both programming languages, and the following are some misunderstandings:
- "Matlab is a script language but C++ is compiled." Matlab uses a JIT compiler to translate your script to machine code. You can improve your speed by at most a factor of 1.5 to 2 by using the compiler that Matlab provides.
- "Matlab code might be able to get fully vectorized but you have to optimize your code by hand in C++." Fully vectorized Matlab code can call libraries written in C++/C/assembly (for example Intel MKL). But plain C++ code can be reasonably vectorized by modern compilers.
- "Toolboxes and routines that Matlab provides should be very well tuned and should have reasonable performance." No. Other than linear algebra routines, the performance is generally bad.
The reasons why you can gain 10x~100x performance in C++ compared to vectorized Matlab code:
- Calling external libraries (MKL) in Matlab costs time.
- Memory in Matlab is dynamically allocated and freed. For example, a multiplication of small matrices such as
  A = B*C + D*E + F*G
  requires Matlab to create 2 temporary matrices. In C++, if you allocate your memory beforehand, you create none. Now imagine looping that statement 1000 times. Another solution in C++ is provided by C++11 rvalue references. This is one of the biggest improvements in C++; C++ code can now be as fast as plain C code.
- If you want to do parallel processing, the Matlab model is multi-process and the C++ way is multi-threaded. If you have many small tasks that need to be parallelized, C++ provides linear gain up to many threads, but you might see a negative performance gain in Matlab.
- Vectorization in C++ involves using intrinsics/assembly, and sometimes SIMD vectorization is only possible in C++.
- In C++, it is possible for an experienced programmer to completely avoid L2 cache misses and even L1 cache misses, hence pushing the CPU to its theoretical throughput limit. The performance of Matlab can lag behind C++ by a factor of 10x for this reason alone.
- In C++, computationally intensive instructions can sometimes be grouped according to their latencies (coded carefully in assembly or intrinsics) and dependencies (most of the time handled automatically by the compiler or CPU hardware), such that the theoretical IPC (instructions per clock cycle) can be reached and the CPU pipelines are filled.
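To make the point about temporaries concrete, here is one way A = B*C + D*E + F*G could be written in C++ so that no temporary matrix is created. It is a minimal sketch, not the answerer's code: the Mat alias and the names mul_add and sum_of_products are invented for the example, matrices are small fixed-size row-major arrays, and A is assumed not to alias any input.

```cpp
#include <array>
#include <cstddef>

// Small fixed-size N x N matrix stored row-major (illustrative type).
template <std::size_t N>
using Mat = std::array<double, N * N>;

// out += lhs * rhs, accumulated directly into caller-owned storage.
template <std::size_t N>
void mul_add(const Mat<N>& lhs, const Mat<N>& rhs, Mat<N>& out) {
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t k = 0; k < N; ++k) {
            const double l = lhs[i * N + k];
            for (std::size_t j = 0; j < N; ++j) {
                out[i * N + j] += l * rhs[k * N + j];
            }
        }
    }
}

// A = B*C + D*E + F*G without allocating any temporary matrix.
// A must not alias any of the inputs.
template <std::size_t N>
void sum_of_products(const Mat<N>& B, const Mat<N>& C,
                     const Mat<N>& D, const Mat<N>& E,
                     const Mat<N>& F, const Mat<N>& G, Mat<N>& A) {
    A.fill(0.0);          // reuse the preallocated output
    mul_add<N>(B, C, A);  // A  = B*C
    mul_add<N>(D, E, A);  // A += D*E
    mul_add<N>(F, G, A);  // A += F*G
}
```

The size has to be given explicitly, e.g. sum_of_products<4>(B, C, D, E, F, G, A). If this runs inside a loop, A is allocated once outside the loop and reused on every iteration.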
However, development time in C++ is also about 10x longer than in Matlab!
The reasons why you should use Matlab instead of C++:
- Data visualization. I think my career can go on without C++, but I won't be able to survive without Matlab just because it can generate beautiful plots!
- Low-efficiency but mathematically robust built-in routines and toolboxes. Get the correct answer first and then talk about efficiency. People can make subtle mistakes in C++ (for example implicitly converting double to int) and get sort of correct results.
- Expressing your ideas and presenting your code to your colleagues. Matlab code is much easier to read and much shorter than C++, and Matlab code can be correctly executed without a compiler. I just refuse to read other people's C++ code. I don't even use the C++ GNU scientific libraries because the code quality is not guaranteed. It is dangerous for a researcher/engineer to use a C++ library as a black box and take its accuracy for granted. Even for commercial C/C++ libraries, I remember the Intel compiler had a sign error in its sin() function last year, and numerical accuracy problems have also occurred in MKL.
- Debugging a Matlab script with the interactive console and workspace is a lot more efficient than using a C++ debugger. Finding an index calculation bug in Matlab can be done within minutes, but in C++ it can take hours to figure out why the program crashes randomly if boundary checks are removed for the sake of speed.
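As an illustration of the "implicitly convert double to int" mistake mentioned in the list above, here is a small made-up example (the function names are hypothetical). Both functions compile, and the buggy one returns something that looks plausible. Both assume a non-empty input.

```cpp
#include <vector>

// Buggy: the accumulator is an int, so every partial sum is silently
// truncated when it is stored back.
double mean_truncating(const std::vector<double>& x) {
    int sum = 0;                 // bug: should be double
    for (double v : x) {
        sum += v;                // computed in double, then truncated to int
    }
    return static_cast<double>(sum) / static_cast<double>(x.size());
}

// Correct: accumulate in double.
double mean_correct(const std::vector<double>& x) {
    double sum = 0.0;
    for (double v : x) {
        sum += v;
    }
    return sum / static_cast<double>(x.size());
}
```

For the input {1.5, 2.5, 3.0} the first version returns 2.0 while the true mean is about 2.33, which is exactly the "sort of correct" kind of result that is easy to miss. Compiling with warnings such as -Wconversion (GCC/Clang) is one way to catch this.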
Last but not least:
Because there is not much left for a programmer to optimize once Matlab code is vectorized, Matlab code performance is much less sensitive to code quality than C++ code performance. Therefore it is best to optimize computational algorithms in Matlab, where marginally better algorithms normally have marginally better performance. On the other hand, testing algorithms in C++ requires a decent programmer to write the algorithms optimized more or less in the same way, and to make sure the compiler does not optimize the algorithms differently.
My recent experience in C++ and Matlab:
I made several large Matlab data analysis tools in the past year and suffered from the slow speed of Matlab. But I was able to improve my Matlab program speed by 10x through the following techniques:
- Run/profile the Matlab script, re-implement critical routines in C/C++ and compile with MEX. Critical routines are most likely logically simple but numerically heavy. This improves speed by 5x.
- Simplify ".m" files shipped with Matlab toolboxes by commenting out all unnecessary safety checks and output parameter computations. Please be reminded that the modified code cannot be distributed with the rest of the user scripts. This improves speed by another 2x (after C/C++ and MEX).
The improved code is ~98% in Matlab and ~2% in C++.
I believe it is possible to improve the speed by another 2x (20x in total) if the entire tool is coded in C++; that would be a ~100x speed improvement of the computation routines. Hard drive I/O will then dominate the program run time.
Question for Mathworks engineers:
When Matlab code is fully vectorized, one of the performance-limiting factors is the matrix indexing operation. For instance, a finite difference operation needs to be performed on a matrix A which has a dimension of 5000x5000:
B = A(:,2:end)-A(:,1:end-1)
The matrix indexing operation makes the Matlab code multiple times slower than the C++ code. Can the matrix indexing performance be improved?
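For comparison, a C++ counterpart of that finite difference could look roughly like the sketch below. It assumes A is stored column-major in a flat std::vector (as Matlab stores matrices); the function name column_diff is made up. With that layout the two index ranges A(:,2:end) and A(:,1:end-1) are just the same buffer shifted by one column, so no indexed copies are needed.

```cpp
#include <cstddef>
#include <vector>

// B = A(:,2:end) - A(:,1:end-1) for a column-major rows x cols matrix A.
// Returns a column-major rows x (cols-1) matrix.
std::vector<double> column_diff(const std::vector<double>& A,
                                std::size_t rows, std::size_t cols) {
    if (rows == 0 || cols < 2) {
        return {};                      // nothing to difference
    }
    const std::size_t n = rows * (cols - 1);
    std::vector<double> B(n);
    // In column-major storage, element k of column j+1 sits exactly
    // `rows` entries after element k of column j.
    for (std::size_t k = 0; k < n; ++k) {
        B[k] = A[k + rows] - A[k];
    }
    return B;
}
```

The single contiguous loop is also something a compiler can usually auto-vectorize, though whether it does depends on the compiler and flags.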
Answered by DCS
In my experience (several years of Computer Vision and image processing in both languages) there is no simple answer to this question, as Matlab performance depends strongly (and much more than C++ performance) on your coding style.
Generally, Matlab wraps the classic C++/Fortran-based linear algebra libraries. So anything like x = A\b is going to be very fast. Also, Matlab does a good job of choosing the most efficient solver for these types of problems, so for x = A\b Matlab will look at the size of your matrices and choose the appropriate low-level routines.
Matlab also shines in data manipulation of large matrices if you "vectorize" your code, i.e. if you avoid for loops and use index arrays or boolean arrays to access your data. This stuff is highly optimised.
For other routines, some are written in Matlab code, while others point to a C/C++ implementation (e.g. the Delaunay stuff). You can check this yourself by typing edit some_routine.m. This opens the code and you see whether it is all Matlab or just a wrapper for something compiled.
Matlab, I think, is primarily for comfort - but comfort translates to coding time and ultimately money which is why Matlab is used in the industry. Also, it is easy to learn for engineers from other fields than computer science, with little training in programming.
Answered by Fabio Veronese
As a PhD student too, and a Matlab user for 10 years, I'm glad to share my POV:
Matlab is a great tool for developing and prototyping algorithms, especially when dealing with GUIs and high-level analysis (frequency domain, LS optimization, etc.): fast coding, powerful syntax (think about [], {}, : etc.).
As soon as your processing chain is more stable and well defined and the data dimensions grow, move to C/C++.
The main Matlab limitation arises from its language being script-like: as long as you avoid any loops (using arrayfun, cellfun or other matrix procedures), performance is high, since the called subroutines are again in C/C++.
Answered by Jonas
Matlab does very well with linear algebra and array/matrix operations, since they seem to have been doing some extra optimizations on the underlying operations - if you want to beat Matlab there, you would need a similarly optimized BLAS/LAPACK library.
As an interpreted language, Matlab loses time whenever a Matlab function is called, due to internal overhead, which traditionally meant that Matlab loops were slow. This has been alleviated somewhat in recent years thanks to significant improvements in the JIT compiler (search for "performance" questions on Matlab on SO for examples). As a consequence of the function call overhead, all Matlab functions that have not been implemented in C/C++ behind the scenes (call edit functionName to see whether it's written in Matlab) risk being slower than a C/C++ counterpart.
Finally, Matlab attempts to be user friendly, and may do "unnecessary" input checking that can take time (due to function call overhead). For example, if you know that ismember gets sorted inputs, you can call ismembc directly (the behind-the-scenes compiled function), saving quite a bit of time.
Answered by usr1234567
Your question is difficult to answer. In general C++ is faster, but if you make use of Matlab's well-written algorithms, it can outperform C++. In some cases Matlab can parallelize your code, which often has to be done manually in C++. Matlab can also, to some extent, export C++ code.
So my conclusion is that you have to measure the performance of both programs to get an answer. But then you are comparing your two implementations, not Matlab and C++ in general.
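On the C++ side, measuring can be as simple as the following sketch built on std::chrono. The helper name best_time_seconds and the choice of reporting the best of several repeats are just one possible convention, not something from this answer. On the Matlab side, tic/toc or timeit serve the same purpose.

```cpp
#include <chrono>

// Runs `work` `repeats` times and returns the shortest wall-clock time
// in seconds. Taking the minimum reduces noise from other processes.
template <typename F>
double best_time_seconds(F&& work, int repeats = 5) {
    using clock = std::chrono::steady_clock;  // monotonic, suited for timing
    double best = -1.0;
    for (int r = 0; r < repeats; ++r) {
        const auto t0 = clock::now();
        work();
        const auto t1 = clock::now();
        const double s = std::chrono::duration<double>(t1 - t0).count();
        if (best < 0.0 || s < best) {
            best = s;
        }
    }
    return best;
}
```

Usage would be something like best_time_seconds([&] { run_simulation(); }), where run_simulation stands for your own code. Make sure the result of the work is actually used somewhere, otherwise an optimizing compiler may remove the computation you are trying to time.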
Answered by Ray
I think you can consider the difference in at least four respects.
1. Compiled vs Interpreted
2. Strongly-typed vs Dynamically-typed
3. Performance vs Fast-prototyping
4. Special strengths
Points 1-3 generalize easily to a comparison between two families of programming languages.
For 4, MATLAB is optimized for matrix operations. So if you can vectorize more code in MATLAB, the performance can be drastically boosted. Conversely, if many loops are required, never hesitate to use C++ or create a mex file.
It is a difficult question after all.
Answered by AndyZe
I saw a 5.5x speed improvement when switching from MATLAB to C++. This was for a robot controller: lots of loops and ODE solving. I spent many hours trying to optimize the MATLAB code and hardly any time optimizing the C++ (I'm sure it could have been 10x faster with a little more effort).
However, it was easy to add a GUI for the MATLAB code, so I still use it more often. Like others have said, it was nice to prototype first on MATLAB. That made the implementation on C++ much simpler.
Answered by Alex Granit
Some Matlab code uses standard linear algebra functions with multithreading built in. So it appears that they are faster than sequential C code.
Answered by mtall
Besides the speed of the final program, you should also take into account the total development time of your code, i.e., not only the time to write it, but also to debug it, etc. Matlab (and its open-source counterpart, Octave) can be good for quick prototyping due to its visualisation capabilities.
If you're using straight C++ (i.e., no matrix libraries), it may take you much longer to write C++ code that's equivalent to Matlab code (e.g., there might be no point in spending 10 hours writing C++ code that only runs 10 seconds quicker than a Matlab program that took 5 minutes to write).
However, there are dedicated C++ matrix libraries, such as Armadillo, which provide a Matlab-like API. This can be useful for writing performance critical code that can be called from Matlab, or for converting Matlab code into "real" programs.