C# Excel Interop - Efficiency and performance

Notice: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverFlow original: http://stackoverflow.com/questions/356371/

Time: 2020-08-04 00:20:23  Source: igfitidea

Excel Interop - Efficiency and performance

c# excel performance interop vsto

Asked by Vincent Van Den Berghe

I was wondering what I could do to improve the performance of Excel automation, as it can be quite slow if you have a lot going on in the worksheet...

Here are a few I found myself:

  • ExcelApp.ScreenUpdating = false -- turn off the redrawing of the screen

  • ExcelApp.Calculation = Excel.XlCalculation.xlCalculationManual -- turn off the calculation engine so Excel doesn't automatically recalculate when a cell value changes (turn it back on after you're done)

  • Reduce calls to Worksheet.Cells.Item(row, col) and Worksheet.Range -- I had to poll hundreds of cells to find the cell I needed. Implementing some caching of cell locations reduced the execution time from ~40 to ~5 seconds.
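
The caching idea in the last bullet could look roughly like the sketch below. It assumes the cell values were already fetched in a single read (for example one Range.Value2 call) into an object[,]; the class and method names are made up for illustration, and the positions returned are array indices, which only match worksheet coordinates if the range starts at A1.

```csharp
using System;
using System.Collections.Generic;

public static class CellIndex
{
    // Builds a lookup from cell text to its (row, column) index in the array.
    // GetLowerBound/GetUpperBound are used because arrays returned by Excel
    // are 1-based, while arrays created in C# are normally 0-based.
    public static Dictionary<string, (int Row, int Col)> Build(object[,] values)
    {
        var index = new Dictionary<string, (int Row, int Col)>();
        for (int r = values.GetLowerBound(0); r <= values.GetUpperBound(0); r++)
        {
            for (int c = values.GetLowerBound(1); c <= values.GetUpperBound(1); c++)
            {
                // Only text cells are indexed; the first occurrence wins.
                if (values[r, c] is string text && !index.ContainsKey(text))
                {
                    index[text] = (r, c);
                }
            }
        }
        return index;
    }
}
```

After the index is built, each lookup is a dictionary access instead of another interop call.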

What kind of interop calls take a heavy toll on performance and should be avoided? What else can you do to avoid unnecessary processing being done?

Answered by Dirk Vollmar

Performance also depends a lot on how you automate Excel. VBA is faster than COM automation, which in turn is faster than .NET automation. And typically early (compile time) binding is faster than late binding, too.

If you have serious performance problems, you could consider moving the critical parts of the code to a VBA module and calling that code from your COM/.NET automation code.
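
As a rough sketch of calling VBA from .NET, Application.Run can invoke a macro by name. The macro name "ProcessData" and its single argument are hypothetical; the workbook containing the macro is assumed to be open already, and the project is assumed to reference the Excel primary interop assembly.

```csharp
using Excel = Microsoft.Office.Interop.Excel;

public static class MacroCaller
{
    // Runs a VBA macro inside the Excel process and returns its result (if any).
    // "ProcessData" is a placeholder name, not something from the original answer.
    public static object CallProcessData(Excel.Application app, string sheetName)
    {
        // The loop-heavy work happens in VBA; only this one call crosses
        // the COM boundary.
        return app.Run("ProcessData", sheetName);
    }
}
```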

If you use .NET you should also use the optimized primary interop assemblies available from Microsoft and not use custom-built interop assemblies.

Answered by Treb

Use Excel's built-in functionality whenever possible. For example, instead of searching a whole column for a given string, use the Find command (available in the GUI via Ctrl-F):

Set Found = Cells.Find(What:=SearchString, LookIn:=xlValues, _
    SearchOrder:=xlByRows, SearchDirection:=xlNext, _
    MatchCase:=False, SearchFormat:=False)

If Not Found Is Nothing Then
    Found.Activate
    ' (...)
End If
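
For C# callers, the same search might be written as in the sketch below. It assumes a reference to the Excel primary interop assembly and C# 4 or later (for named and optional arguments); the helper name is made up, and LookAt is set explicitly here as an extra assumption because Excel otherwise reuses the setting from the previous search.

```csharp
using Excel = Microsoft.Office.Interop.Excel;

public static class SheetSearch
{
    // Returns the first cell whose value contains searchString,
    // or null when nothing matches.
    public static Excel.Range FindFirst(Excel.Worksheet sheet, string searchString)
    {
        return sheet.Cells.Find(
            What: searchString,
            LookIn: Excel.XlFindLookIn.xlValues,
            LookAt: Excel.XlLookAt.xlPart,
            SearchOrder: Excel.XlSearchOrder.xlByRows,
            SearchDirection: Excel.XlSearchDirection.xlNext,
            MatchCase: false);
    }
}
```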

If you want to sort some lists, use the Excel Sort command; don't do it manually in VBA:

Selection.Sort Key1:=Range("A1"), Order1:=xlAscending, Header:=xlGuess, _
    OrderCustom:=1, MatchCase:=False, Orientation:=xlTopToBottom, _
    DataOption1:=xlSortNormal

Answered by Jon Fournier

If you're polling values of many cells you can get all the cell values in a range stored in a variant array in one fell swoop:

Dim CellVals() as Variant
CellVals = Range("A1:B1000").Value

There is a tradeoff here, in terms of the size of the range you're getting values for. I'd guess if you need a thousand or more cell values this is probably faster than just looping through different cells and polling the values.

Answered by Anonymous Type

When using C# or VB.Net to either get or set a range, figure out what the total size of the range is, and then get or set it as one large two-dimensional object array...

//get values (arrays returned by Excel are 1-based)
object[,] objectArray = (object[,])shtName.get_Range("A1:Z100").Value2;
iFace = Convert.ToInt32(objectArray[1,1]);

//set values (3 rows x 1 column)
object[,] valuesToWrite = new object[3,1] {{"A"},{"B"},{"C"}};
rngName.Value2 = valuesToWrite;

Note that it's important to know what data type Excel is storing (text or numbers), as no conversion is done automatically when you convert the values back from the object array. Add tests if necessary to validate the data if you can't be sure of its type beforehand.
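
A minimal sketch of such a test follows. It assumes the usual Value2 behaviour, where numeric cells arrive as double, text cells as string and empty cells as null; the helper name is hypothetical.

```csharp
using System;
using System.Globalization;

public static class CellValue
{
    // Tries to read a cell value taken from a Value2 array as an int.
    // Returns false instead of throwing when the cell holds something else.
    public static bool TryReadInt(object cell, out int result)
    {
        result = 0;
        switch (cell)
        {
            // Numeric cell: accept only whole numbers that fit in an int.
            case double d when d == Math.Floor(d) && d >= int.MinValue && d <= int.MaxValue:
                result = (int)d;
                return true;
            // Text cell: accept it only if the text parses as an integer.
            case string s:
                return int.TryParse(s, NumberStyles.Integer, CultureInfo.InvariantCulture, out result);
            // Empty cell (null), booleans, fractional numbers and anything else.
            default:
                return false;
        }
    }
}
```

Calling CellValue.TryReadInt(objectArray[1,1], out int value) then replaces a bare Convert.ToInt32 wherever the cell type is not guaranteed.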

Answered by Ritesh

This is for anyone wondering what the best way is to populate an excel sheet from a db result set. This is not meant to be a full list by any means but it does list a few options.

Here are some performance numbers from populating an Excel sheet with 155 columns and 4200 records on an old Pentium 4 3GHz box. The timings include data retrieval time, which was never more than 10 seconds. In order from slowest to fastest:

  1. One cell at a time - Just under 11 minutes

  2. Populating a dataset by converting to html + Saving html to disk + Loading html into excel and saving worksheet as xls/xlsx - 5 minutes

  3. One column at a time - 4 minutes

  4. Using the deprecated sp_makewebtask procedure in SQL 2005 to create an HTML file - 9 Seconds + Followed by loading the html file in excel and saving as XLS/XLSX - About 2 minutes.

  5. Convert .Net dataset to ADO RecordSet and use the WorkSheet.Range[].CopyFromRecordset function to populate excel - 45 seconds!

I ended up using option 5. Hope this helps.

Answered by Charles Williams

As Anonymous Type says: reading/writing large range blocks is very important to performance.
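
To illustrate the block-write idea, the sketch below flattens a DataTable into one object[,] so the whole result can be assigned to a range in a single call. The class name is made up, and the final assignment to Excel is shown only as a comment because it needs a live worksheet.

```csharp
using System;
using System.Data;

public static class TableBlock
{
    // Copies every row and column of the table into one 2D array.
    public static object[,] ToArray(DataTable table)
    {
        int rows = table.Rows.Count;
        int cols = table.Columns.Count;
        var block = new object[rows, cols];
        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                object value = table.Rows[r][c];
                // Database NULLs become null, which leaves the cell empty.
                block[r, c] = value is DBNull ? null : value;
            }
        }
        return block;
    }
}

// Assumed usage with a live worksheet (not run here):
//   object[,] block = TableBlock.ToArray(table);
//   Excel.Range target = sheet.Range["A1"].Resize[table.Rows.Count, table.Columns.Count];
//   target.Value2 = block;
```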

In cases where the COM-Interop overhead is still too large you may want to switch to using the XLL interface, which is the fastest Excel interface.

Although the XLL interface is primarily meant for C++ users, both XL DNA and Addin Express provide .NET to XLL bridge capability which is significantly faster than COM-Interop.

Answered by JamesFaix

Another big thing you can do in VBA is to use Option Explicit and avoid Variants wherever possible. Variants are not 100% avoidable in VBA, but they make the interpreter do more work at runtime and waste memory.

I found this article very helpful when I was starting with VBA in Excel.
http://www.ozgrid.com/VBA/SpeedingUpVBACode.htm

And this book

http://www.amazon.com/VB-VBA-Nutshell-Language-OReilly/dp/1565923588

Similar to

 app.ScreenUpdating = false //and
 app.Calculation = xlCalculationManual

you can also set

 app.EnableEvents = false //Prevent Excel events
 app.Interactive = false  //Prevent user clicks and keystrokes

although they don't seem to make as big a difference as the first two.
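
Since these switches need to be restored even when the work fails, one way to package them is the try/finally sketch below. It assumes a reference to the Excel primary interop assembly and an open workbook (reading Calculation can fail without one); the helper name is made up, and Interactive could be handled the same way.

```csharp
using System;
using Excel = Microsoft.Office.Interop.Excel;

public static class ExcelBatch
{
    // Runs 'work' with redrawing, events and automatic calculation turned off,
    // then restores whatever settings were in effect before.
    public static void Run(Excel.Application app, Action work)
    {
        bool screenUpdating = app.ScreenUpdating;
        bool enableEvents = app.EnableEvents;
        Excel.XlCalculation calculation = app.Calculation;
        try
        {
            app.ScreenUpdating = false;
            app.EnableEvents = false;
            app.Calculation = Excel.XlCalculation.xlCalculationManual;
            work();
        }
        finally
        {
            // Restore in reverse order, even if 'work' threw an exception.
            app.Calculation = calculation;
            app.EnableEvents = enableEvents;
            app.ScreenUpdating = screenUpdating;
        }
    }
}
```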

Similar to setting Range values to arrays, if you are working with data that is mostly tables with the same formula in every row of a column, you can use R1C1 formula notation for your formula and set an entire column equal to the formula string to set the whole thing in one call.

app.ReferenceStyle = xlR1C1
app.ActiveSheet.Columns(2).FormulaR1C1 = "=SUBSTITUTE(RC[-1],""foo"",""bar"")"

Creating XLL add-ins using ExcelDNA and .NET (or the hard way in C) is also the only way you can get UDFs to run on multiple threads. (See Excel DNA's ExcelFunction attribute's IsThreadSafe property.)
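
A thread-safe UDF in Excel DNA might be declared roughly as below. This is a sketch that assumes the ExcelDna.Integration package is referenced and the add-in is built and loaded as an XLL; the function name and its behaviour are invented for illustration.

```csharp
using ExcelDna.Integration;

public static class TextFunctions
{
    // IsThreadSafe = true tells Excel it may call this function from
    // several calculation threads at once, so it must not touch shared
    // state or call back into the Excel object model.
    [ExcelFunction(Description = "Replaces foo with bar", IsThreadSafe = true)]
    public static string ReplaceFoo(string text)
    {
        return text.Replace("foo", "bar");
    }
}
```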

Before I transitioned to Excel DNA completely, I also experimented with creating COM visible libraries in .NET to reference in VBA projects. Heavy text processing is a bit faster than VBA that way, as is using wrapped .NET List classes instead of VBA's Collection, but Excel DNA is better.
