如何在不丢失任何列的情况下读取 c# 中的 excel 文件?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/571011/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How do I read an excel file in c# without missing any columns?
提问by Austin
I've been using an OleDb connection to read excel files successfully for quite a while now, but I've run across a problem. I've got someone who is trying to upload an Excel spreadsheet with nothing in the first column and when I try to read the file, it doesn't recognize that column.
一段时间以来,我一直在使用 OleDb 连接成功读取 excel 文件,但是我遇到了一个问题。我有人试图上传第一列中没有任何内容的 Excel 电子表格,当我尝试读取该文件时,它无法识别该列。
I'm currently using the following OleDb connection string:
我目前使用以下 OleDb 连接字符串:
Provider=Microsoft.Jet.OLEDB.4.0;
Data Source=c:\test.xls;
Extended Properties="Excel 8.0;IMEX=1;"
提供商=Microsoft.Jet.OLEDB.4.0;
数据源=c:\test.xls;
扩展属性="Excel 8.0;IMEX=1;"
So, if there are 13 columns in the excel file, the OleDbDataReader I get back only has 12 columns/fields.
因此,如果 excel 文件中有 13 列,我返回的 OleDbDataReader 只有 12 列/字段。
Any insight would be appreciated.
任何见解将不胜感激。
采纳答案by Joe Erickson
SpreadsheetGear for .NETgives you an API for working with xls and xlsx workbooks from .NET. It is easier to use and faster than OleDB or the Excel COM object model. You can see the live samplesor try it for yourself with the free trial.
SpreadsheetGear for .NET为您提供了一个 API,用于处理来自 .NET 的 xls 和 xlsx 工作簿。它比 OleDB 或 Excel COM 对象模型更易于使用且速度更快。您可以查看实时样本或通过免费试用亲自试用。
Disclaimer: I own SpreadsheetGear LLC
免责声明:我拥有 SpreadsheetGear LLC
EDIT:
编辑:
StingyHyman commented "Faster than OleDb? Better back that claim up".
StingyHyman 评论说“比 OleDb 还快?更好地支持这一说法”。
This is a reasonable request. I see claims all the time which I know for a fact to be false, so I cannot blame anyone for being skeptical.
这是一个合理的要求。我一直看到我知道的事实是错误的说法,所以我不能责怪任何人持怀疑态度。
Below is the code to create a 50,000 row by 10 column workbook with SpreadsheetGear, save it to disk, and then sum the numbers using OleDb and SpreadsheetGear. SpreadsheetGear reads the 500K cells in 0.31 seconds compared to 0.63 seconds with OleDB - just over twice as fast. SpreadsheetGear actually creates and reads the workbook in less time than it takes to read the workbook with OleDB.
下面是使用 SpreadsheetGear 创建 50,000 行 x 10 列工作簿的代码,将其保存到磁盘,然后使用 OleDb 和 SpreadsheetGear 对数字求和。与 OleDB 的 0.63 秒相比,SpreadsheetGear 在 0.31 秒内读取 500K 单元格 - 快两倍多。与使用 OleDB 读取工作簿相比,SpreadsheetGear 实际上创建和读取工作簿所需的时间更少。
The code is below. You can try it yourself with the SpreadsheetGear free trial.
代码如下。您可以通过 SpreadsheetGear 免费试用版自行尝试。
using System;
using System.Data;
using System.Data.OleDb;
using SpreadsheetGear;
using SpreadsheetGear.Advanced.Cells;
using System.Diagnostics;
namespace SpreadsheetGearAndOleDBBenchmark
{
class Program
{
static void Main(string[] args)
{
// Warm up (get the code JITed).
BM(10, 10);
// Do it for real.
BM(50000, 10);
}
static void BM(int rows, int cols)
{
// Compare the performance of OleDB to SpreadsheetGear for reading
// workbooks. We sum numbers just to have something to do.
//
// Run on Windows Vista 32 bit, Visual Studio 2008, Release Build,
// Run Without Debugger:
// Create time: 0.25 seconds
// OleDb Time: 0.63 seconds
// SpreadsheetGear Time: 0.31 seconds
//
// SpreadsheetGear is more than twice as fast at reading. Furthermore,
// SpreadsheetGear can create the file and read it faster than OleDB
// can just read it.
string filename = @"C:\tmp\SpreadsheetGearOleDbBenchmark.xls";
Console.WriteLine("\nCreating {0} rows x {1} columns", rows, cols);
Stopwatch timer = Stopwatch.StartNew();
double createSum = CreateWorkbook(filename, rows, cols);
double createTime = timer.Elapsed.TotalSeconds;
Console.WriteLine("Create sum of {0} took {1} seconds.", createSum, createTime);
timer = Stopwatch.StartNew();
double oleDbSum = ReadWithOleDB(filename);
double oleDbTime = timer.Elapsed.TotalSeconds;
Console.WriteLine("OleDb sum of {0} took {1} seconds.", oleDbSum, oleDbTime);
timer = Stopwatch.StartNew();
double spreadsheetGearSum = ReadWithSpreadsheetGear(filename);
double spreadsheetGearTime = timer.Elapsed.TotalSeconds;
Console.WriteLine("SpreadsheetGear sum of {0} took {1} seconds.", spreadsheetGearSum, spreadsheetGearTime);
}
static double CreateWorkbook(string filename, int rows, int cols)
{
IWorkbook workbook = Factory.GetWorkbook();
IWorksheet worksheet = workbook.Worksheets[0];
IValues values = (IValues)worksheet;
double sum = 0.0;
Random rand = new Random();
// Put labels in the first row.
foreach (IRange cell in worksheet.Cells[0, 0, 0, cols - 1])
cell.Value = "Cell-" + cell.Address;
// Using IRange and foreach be less code,
// but we'll do it the fast way.
for (int row = 1; row <= rows; row++)
{
for (int col = 0; col < cols; col++)
{
double number = rand.NextDouble();
sum += number;
values.SetNumber(row, col, number);
}
}
workbook.SaveAs(filename, FileFormat.Excel8);
return sum;
}
static double ReadWithSpreadsheetGear(string filename)
{
IWorkbook workbook = Factory.GetWorkbook(filename);
IWorksheet worksheet = workbook.Worksheets[0];
IValues values = (IValues)worksheet;
IRange usedRahge = worksheet.UsedRange;
int rowCount = usedRahge.RowCount;
int colCount = usedRahge.ColumnCount;
double sum = 0.0;
// We could use foreach (IRange cell in usedRange) for cleaner
// code, but this is faster.
for (int row = 1; row <= rowCount; row++)
{
for (int col = 0; col < colCount; col++)
{
IValue value = values[row, col];
if (value != null && value.Type == SpreadsheetGear.Advanced.Cells.ValueType.Number)
sum += value.Number;
}
}
return sum;
}
static double ReadWithOleDB(string filename)
{
String connectionString =
"Provider=Microsoft.Jet.OLEDB.4.0;" +
"Data Source=" + filename + ";" +
"Extended Properties=Excel 8.0;";
OleDbConnection connection = new OleDbConnection(connectionString);
connection.Open();
OleDbCommand selectCommand =new OleDbCommand("SELECT * FROM [Sheet1$]", connection);
OleDbDataAdapter dataAdapter = new OleDbDataAdapter();
dataAdapter.SelectCommand = selectCommand;
DataSet dataSet = new DataSet();
dataAdapter.Fill(dataSet);
connection.Close();
double sum = 0.0;
// We'll make some assumptions for brevity of the code.
DataTable dataTable = dataSet.Tables[0];
int cols = dataTable.Columns.Count;
foreach (DataRow row in dataTable.Rows)
{
for (int i = 0; i < cols; i++)
{
object val = row[i];
if (val is double)
sum += (double)val;
}
}
return sum;
}
}
}
回答by Beep beep
We always use Excel Interop to open the spreadsheet and parse directly (e.g. similar to how you would scan through cells in VBA), or we create locked down templates that enforce certain columns to be filled in before the user can save the data.
我们总是使用 Excel Interop 打开电子表格并直接解析(例如类似于您在 VBA 中扫描单元格的方式),或者我们创建锁定模板,强制在用户保存数据之前填充某些列。
回答by StingyHyman
If could require the format of the excel sheet to have column headers, then you would always have the 13 columns. You would just need to skip the header row when processing.
如果可能要求 Excel 工作表的格式具有列标题,那么您将始终拥有 13 列。您只需要在处理时跳过标题行。
This would also correct situations where the user puts the columns in an order that you are not expecting. (detect column indexes in the header row and read appropriately)
这也将纠正用户按您不期望的顺序放置列的情况。(检测标题行中的列索引并正确读取)
I see that others are recommending the Excel interop, but jeez that's a slow option compared to the OleDb way. Plus it requires Excel or OWC to be installed on the server (licensing).
我看到其他人正在推荐 Excel 互操作,但是与 OleDb 方式相比,这是一个缓慢的选择。此外,它需要在服务器上安装 Excel 或 OWC(许可)。
回答by StingyHyman
You might try using Excel and COM. That way, you'll be getting your info straight form the horse's mouth, as it were.
您可以尝试使用 Excel 和 COM。这样,您就可以直接从马口中获取信息。
From D. Anand over on the MSDN forums:
来自 D. Anand 在 MSDN 论坛上:
Create a reference in your project to Excel Objects Library. The excel object library can be added in the COM tab of adding reference dialog.
在您的项目中创建对 Excel 对象库的引用。可以在添加引用对话框的 COM 选项卡中添加 excel 对象库。
Here's some info on the Excel object model in C# http://msdn.microsoft.com/en-us/library/aa168292(office.11).aspx
以下是 C# 中 Excel 对象模型的一些信息 http://msdn.microsoft.com/en-us/library/aa168292(office.11).aspx
回答by abatishchev
I recommend you to try Visual Studio Tools for Office and Excel Interop! It's using is very easy.
我建议您尝试 Visual Studio Tools for Office 和 Excel Interop!它的使用非常简单。
回答by abatishchev
You can probably look at ExcelMapper. It is a tool to read excel files as strongly typed objects. It hides all the details of reading an excel from your code. It would take care if your excel is missing a column or data is missing from a column. You read data that you are interested in. You can get the code/executable for ExcelMapper from http://code.google.com/p/excelmapper/.
你大概可以看看 ExcelMapper。它是一种将 excel 文件作为强类型对象读取的工具。它隐藏了从代码中读取 excel 的所有细节。如果您的 excel 缺少列或列中缺少数据,则需要小心。您阅读了您感兴趣的数据。您可以从http://code.google.com/p/excelmapper/获取 ExcelMapper 的代码/可执行文件。