Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/926453/

Time: 2020-08-06 03:23:21  Source: igfitidea

Parsing an Excel file in C#, the cells seem to get cut off at 255 characters... how do I stop that?

Tags: c#, linq, excel, excel-2007, xlsx

Asked by naspinski

I am parsing uploaded Excel files (xlsx) in ASP.NET with C#. I am using the following code (simplified):

// Requires: using System.Data; using System.Data.OleDb; using System.Linq;
string connString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileLocation + ";Extended Properties=\"Excel 12.0 Xml;HDR=YES\";";
DataSet ds = new DataSet();
// The using block disposes the adapter even if Fill throws.
using (OleDbDataAdapter adapter = new OleDbDataAdapter("SELECT * FROM [Sheet1$]", connString))
{
    adapter.Fill(ds);
}
DataTable dt = ds.Tables[0];
// p[2] is the third column of each row.
var rows = from p in dt.AsEnumerable() select new { desc = p[2] };

This works perfectly, but if a cell contains more than 255 characters, the value gets cut off. Any idea what I am doing wrong? Thank you.

EDIT: When viewing the excel sheet, it shows much more than 255 characters, so I don't believe the sheet itself is limited.

Accepted answer by Chris Doggett

Just from a quick Googling of the subject, it appears that that's a limit of Excel.

EDIT: Possible workaround (unfortunately in VB)

Answer by James

Have you tried setting the column's data type to text within the spreadsheet? I believe doing this will allow the cells to contain much more than 255 characters.

[Edit] For what it's worth, this dialog with the MS-Excel team is an interesting read. In the comments section at the bottom they discuss the 255-character cutoff. They say Excel 12 can support 32k characters per cell.

If that is true, there must be a way to get at this data. Here are two things to consider.

  1. In the past I have used the "IMEX=1" option in my connection string to deal with columns containing mixed data showing up as empty. It's a longshot, but you might give that a try.

  2. Could you export the file to a tab delimited flat file? IMHO this is the most reliable way of dealing with Excel data, since Excel does have so many gotchas.
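
For the second suggestion, a minimal sketch of reading a tab-delimited export in C# might look like the following. The class and method names are hypothetical, and the sketch assumes no field contains an embedded tab or line break (Excel quotes such fields on export, which this code does not handle).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class TabDelimitedReader
{
    // Splits each non-empty line on tab characters.
    // Assumption: fields never contain embedded tabs or line breaks.
    public static List<string[]> Parse(IEnumerable<string> lines)
    {
        return lines
            .Where(line => line.Length > 0)
            .Select(line => line.Split('\t'))
            .ToList();
    }

    // Hypothetical convenience wrapper: reads the exported file from disk.
    public static List<string[]> ReadFile(string path)
    {
        return Parse(File.ReadLines(path));
    }
}
```

Because the file is read as plain text, no driver guesses a column type, so long values come through at full length.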

Answer by Joe Erickson

SpreadsheetGear for .NET can read and write (and more) xls and xlsx workbooks and supports the same text limits as Excel; in other words, it will just work. There is a free evaluation if you want to give it a try.

Disclaimer: I own SpreadsheetGear LLC

Answer by Der Wolf

Regarding the last post, I also use SpreadsheetGear and find that it also suffers from the 255 characters per cell limitation when reading from the older XLS (not XLSX) format.

Answer by Andrew Garrison

The Solution!

I've been battling this today as well. I finally got it to work by modifying some registry keys before parsing the Excel spreadsheet.

You must update this registry key before parsing the Excel spreadsheet:

// Excel 2010
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\14.0\Access Connectivity Engine\Engines\Excel\
or
HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\Microsoft\Office\14.0\Access Connectivity Engine\Engines\Excel\

// Excel 2007
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\12.0\Access Connectivity Engine\Engines\Excel\

// Excel 2003
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel\

Change TypeGuessRows to 0 and ImportMixedTypes to Text under this key. You'll also need to update your connection string to include IMEX=1 in the extended properties:

string connString = string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileLocation + ";Extended Properties=\"Excel 12.0 Xml;HDR=YES;IMEX=1\";");
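
If you would rather make the registry change from code than by hand, a rough sketch could look like this. Both helpers are hypothetical names, not part of any library. The sketch assumes a Windows machine, a process with permission to write under HKEY_LOCAL_MACHINE (normally an administrator), and that you pass in the key path matching your installed driver. Note that a 32-bit process is usually redirected to the WOW6432Node view on 64-bit Windows.

```csharp
using System;
using Microsoft.Win32;

static class ExcelImportSetup
{
    // Hypothetical helper: builds the connection string from this answer,
    // including IMEX=1 in the extended properties.
    public static string BuildConnectionString(string fileLocation)
    {
        return "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" + fileLocation
             + ";Extended Properties=\"Excel 12.0 Xml;HDR=YES;IMEX=1\";";
    }

    // Hypothetical helper: writes the two values named in this answer.
    // engineKeyPath is relative to HKEY_LOCAL_MACHINE.
    // Returns false if the key does not exist; throws if access is denied.
    public static bool ForceTextImport(string engineKeyPath)
    {
        using (RegistryKey key = Registry.LocalMachine.OpenSubKey(engineKeyPath, true))
        {
            if (key == null) return false;
            key.SetValue("TypeGuessRows", 0, RegistryValueKind.DWord);
            key.SetValue("ImportMixedTypes", "Text", RegistryValueKind.String);
            return true;
        }
    }
}
```

For example, ExcelImportSetup.ForceTextImport(@"SOFTWARE\Microsoft\Jet\4.0\Engines\Excel") targets the Jet 4.0 key quoted in the MSDN excerpt in the references. Changing these values affects every application on the machine that uses the same driver, so it is worth testing on a non-production box first.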


References

http://blogs.vertigo.com/personal/aanttila/Blog/archive/2008/03/28/excel-and-csv-reference.aspx

http://msdn.microsoft.com/en-us/library/ms141683.aspx

...characters may be truncated. To import data from a memo column without truncation, you must make sure that the memo column in at least one of the sampled rows contains a value longer than 255 characters, or you must increase the number of rows sampled by the driver to include such a row. You can increase the number of rows sampled by increasing the value of TypeGuessRows under the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel registry key....

Answer by Dai Bok

I have come across this, and the solution that worked for me was to move the cells with long text to the top of the spreadsheet.

I found this comment in a forum describing the issue:

This is an issue with the Jet OLEDB provider. It looks at the first 8 rows
of the spreadsheet to determine the data type in each column. If the column does
not contain a field value over 256 characters in the first 8 rows, then it assumes the
data type is text, which has a character limit of 256. The following KB article has
more information on this issue: http://support.microsoft.com/kb/281517
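
To make the behaviour described in that comment concrete, here is a small simulation in C#. It is not the provider's actual code, only a sketch of the sampling rule as described: look at the first few rows, and if none holds a long value, treat the column as short text and cut every value to that length. The class name, the sampleRows parameter (standing in for TypeGuessRows) and the 255-character limit (taken from the MSDN excerpt rather than the 256 in the forum comment) are assumptions for illustration, and treating 0 as "scan every row" is a simplification.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class TypeGuessSimulation
{
    const int ShortTextLimit = 255; // assumption, per the MSDN excerpt

    // Returns the column values as the simulated provider would hand them back.
    // sampleRows stands in for TypeGuessRows; 0 means "scan every row" here.
    public static List<string> ReadColumn(IList<string> cells, int sampleRows = 8)
    {
        IEnumerable<string> sample = sampleRows > 0 ? cells.Take(sampleRows) : cells;

        // The column counts as long text (memo) only if a sampled cell is long.
        bool isMemo = sample.Any(c => c.Length > ShortTextLimit);

        return cells
            .Select(c => isMemo || c.Length <= ShortTextLimit
                ? c
                : c.Substring(0, ShortTextLimit))
            .ToList();
    }
}
```

With ten short rows followed by one 300-character row, the default call truncates the long value, while moving a long value into the first eight rows, or passing sampleRows = 0, preserves it. That matches why moving long cells to the top of the sheet works.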

Hope this helps someone else!