vb.net: NULL inserted for column values when importing from Excel into a DataSet
Original source: http://stackoverflow.com/questions/27855937/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
NULL being inserted for column values when importing from Excel to DataSet
Asked by Mudassir Hasan
I am importing Excel data into a DataTable in my application and am having a problem with some particular column values.
Some cells in the Excel sheet column CustomerUniqID show a warning with a green mark in the corner:
"The number is formatted as text or preceded by apostrophe."
The values of these cells are not imported; they show up blank when the DataSet is filled from the Excel sheet.
Dim query As String = "SELECT CINT(CustomerUniqID),[Status] FROM [Sheet1$]"
Dim conn As New OleDbConnection(conStr)
If conn.State = ConnectionState.Closed Then
    conn.Open()
End If
Dim cmd As New OleDbCommand(query, conn)
Dim da As New OleDbDataAdapter(cmd)
Dim ds As New DataSet()
da.Fill(ds)
My connection string is:
<add name ="Excel07ConString" connectionString="Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties='Excel 12.0;HDR=YES;IMEX=2'"/>
The column CustomerUniqID contains numbers, and I am unable to import the values of these cells. How can I do this?
Answered by ??ssa P?ngj?rdenlarp
The only way I could get it to fail as described in the original post was when the escaped/text cells were further down the sheet than in my original test. It is too bad that OleDB won't use a Schema.ini with an Excel file, because that would allow a very clean and simple solution, alas...
Sample data used:
Country   Capital   Population   Rank
France    Paris     2.25         7
Canada    Toronto   2.5          6
Egypt     Cairo     10.2         9
...
It actually uses 16 rows, with the last 3 "Rank" items escaped as text (e.g. '2). These all show the green corner warning sign in Excel.
Since OleDB does not read/use a schema, it decides the data type for each column from the first N rows (defined as 8 in my registry). When a later, escaped cell does not match that type, OleDB returns a DBNull value for it. Attempts to convert the column via SQL (CInt, Val) fail because OleDB has already decided that the data does not match before the conversion can be applied.
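A quick way to confirm that this is what happened is to count the DBNull cells in the suspect column after the fill. The following is only a minimal sketch: the `CountNulls` helper and the in-memory table are hypothetical stand-ins (not part of the original answer) for a DataTable filled from the sheet.

```vbnet
Imports System
Imports System.Data

Module NullCheckSketch
    ' Hypothetical helper: counts the DBNull cells in one column of a DataTable.
    Function CountNulls(table As DataTable, columnName As String) As Integer
        Dim count As Integer = 0
        For Each row As DataRow In table.Rows
            If row.IsNull(columnName) Then count += 1
        Next
        Return count
    End Function

    Sub Main()
        ' In-memory stand-in for a sheet whose Rank column OleDB typed as numeric
        Dim dt As New DataTable("Capitals")
        dt.Columns.Add("Rank", GetType(Double))
        dt.Rows.Add(7.0)
        dt.Rows.Add(6.0)
        dt.Rows.Add(DBNull.Value)   ' what a text-formatted cell ends up as
        Console.WriteLine("DBNull cells in Rank: " & CountNulls(dt, "Rank"))
    End Sub
End Module
```

A non-zero count on a column that looks fully populated in Excel is a strong hint that type guessing, rather than the data itself, is the cause.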
In some cases, I would read the sheet in twice: first getting the "good" columns in their proper data types into one DataTable, then getting the "dirty" column as text and converting the data manually. This is useful when there are other numeric columns in the data set and you don't want them to be converted to text/string.
For the case posted, if there really are only 2 columns involved, you should be able to use one table read in as text and add a numeric column to receive the converted value. Rather than converting from one table to another, convert from one column to another. (Just ask if you want an example, but it is just a subset of the following.)
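The column-to-column variant can be sketched roughly as follows. This is an assumption-laden illustration, not the answerer's code: the helper name `AddConvertedColumn`, the column name `CustomerUniqIDText`, and the in-memory table (standing in for a table filled through an all-text connection) are all hypothetical.

```vbnet
Imports System
Imports System.Data

Module SingleTableSketch
    ' Hypothetical helper: adds an Int32 column and fills it from a text column.
    Sub AddConvertedColumn(table As DataTable, textColumn As String, intColumn As String)
        table.Columns.Add(intColumn, GetType(Int32))
        For Each row As DataRow In table.Rows
            ' Drop a leading apostrophe in case the provider passes it through
            Dim raw As String = Convert.ToString(row(textColumn)).Trim().TrimStart("'"c)
            Dim parsed As Integer
            If Integer.TryParse(raw, parsed) Then
                row(intColumn) = parsed
            End If   ' otherwise the new cell stays DBNull
        Next
    End Sub

    Sub Main()
        ' Stand-in for a table that was read entirely as text
        Dim dt As New DataTable("Sheet1")
        dt.Columns.Add("CustomerUniqIDText", GetType(String))
        dt.Columns.Add("Status", GetType(String))
        dt.Rows.Add("1001", "Active")
        dt.Rows.Add("'1002", "Closed")

        AddConvertedColumn(dt, "CustomerUniqIDText", "CustomerUniqID")

        For Each row As DataRow In dt.Rows
            Console.WriteLine("{0} -> {1}", row("CustomerUniqIDText"), row("CustomerUniqID"))
        Next
    End Sub
End Module
```

Once the conversion is done, the text column can be removed with `dt.Columns.Remove(...)` if it is no longer needed.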
In either case, the "trick" is to use a different connection string to force OleDB to read the data as text. Apparently both HDR=No and IMEX=1 are needed for this, at least with my config:
Dim TextConStr = "Provider=Microsoft.ACE.OLEDB.12.0; Data Source=C:\Temp\capitals.xls;Extended Properties='Excel 8.0;HDR=NO;IMEX=1';"
This example/test code uses the 2-table approach to verify that other numerics (Population) are not converted, just Rank:
' Requires: Imports System.Data and Imports System.Data.OleDb
Dim dsPop As New DataSet()

' ConStr to allow OleDB to guess the datatypes
Dim TypedConStr = "Provider=Microsoft.ACE.OLEDB.12.0; Data Source=C:\Temp\capitals.xls;Extended Properties='Excel 8.0;HDR=Yes;IMEX=2';"
' ConStr to force OleDB to read it all as Text
Dim TextConStr = "Provider=Microsoft.ACE.OLEDB.12.0; Data Source=C:\Temp\capitals.xls;Extended Properties='Excel 8.0;HDR=NO;IMEX=1';"

' get the typed columns into a DT - skip Rank as dirty column
Dim SQL = "SELECT Country, Capital, Population FROM [Capitals$]"
Using con As New OleDbConnection(TypedConStr),
      da As New OleDbDataAdapter(SQL, con)
    dsPop.Tables.Add("Pop")
    da.Fill(dsPop.Tables("Pop"))
End Using

' create a new temp DT containing just the naughty column
' use the generic F/Field index in the SQL (we told Ole there was no header)
SQL = "SELECT F4 As RankText FROM [Capitals$]"
' create connection forcing the contents to text:
Using con As New OleDbConnection(TextConStr),
      da As New OleDbDataAdapter(SQL, con)
    dsPop.Tables.Add("RankText")
    da.Fill(dsPop.Tables("RankText"))
End Using

' remove the header row (with HDR=NO it comes back as data)
dsPop.Tables("RankText").Rows.RemoveAt(0)

' create a new INT col in the Pop table
dsPop.Tables("Pop").Columns.Add("Rank", GetType(Int32))

' convert the RankText text to Int and store it in Pop
For n As Integer = 0 To dsPop.Tables("RankText").Rows.Count - 1
    dsPop.Tables("Pop").Rows(n).Item("Rank") =
        Convert.ToInt32(dsPop.Tables("RankText").Rows(n).Item(0).ToString)
Next

' optional: the [RankText] table can be removed at this point, since we are done with it
dgv.DataSource = dsPop.Tables("Pop")

' report the datatype of the last row rank:
tbDataType.Text = dsPop.Tables("Pop").Rows(14).Item("Rank").GetType.ToString
In the immediate window, the types reported are as expected:
? dspop.Tables("Pop").Rows(0).Item(2) ' (population - paris)
2.25 {Double}
? dspop.Tables("RankText").Rows(0).Item(0) ' temp table text
"7" {String}
? dspop.Tables("Pop").Rows(0).Item(3) ' converted, merged value
7 {Integer}
For me, OleDB automatically converts '3 to "3". In other words, it omits the leading tick/apostrophe when it converts to text. Since the combinations of Excel versions with OleDB.ACE and OleDb.Jet can produce many different results, we might want a fallback converter (I wrote this after adding backticks to Excel by accident; maybe it will be of value to someone):
Private Function GetNumericValue(s As String) As Integer
    ' ToDo add exception handling
    If Char.IsDigit(s(0)) Then
        Return Convert.ToInt32(s)
    Else
        Return Convert.ToInt32(
            New String(s.ToCharArray(1, s.Length - 1))
        )
    End If
End Function
It only examines the first char for a non-numeral; otherwise it might convert "1234 Main Street Suite 56" to 123456, which is likely not desirable. The result:

Russia, Japan and Portugal were the rows which had the Rank data escaped as text.

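The GetNumericValue function above still carries a "ToDo add exception handling" note. One possible way to fill that in, sketched here under my own assumptions rather than taken from the answer, is a `TryParse`-based variant that returns Nothing instead of throwing. The name `TryGetNumericValue` is hypothetical.

```vbnet
Imports System

Module FallbackConverterSketch
    ' Hypothetical variant of GetNumericValue: returns Nothing when the
    ' text cannot be read as an integer, instead of throwing.
    Function TryGetNumericValue(s As String) As Integer?
        If String.IsNullOrWhiteSpace(s) Then Return Nothing
        Dim candidate As String = s.Trim()
        ' Skip exactly one leading non-digit (apostrophe, backtick, ...)
        If Not Char.IsDigit(candidate(0)) Then candidate = candidate.Substring(1)
        Dim value As Integer
        If Integer.TryParse(candidate, value) Then Return value
        Return Nothing
    End Function

    Sub Main()
        For Each sample As String In {"7", "'3", "`12", "1234 Main Street Suite 56", ""}
            Dim result As Integer? = TryGetNumericValue(sample)
            Console.WriteLine("[{0}] -> {1}", sample,
                              If(result.HasValue, result.Value.ToString(), "(no value)"))
        Next
    End Sub
End Module
```

Like the original, this treats any single leading non-digit as an escape character, so a leading minus sign would also be dropped; handle that case separately if negative values can occur.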
Answered by iliketocode
I am familiar with the warning you are talking about, "The number is formatted as text or preceded by apostrophe." An application I use at my job stores all its data as text values, and when I export to Excel this happens to every numeric field. It has to do with the underlying type of the data. It doesn't matter whether you type a number (e.g. 123) into a cell (or even into a database, for that matter); what matters is the type it is interpreted as. So 123 <> "123" (which is a string). The issue is likely that the target field has a numeric type (e.g. int), but the program importing the data does not recognize the value as numeric and treats it as text instead; thus it inserts NULLs to compensate.
But this is good in some cases. For example, do any of these numbers have leading zeros? If so, you would WANT to store them as text values to preserve the leading zeros. If not, a workaround would be to (in Excel 2010) go to the Data tab => Text to Columns, and then run through the wizard to get the correct data type. This essentially re-parses the values. It does not look like the problem is with your connection if everything else is loading correctly. This should be fixable directly in Excel.
Answered by Nima Derakhshanjan
Answered by Marcin - user2676388
I came across the same issue and almost gave up, but I tried this:
Extended Properties=\"Excel 12.0 Xml;HDR=Yes;IMEX=1;ImportMixedTypes=Text;TypeGuessRows=0\"";
and it worked. This setting comes from Jet.OLEDB, but it works with ACE.
"Check out the [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel] located registry REG_DWORD "TypeGuessRows". That's the key to not letting Excel use only the first 8 rows to guess the columns data type. Set this value to 0 to scan all rows. This might hurt performance. Please also note that adding the IMEX=1 option might cause the IMEX feature to set in after just 8 rows. Use IMEX=0 instead to be sure to force the registry TypeGuessRows=0 (scan all rows) to work."
Taken from https://www.connectionstrings.com/excel/
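For the VB.NET code in the question, the same Extended Properties could be assembled as in the sketch below. The helper name `BuildTextConnectionString` and the file path are hypothetical. Whether the provider honors TypeGuessRows from the connection string or only from the registry (as the quoted text suggests) may depend on the installed driver, so treat this as something to test rather than a guaranteed fix.

```vbnet
Imports System

Module ConnectionStringSketch
    ' Hypothetical helper: builds an ACE connection string that asks the
    ' provider to treat mixed-type columns as text.
    Function BuildTextConnectionString(excelPath As String) As String
        Return String.Format(
            "Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};" &
            "Extended Properties=""Excel 12.0 Xml;HDR=Yes;IMEX=1;ImportMixedTypes=Text;TypeGuessRows=0"";",
            excelPath)
    End Function

    Sub Main()
        ' The path is an assumption for illustration only
        Console.WriteLine(BuildTextConnectionString("C:\Temp\capitals.xlsx"))
    End Sub
End Module
```

The resulting string can be passed to `New OleDbConnection(...)` in place of the `conStr` used in the question.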
Answered by TGlatzer
Your problem is the data access to Excel. The Jet engine determines the column data type from the first 8 (or some similar number of) rows. So if the first 8 cells (excluding headers) are numbers, the Jet engine will assume a number type for that column. Every following cell that does not match that data type will be NULL.
You can find more information in the Stack Overflow question "I need a workaround for Excel Guessing Data Types problem".
Answered by A Beginner
The issue may be that you tried to convert a non-numeric value to an integer value. For example, if we use the expression =CInt("ABC"), we get the #Error value in our report.
Please refer to the custom code below to work around the issue:
Public Function Conv(ByVal A As String) As Object
    ' Return an Integer when A is numeric; otherwise return the text unchanged
    If IsNumeric(A) Then
        Return CInt(A)
    Else
        Return CStr(A)
    End If
End Function

