SQL Server BULK INSERT - Inserting DateTime values
Original question: http://stackoverflow.com/questions/7135851/
This content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Asked by Dai
I have 6 million rows worth of data I want to insert into my SQL Server database. I can do it the slow way with 6 million INSERT statements (by my calculation it would take 18 hours to run) or I can try BULK INSERT.
BULK INSERT has issues with not being able to escape characters, but the data in this case is very simple and so shouldn't run into this problem.
However, SQL Server doesn't seem to accept any form of date/time data inserted into a field.
Here is the table (pseudo-SQL):
CREATE TABLE Tasks (
    TaskId       bigint   NOT NULL IDENTITY(1,1) PRIMARY KEY,
    TriggerId    bigint   NOT NULL FOREIGN KEY,
    Created      datetime NOT NULL,
    Modified     datetime NOT NULL,
    ScheduledFor datetime NULL,
    LastRan      datetime NULL,
    -- and about 10 more fields after this
)
Here is my BULK INSERT statement:
SET DATEFORMAT dmy
BULK INSERT Tasks
FROM 'C:\TasksBulk.dat'
WITH (
    -- CHECK_CONSTRAINTS is not necessary as the only constraints are always enforced regardless of this option (UNIQUE, PRIMARY KEY, and NOT NULL)
    CODEPAGE = 'RAW',
    DATAFILETYPE = 'native',
    KEEPIDENTITY,
    MAXERRORS = 1,
    ORDER ( CallId ASC ),
    FIELDTERMINATOR = '\t',
    ROWTERMINATOR = '\0'
)
And here is the first row of data in TasksBulk.dat:
1000\t1092\t01/01/2010 04:00:17\t01/01/2010 04:00:17\t\t01/01/2010 04:00:14
(For readability, reformatted with tab characters replaced by 4 spaces:)
1000    1092    01/01/2010 04:00:17    01/01/2010 04:00:17        01/01/2010 04:00:14
However, when I run the BULK INSERT statement, I get this error:
Msg 4864, Level 16, State 1, Line 2
Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 3 (Created).
I have tried using different row and field terminators and every different date/time format (including "01/01/2010", "2010-01-01", both with and without the "04:00:17" time component). I don't know what I'm doing wrong here.
Accepted answer by Dai
It turns out that changing the DATAFILETYPE from 'native' to 'char' solved the problem. The 'native' type implies a strict binary data format for everything, whereas 'char' is meant for plain-text files.
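As a minimal sketch of that fix, the statement from the question could be rewritten as follows. The only change the answer describes is the DATAFILETYPE value. The ROWTERMINATOR shown here is an assumption (newline-terminated rows) and should match whatever actually ends each row in the file. The ORDER hint is left out of this sketch because CallId does not appear in the table definition shown.

```sql
-- Sketch only: same file and table as in the question.
SET DATEFORMAT dmy;  -- kept from the question

BULK INSERT Tasks
FROM 'C:\TasksBulk.dat'
WITH (
    CODEPAGE = 'RAW',
    DATAFILETYPE = 'char',   -- was 'native'; 'char' reads the file as plain text
    KEEPIDENTITY,            -- keep the TaskId values stored in the file
    MAXERRORS = 1,
    FIELDTERMINATOR = '\t',
    ROWTERMINATOR = '\n'     -- assumption: one row per line; adjust to your file
);
```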
Answer by Mitch Wheat
You have your CODEPAGE set to RAW (presumably for speed).
The error message implies your data contains characters outside the codepage.
CODEPAGE [ = 'ACP' | 'OEM' | 'RAW' | 'code_page' ]
Specifies the code page of the data in the data file. CODEPAGE is relevant only if the data contains char, varchar, or text columns with character values greater than 127 or less than 32.
But that could be misleading. Your example data line contains a missing column. If you don't use a format file, every field in the table must be used.
So you could either create a format file, or create a staging table that uses varchar(25) for the datetime columns, import into it, and then perform an update from the staging table into the destination table. That way you have more control over conversions and missing data.
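The staging-table route might look roughly like the sketch below. The table name TasksStaging, the trimmed column list, the terminators, and the CONVERT style 103 (dd/mm/yyyy, matching the sample row) are all assumptions for illustration. It also uses INSERT ... SELECT rather than a literal UPDATE to move rows into the destination table.

```sql
-- Hypothetical staging table: date/time columns are loaded as plain text.
-- Without a format file it must have one column per field in the data file.
CREATE TABLE TasksStaging (
    TaskId       bigint      NOT NULL,
    TriggerId    bigint      NOT NULL,
    Created      varchar(25) NOT NULL,
    Modified     varchar(25) NOT NULL,
    ScheduledFor varchar(25) NULL,
    LastRan      varchar(25) NULL
    -- remaining columns omitted in this sketch
);

BULK INSERT TasksStaging
FROM 'C:\TasksBulk.dat'
WITH (
    DATAFILETYPE = 'char',
    FIELDTERMINATOR = '\t',
    ROWTERMINATOR = '\n'     -- assumption: one row per line
);

-- Convert the text explicitly while copying into the real table.
SET IDENTITY_INSERT Tasks ON;

INSERT INTO Tasks (TaskId, TriggerId, Created, Modified, ScheduledFor, LastRan)
SELECT TaskId,
       TriggerId,
       CONVERT(datetime, Created, 103),
       CONVERT(datetime, Modified, 103),
       CONVERT(datetime, NULLIF(ScheduledFor, ''), 103),  -- empty text becomes NULL
       CONVERT(datetime, NULLIF(LastRan, ''), 103)
FROM TasksStaging;

SET IDENTITY_INSERT Tasks OFF;
```

Doing the conversion in the SELECT makes it possible to inspect or filter rows that fail to convert before they reach the destination table.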
Answer by meltdownmonk
A method I'm familiar with is to insert your dates in the form of an integer.
I use the number of seconds elapsed since a fixed reference date. (I use one more than 10 years in the past, since none of the data I access or generate is older than that.)
The date 2012-01-02 12:15:10.000 would be stored as 378821710, using January 1st, 2000 as the reference point.
When querying the database, you could have the column returned using DateAdd(SS, column_name, '2000-01-01').
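A short sketch of the round trip, assuming the January 1st, 2000 reference point mentioned above. The variable names are made up for illustration, and the literals use formats that do not depend on the session's DATEFORMAT setting.

```sql
-- Reference point and sample value from the answer.
DECLARE @epoch datetime = '20000101';
DECLARE @value datetime = '2012-01-02T12:15:10';

-- Date -> integer: whole seconds since the reference point.
DECLARE @seconds int = DATEDIFF(SECOND, @epoch, @value);

-- Integer -> date: add the seconds back onto the reference point.
SELECT @seconds                           AS SecondsSinceEpoch,
       DATEADD(SECOND, @seconds, @epoch)  AS RoundTripped;
```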
You could do this in milliseconds too if that kind of precision were needed.
I use my own custom function to convert my time in seconds into whatever format I would like, and I use another custom function to turn dates back into seconds.
I realize this may not be a good method because it could require database changes and code changes for you, but perhaps it could be a solution concept that others would find useful.