Original question: http://stackoverflow.com/questions/479819/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Bulk insert, SQL Server 2000, unix linebreaks
Asked by John Oxley
I am trying to bulk insert a .csv file that has Unix line breaks into a database. The command I am running is:
BULK INSERT table_name
FROM 'C:\file.csv'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
If I convert the file into Windows format the load works, but I don't want to do this extra step if it can be avoided. Any ideas?
Answered by Randy J
I felt compelled to contribute because I was having the same issue, and I need to read 2 UNIX files from SAP at least a couple of times a day. So instead of using unix2dos, I needed something with less manual intervention that could be automated in code.
As noted, Char(10) works inside a dynamic SQL string. I didn't want to use a dynamic SQL string, so I tried ''''+Char(10)+'''' directly, but for some reason this didn't compile.
What did work, and very slickly, was: WITH (ROWTERMINATOR = '0x0a')
Problem solved with Hex!
Hope this helps someone.
Answered by John Oxley
Thanks to all who have answered, but I found my preferred solution.
When you tell SQL Server ROWTERMINATOR='\n' it interprets this as meaning the default row terminator under Windows which is actually "\r\n" (using C/C++ notation). If your row terminator is really just "\n" you will have to use the dynamic SQL shown below.
DECLARE @bulk_cmd varchar(1000)
SET @bulk_cmd = 'BULK INSERT table_name
FROM ''C:\file.csv''
WITH (FIELDTERMINATOR = '','', ROWTERMINATOR = '''+CHAR(10)+''')'
EXEC (@bulk_cmd)
Why you can't say BULK INSERT ...(ROWTERMINATOR = CHAR(10)) is beyond me. It doesn't look like you can evaluate any expressions in the WITH section of the command.
The above builds the command as a string and executes it, neatly sidestepping the need to create an additional file or go through extra steps.
Answered by kr1t1kz
I confirm that the syntax
ROWTERMINATOR = '''+CHAR(10)+'''
works when used with an EXEC command.
If you have multiple ROWTERMINATOR characters (e.g. a pipe and a unix linefeed) then the syntax for this is:
ROWTERMINATOR = '''+CHAR(124)+''+CHAR(10)+'''
Answered by kr1t1kz
It's a bit more complicated than that! When you tell SQL Server ROWTERMINATOR='\n' it interprets this as meaning the default row terminator under Windows which is actually "\r\n" (using C/C++ notation). If your row terminator is really just "\n" you will have to use the dynamic SQL shown above. I have just spent the best part of an hour figuring out why \n doesn't really mean \n when used with BULK INSERT!
Answered by ConcernedOfTunbridgeWells
One option would be to use bcp and set up a control file with '\n' as the line break character.
Although you've indicated that you would prefer not to, another option would be to use unix2dos to pre-process the file into one with '\r\n' line breaks.
Finally, you can use the FORMATFILE option on BULK INSERT. This will use a bcp control file to specify the import format.
Answered by Philippe Payant
It looks to me like there are two general avenues: find some alternate way to read the CSV in the SQL script, or convert the CSV beforehand using any of the numerous ways available (bcp, unix2dos; if it is a one-time kind of thing, you can probably even use your code editor to fix the file for you).
But you will have to have an extra step!
If this SQL is launched from a program, you might want to convert the line endings in that program. If you decide to code the conversion yourself, here is what you need to watch out for:
1. The line ending might be \n
2. or \r\n
3. or even \r (Mac!)
4. good grief, it could be that some lines have \r\n and others \n; any combination is possible unless you control where the CSV came from
OK, OK. Possibility 4 is far-fetched. It happens in email, but that is another story.
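The cases listed above can all be handled in one pass. Here is a minimal sketch in Python, assuming the whole file fits in memory; the names to_crlf and convert_file are hypothetical and not from the original answer. It rewrites every \r\n, lone \r, or lone \n as \r\n, so even a file with mixed endings comes out consistent.

```python
import re

# Match CRLF first so a CR+LF pair is treated as one terminator, not two.
_ANY_EOL = re.compile(rb"\r\n|\r|\n")


def to_crlf(data: bytes) -> bytes:
    """Rewrite every line ending (CRLF, lone CR, lone LF) as CRLF."""
    return _ANY_EOL.sub(b"\r\n", data)


def convert_file(src_path: str, dst_path: str) -> None:
    """Read src_path as raw bytes and write a CRLF-normalized copy to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(to_crlf(data))
```

Working on raw bytes rather than decoded text avoids any guess about the file's character encoding. The conversion is also safe to run twice, since a file that is already CRLF passes through unchanged.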
Answered by BankZ
I would think "ROWTERMINATOR = '\n'" would work. I would suggest opening the file in a tool that shows "hidden characters" to make sure the lines are terminated the way you think they are. I use Notepad++ for things like this.
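If no editor that shows hidden characters is at hand, the same check can be scripted. The following is a minimal Python sketch, not part of the original answer; the function name count_line_endings and the sample bytes are made up for illustration. It counts each kind of terminator in the raw bytes of a file.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF, lone LF, and lone CR terminators in raw bytes."""
    crlf = data.count(b"\r\n")
    lone_lf = data.count(b"\n") - crlf  # LFs that are not part of a CRLF pair
    lone_cr = data.count(b"\r") - crlf  # CRs that are not part of a CRLF pair
    return {"CRLF": crlf, "LF": lone_lf, "CR": lone_cr}


if __name__ == "__main__":
    # Hypothetical sample; for a real file, read it in binary mode instead:
    #     with open(path, "rb") as f: data = f.read()
    sample = b"a,b\r\nc,d\ne,f\n"
    print(count_line_endings(sample))
```

A file that reports only LF terminators is Unix-formatted, which is the case the other answers describe as failing with ROWTERMINATOR = '\n'.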
Answered by BIBD
It comes down to this: Unix uses LF (Ctrl-J), while MS-DOS/Windows uses CR/LF (Ctrl-M/Ctrl-J).
When you use '\n' on Unix, it gets translated to an LF character. On MS-DOS/Windows it gets translated to CR/LF. When your import runs on the Unix-formatted file, it sees only an LF. Hence, it's often easier to run the file through unix2dos first. But as you said in your original question, you don't want to do this (I'll assume there is a good reason why you can't).
Why can't you do:
(ROWTERMINATOR = CHAR(10))
Probably because when the SQL code is being parsed, it is not replacing the char(10) with the LF character (because it's already encased in single quotes). Or perhaps it is being interpreted as:
(ROWTERMINATOR =
)
What happens when you echo out the contents of @bulk_cmd?