SQL Server: bulk insert a CSV whose data contains commas

Notice: this page is taken from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/21226107/

Date: 2020-09-01 00:48:21  Source: igfitidea

sql server Bulk insert csv with data having comma

Tags: sql, sql-server, csv, bulkinsert

Asked by sumit

Below is a sample line from the CSV:

012,12/11/2013,"<555523051548>KRISHNA  KUMAR  ASHOKU,AR",<10-12-2013>,555523051548,12/11/2013,"13,012.55",

You can see that KRISHNA KUMAR ASHOKU,AR is meant to be a single field, but because of the comma it is treated as two different fields, KRISHNA KUMAR ASHOKU and AR. The value is enclosed in double quotes ("), but still no luck.

I tried

BULK INSERT tbl
FROM 'd:.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n',
    FIRSTROW = 2
)
GO

Is there any solution for this?

Answered by david.pfx

The answer is: you can't do that. See http://technet.microsoft.com/en-us/library/ms188365.aspx.

"Importing Data from a CSV file

Comma-separated value (CSV) files are not supported by SQL Server bulk-import operations. However, in some cases, a CSV file can be used as the data file for a bulk import of data into SQL Server. For information about the requirements for importing data from a CSV data file, see Prepare Data for Bulk Export or Import (SQL Server)."

The general solution is to convert your CSV file into one that can be imported successfully. You can do that in many ways: create the file with a different delimiter (such as TAB) in the first place, or read it with a tool that understands CSV files (such as Excel or many scripting languages) and export it with a delimiter that never occurs in the data (such as TAB). You can then BULK INSERT from the converted file.
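As a rough illustration of the "scripting language" route, here is a minimal Python sketch that reads quoted CSV and writes tab-delimited text. The function name `csv_to_tsv` is made up for this example, and it assumes that replacing any tabs or line breaks inside a field with a space is acceptable for your data.

```python
import csv
import io

# Characters that would break a tab-delimited, one-row-per-line file.
# Assumption: it is acceptable to turn them into spaces.
_CLEAN = str.maketrans({"\t": " ", "\r": " ", "\n": " "})

def csv_to_tsv(src, dst):
    """Read quoted CSV from src and write unquoted tab-delimited rows to dst."""
    for row in csv.reader(src):
        cleaned = [field.translate(_CLEAN) for field in row]
        dst.write("\t".join(cleaned) + "\n")

# The sample line from the question, used as in-memory input.
sample = ('012,12/11/2013,"<555523051548>KRISHNA  KUMAR  ASHOKU,AR",'
          '<10-12-2013>,555523051548,12/11/2013,"13,012.55",\n')
out = io.StringIO()
csv_to_tsv(io.StringIO(sample), out)
print(out.getvalue())
```

With real files you would pass file objects opened with `newline=""` instead of `io.StringIO`, and the converted file could then be loaded with `FIELDTERMINATOR = '\t'`.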

Answered by zee

Unfortunately, the SQL Server import methods (BCP and BULK INSERT) do not understand quoting with " ".

Source: http://msdn.microsoft.com/en-us/library/ms191485%28v=sql.100%29.aspx
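To see what "not understanding quoting" means in practice, the following Python sketch compares a plain split on commas (roughly what a quote-unaware importer does) with a quote-aware CSV parser, using the sample line from the question. It is only an illustration of the two behaviours, not of how SQL Server is implemented.

```python
import csv

line = ('012,12/11/2013,"<555523051548>KRISHNA  KUMAR  ASHOKU,AR",'
        '<10-12-2013>,555523051548,12/11/2013,"13,012.55",')

# Quote-unaware: every comma is a field boundary, quotes are just characters.
naive = line.split(",")

# Quote-aware: commas inside double quotes stay within the field.
parsed = next(csv.reader([line]))

print(len(naive), naive[2:4])   # the name is broken into two pieces
print(len(parsed), parsed[2])   # the name stays in one field
```

The naive split yields two extra fields (one for the name, one for the amount), which is why the columns no longer line up with the target table.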

Answered by gregarius1111

I encountered this problem recently and had to switch to a tab-delimited format. If you do that and use SQL Server Management Studio to do the import (right-click on the database, then select Tasks, then Import), tab-delimited works just fine. The bulk insert option with a tab-delimited file should also work.

I must admit to being very surprised when finding out that Microsoft SQL Server had this comma-delimited issue. The CSV file format is a very old one, so finding out that this was an issue with a modern database was very disappointing.

Answered by Geoff Griswald

Microsoft has now addressed this issue, and you can use FIELDQUOTE in your WITH clause to add quoted-string support:

FIELDQUOTE = '"',

Placing this anywhere in your WITH clause should do the trick, if you have SQL Server 2017 or above.

Answered by HelpfulH4cker

They added support for this in SQL Server 2017 (14.x) CTP 1.1. You need to use the FORMAT = 'CSV' input file option of the BULK INSERT command.

To be clear, here is what the CSV that was giving me problems looks like. The first line is easy to parse; the second line is the curveball, since there is a comma inside the quoted field:

jenkins-2019-09-25_cve-2019-10401,CVE-2019-10401,4,Jenkins Advisory 2019-09-25: CVE-2019-10401: 
jenkins-2019-09-25_cve-2019-10403_cve-2019-10404,"CVE-2019-10404,CVE-2019-10403",4,Jenkins Advisory 2019-09-25: CVE-2019-10403: CVE-2019-10404: 

Broken Code

BULK INSERT temp
    FROM 'c:\test.csv'
    WITH
    (
        FIELDTERMINATOR = ',',
        ROWTERMINATOR = '0x0a',
        FIRSTROW= 2
    );

Working Code

BULK INSERT temp
    FROM 'c:\test.csv'
    WITH
    (
        FIELDTERMINATOR = ',',
        ROWTERMINATOR = '0x0a',
        FORMAT = 'CSV',
        FIRSTROW= 2
    );

Answered by ASH

Well, BULK INSERT is very fast but not very flexible. Can you load the data into a staging table and then push everything into a production table? Once the data is in SQL Server, you have a lot more control over how you move it from one table to another. So, basically:

1) Load data into staging.
2) Clean/convert by copying to a second staging table defined using the desired datatypes. Good data is copied over, bad data is left behind.
3) Copy data from the "clean" table to the "live" table.
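The three steps above can be sketched as follows. This is only an illustration of the pattern: it uses Python's built-in `sqlite3` module as a stand-in for SQL Server, and the table names (`staging_raw`, `staging_clean`, `live`), the two columns, and the sample rows are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in database, not SQL Server
conn.executescript("""
    CREATE TABLE staging_raw   (id TEXT, amount TEXT);     -- step 1: everything as text
    CREATE TABLE staging_clean (id INTEGER, amount REAL);  -- step 2: desired datatypes
    CREATE TABLE live          (id INTEGER, amount REAL);  -- step 3: production table
""")

# Step 1: load the raw rows into staging without any conversion.
rows = [("12", "13,012.55"), ("13", "not a number"), ("14", "7.5")]
conn.executemany("INSERT INTO staging_raw VALUES (?, ?)", rows)

def to_amount(text):
    """Convert text such as '13,012.55' to a float, or return None if it is bad."""
    try:
        return float(text.replace(",", ""))
    except ValueError:
        return None

# Step 2: copy only the rows that convert cleanly; bad rows stay behind.
for raw_id, raw_amount in conn.execute("SELECT id, amount FROM staging_raw").fetchall():
    amount = to_amount(raw_amount)
    if raw_id.isdigit() and amount is not None:
        conn.execute("INSERT INTO staging_clean VALUES (?, ?)", (int(raw_id), amount))

# Step 3: move the clean rows into the live table.
conn.execute("INSERT INTO live SELECT id, amount FROM staging_clean")
live_rows = conn.execute("SELECT id, amount FROM live ORDER BY id").fetchall()
print(live_rows)
```

In SQL Server itself, step 2 would more likely be a single INSERT ... SELECT that filters on something like TRY_CONVERT returning a non-NULL value, rather than a row-by-row loop.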