Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/96448/
SQL Bulk import from CSV
Question by Radu094
I need to import a large CSV file into SQL Server. I'm using this:
BULK INSERT CSVTest
FROM 'c:\csvfile.txt'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
)
GO
The problem is that all my fields are surrounded by quotes (" "), so a row actually looks like:
"1","","2","","sometimes with comma , inside", ""
Can I somehow bulk import them and tell SQL to use the quotes as field delimiters?
Edit: The problem with using '","' as the delimiter, as the suggested examples do, is this: most examples import the data including the first " in the first column and the last " in the last column, then go ahead and strip them out. Alas, my first (and last) columns are datetime, and a value like "20080902 cannot be imported as a datetime.
From what I've been reading around, I think FORMATFILE is the way to go, but the documentation (including MSDN) is terribly unhelpful.
Accepted answer by Alex Andronov
I know this isn't a real solution, but I use a dummy table for the import with nvarchar set for everything. Then I do an insert which strips out the " characters and does the conversions. It isn't pretty but it does the job.
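The staging-table approach described in this answer might look like the sketch below. The table name CSVTest_Staging, the column names, and the column types are assumptions made up for illustration; only the file path comes from the question. The '","' terminator is used here so that commas inside a field do not split it, which leaves a stray quote on the first and last columns for the second step to remove.

```sql
-- Hypothetical staging table: every column is nvarchar, so quoted
-- values such as "20080902 load without conversion errors.
CREATE TABLE CSVTest_Staging (
    SaleDate nvarchar(50),
    Notes    nvarchar(200)
);

BULK INSERT CSVTest_Staging
FROM 'c:\csvfile.txt'
WITH (
    FIELDTERMINATOR = '","',
    ROWTERMINATOR = '\n'
);

-- Strip the leftover quote characters and convert while copying
-- into the real table (assumed here to have matching columns).
INSERT INTO CSVTest (SaleDate, Notes)
SELECT
    CONVERT(datetime, REPLACE(SaleDate, '"', ''), 112),  -- 112 = yyyymmdd
    REPLACE(Notes, '"', '')
FROM CSVTest_Staging;
```

Note that REPLACE removes every " character, including any that legitimately appear inside the data, so this sketch only suits files where quotes are used purely as wrappers.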
Answer by K Richard
Try FIELDTERMINATOR='","'
Here is a great link to help with the first and last quote... look at how he used SUBSTRING in the SP:
http://www.sqlteam.com/article/using-bulk-insert-to-load-a-text-file
Answer by cbp
Another hack which I sometimes use is to open the CSV in Excel, then write your SQL statement into a cell at the end of each row. For example:
=CONCATENATE("insert into myTable (columnA,columnB) values ('",A1,"','",B1,"')")
A fill-down can populate this into every row for you. Then just copy and paste the output into a new query window.
It's old-school, but if you only need to do imports once in a while, it saves you messing around with reading all the obscure documentation on the 'proper' way to do it.
Answer by Daren Thomas
Try OpenRowSet. This can be used to import Excel stuff. Excel can open CSV files, so you only need to figure out the correct connection string:
Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=c:\txtFilesFolder\;Extensions=asc,csv,tab,txt;
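As a rough sketch of how that connection string could be used, the query below passes it to OPENROWSET through the MSDASQL provider. The file name csvfile.txt is an assumption, and whether the text driver is installed (and matches the server's bitness) depends on the machine, so treat this as a starting point rather than a tested recipe.

```sql
-- Ad hoc OPENROWSET queries are disabled by default; enabling them
-- needs sufficient server permissions.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'Ad Hoc Distributed Queries', 1;
RECONFIGURE;

-- Read the file through the ODBC text driver. The folder comes from
-- the connection string; the file name is a hypothetical example.
SELECT *
FROM OPENROWSET(
    'MSDASQL',
    'Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=c:\txtFilesFolder\;Extensions=asc,csv,tab,txt;',
    'SELECT * FROM csvfile.txt'
);
```

The result can be fed into an INSERT ... SELECT to load a table.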
Answer by roundcrisis
I'd say use FileHelpers; it's an open source library.
Answer by kombsh
First, you need to import the CSV file into a DataTable.
Then you can insert the rows in bulk using SqlBulkCopy.
using System;
using System.Data;
using System.Data.SqlClient;

namespace SqlBulkInsertExample
{
    class Program
    {
        static void Main(string[] args)
        {
            DataTable prodSalesData = new DataTable("ProductSalesData");

            // Create Column 1: SaleDate
            DataColumn dateColumn = new DataColumn();
            dateColumn.DataType = typeof(DateTime);
            dateColumn.ColumnName = "SaleDate";

            // Create Column 2: ProductName (DataType defaults to string)
            DataColumn productNameColumn = new DataColumn();
            productNameColumn.ColumnName = "ProductName";

            // Create Column 3: TotalSales
            DataColumn totalSalesColumn = new DataColumn();
            totalSalesColumn.DataType = typeof(int);
            totalSalesColumn.ColumnName = "TotalSales";

            // Add the columns to the ProductSalesData DataTable
            prodSalesData.Columns.Add(dateColumn);
            prodSalesData.Columns.Add(productNameColumn);
            prodSalesData.Columns.Add(totalSalesColumn);

            // Let's populate the DataTable with our stats.
            // You can add as many rows as you want here!

            // Create a new row
            DataRow dailyProductSalesRow = prodSalesData.NewRow();
            dailyProductSalesRow["SaleDate"] = DateTime.Now.Date;
            dailyProductSalesRow["ProductName"] = "Nike";
            dailyProductSalesRow["TotalSales"] = 10;

            // Add the row to the ProductSalesData DataTable
            prodSalesData.Rows.Add(dailyProductSalesRow);

            // Copy the DataTable to SQL Server using SqlBulkCopy
            using (SqlConnection dbConnection = new SqlConnection("Data Source=ProductHost;Initial Catalog=dbProduct;Integrated Security=SSPI;Connection Timeout=60;Min Pool Size=2;Max Pool Size=20;"))
            {
                dbConnection.Open();
                using (SqlBulkCopy s = new SqlBulkCopy(dbConnection))
                {
                    // The destination table shares the DataTable's name
                    s.DestinationTableName = prodSalesData.TableName;

                    // Map each source column to the same-named destination column
                    foreach (DataColumn column in prodSalesData.Columns)
                        s.ColumnMappings.Add(column.ColumnName, column.ColumnName);

                    s.WriteToServer(prodSalesData);
                }
            }
        }
    }
}
Answer by kombsh
You can try this code, which is very neat. It will remove the unwanted quotation marks from your data.
If, for example, your data is like this:
"Kelly","Reynold","[email protected]"
BULK INSERT test1
FROM 'c:.txt'
WITH (
    FIELDTERMINATOR = '","',
    ROWTERMINATOR = '\n'
)

-- Strip the leading quote left on the first column
UPDATE test1
SET name = SUBSTRING(name, 2, LEN(name))
WHERE name LIKE '"%'

-- Strip the trailing quote left on the last column
UPDATE test1
SET email = SUBSTRING(email, 1, LEN(email) - 1)
WHERE email LIKE '%"'
Answer by Cade Roux
You have to watch out with BCP/BULK INSERT, because neither BCP nor BULK INSERT handles this well if the quoting is not consistent, even with format files (even XML format files don't offer the option), dummy ["] characters at the beginning and end, and [","] as the separator. Technically, CSV files do not need to have ["] characters if there are no embedded [,] characters.
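For readers on newer servers: SQL Server 2017 and later add a FORMAT = 'CSV' option to BULK INSERT, together with FIELDQUOTE, which post-dates this answer. A minimal sketch using the table and path from the question is shown below; whether it copes with every irregularity in a given file (for example, stray spaces outside the quotes) is not something this sketch verifies.

```sql
-- Requires SQL Server 2017 or later.
BULK INSERT CSVTest
FROM 'c:\csvfile.txt'
WITH (
    FORMAT = 'CSV',         -- parse the file as quoted CSV
    FIELDQUOTE = '"',       -- character that wraps each field
    FIELDTERMINATOR = ',',
    ROWTERMINATOR = '\n'
);
```

With this option the quotes are consumed by the parser, so commas inside quoted fields do not split the field.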
It is for this reason that comma-delimited files are sometimes referred to as comedy-limited files.
OpenRowSet will require Excel on the server and could be problematic in 64-bit environments. I know that using Excel through Jet is problematic in 64-bit.
SSIS is really your best bet if the file is likely to vary from your expectations in the future.
Answer by Dana
Do you need to do this programmatically, or is it a one-time shot?
In Enterprise Manager, the right-click Import Data option lets you select your delimiter.
Answer by Epaga
Yup, K Richard is right: FIELDTERMINATOR = '","'
See http://www.sqlteam.com/article/using-bulk-insert-to-load-a-text-file for more info.