How to export a large SQL Server table into a CSV file using the FileHelpers library?

Note: This page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original URL: http://stackoverflow.com/questions/19455919/

Date: 2020-08-10 15:02:39  Source: igfitidea

Tags: c#, filehelpers

Asked by mircea

I'm looking to export a large SQL Server table into a CSV file using C# and the FileHelpers library. I could consider C# and bcp as well, but I thought FileHelpers would be more flexible than bcp. Speed is not a special requirement. An OutOfMemoryException is thrown on storage.ExtractRecords() when the code below is run (some less essential code has been omitted):

    SqlServerStorage storage = new SqlServerStorage(typeof(Order));
    storage.ServerName = "SqlServer";
    storage.DatabaseName = "SqlDataBase";
    storage.SelectSql = "select * from Orders";
    storage.FillRecordCallback = new FillRecordHandler(FillRecordOrder);
    Order[] output = null;
    output = storage.ExtractRecords() as Order[];

When the code below is run, a 'Timeout expired' exception is thrown on link.ExtractToFile():

    SqlServerStorage storage = new SqlServerStorage(typeof(Order));
    string sqlConnectionString = "Server=SqlServer;Database=SqlDataBase;Trusted_Connection=True";
    storage.ConnectionString = sqlConnectionString;
    storage.SelectSql = "select * from Orders";
    storage.FillRecordCallback = new FillRecordHandler(FillRecordOrder);
    FileDataLink link = new FileDataLink(storage);
    link.FileHelperEngine.HeaderText = headerLine;
    link.ExtractToFile("file.csv");

The SQL query takes longer than the default command timeout of 30 seconds, hence the timeout exception. Unfortunately, I can't find in the FileHelpers docs how to set the SQL command timeout to a higher value.

I could loop an SQL select over small data sets until the whole table has been exported, but that procedure would be too complicated. Is there a straightforward way to use FileHelpers to export large DB tables?

Accepted answer by shamp00

FileHelpers has an async engine, which is better suited for handling large files. Unfortunately, the FileDataLink class does not use it, so there's no easy way to use it with SqlServerStorage.

It's not very easy to modify the SQL timeout either. The easiest way would be to copy the code for SqlServerStorage to create your own alternative storage provider, and provide replacements for ExecuteAndClose() and ExecuteAndLeaveOpen() which set the timeout on the IDbCommand. (SqlServerStorage is a sealed class, so you cannot just subclass it.)
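As a minimal sketch of the timeout part only: the 30-second limit lives on the command object, and CommandTimeout is a standard IDbCommand property. The class and method names below are illustrative assumptions and are not part of FileHelpers; the sketch only shows where a replacement storage provider would set the value.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

public static class CommandTimeoutSketch
{
    // Hypothetical helper (not a FileHelpers API): creates a command on the
    // given connection and raises its timeout above the 30-second default.
    public static IDbCommand CreateCommand(IDbConnection connection, string sql, int timeoutSeconds)
    {
        IDbCommand command = connection.CreateCommand();
        command.CommandText = sql;
        // Seconds to wait for the command to execute; for SqlCommand, 0 means no limit.
        command.CommandTimeout = timeoutSeconds;
        return command;
    }

    public static void Main()
    {
        // The connection does not need to be open to configure a command.
        using (var connection = new SqlConnection())
        using (IDbCommand command = CreateCommand(connection, "select * from Orders", 300))
        {
            Console.WriteLine(command.CommandTimeout);
        }
    }
}
```

A copied storage provider would apply the same assignment inside its own versions of ExecuteAndClose() and ExecuteAndLeaveOpen().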

You might want to check out ReactiveETL, which uses the FileHelpers async engine for handling files, along with a rewrite of Ayende's RhinoETL using ReactiveExtensions to handle large datasets.

Answer by Rei Sivan

Try this one:

private void exportToCSV()
{
    // Asks for the file name with a SaveFileDialog control.

    SaveFileDialog saveFileDialogCSV = new SaveFileDialog();
    // StartupPath is the executable's directory; ExecutablePath would include the file name.
    saveFileDialogCSV.InitialDirectory = Application.StartupPath;

    saveFileDialogCSV.Filter = "CSV files (*.csv)|*.csv|All files (*.*)|*.*";
    saveFileDialogCSV.FilterIndex = 1;
    saveFileDialogCSV.RestoreDirectory = true;

    if (saveFileDialogCSV.ShowDialog() == DialogResult.OK)
    {
        // Runs the export operation if the user confirmed a file name.
        exportToCSVfile(saveFileDialogCSV.FileName);
    }
}

/*
 * Exports data to the CSV file.
 */
private void exportToCSVfile(string fileOut)
{
    // Connects to the database, and makes the select command.
    string sqlQuery = "select * from dbo." + this.lbxTables.SelectedItem.ToString();
    SqlCommand command = new SqlCommand(sqlQuery, objConnDB_Auto);

    // Creates a SqlDataReader instance to read data from the table.
    SqlDataReader dr = command.ExecuteReader();

    // Retrieves the schema of the table.
    DataTable dtSchema = dr.GetSchemaTable();

    // Creates the CSV file as a stream, using the given encoding.
    StreamWriter sw = new StreamWriter(fileOut, false, this.encodingCSV);

    string strRow; // represents a full row

    // Writes the column headers if the user previously asked that.
    if (this.chkFirstRowColumnNames.Checked)
    {
        sw.WriteLine(columnNames(dtSchema, this.separator));
    }

    // Reads the rows one by one from the SqlDataReader
    // transfers them to a string with the given separator character and
    // writes it to the file.
    while (dr.Read())
    {
        strRow = "";
        for (int i = 0; i < dr.FieldCount; i++)
        {
            switch (Convert.ToString(dr.GetFieldType(i)))
            {
                case "System.Int16":
                    strRow += Convert.ToString(dr.GetInt16(i));
                    break;

                case "System.Int32" :
                    strRow += Convert.ToString(dr.GetInt32(i));
                    break;

                case "System.Int64":
                    strRow += Convert.ToString(dr.GetInt64(i));
                    break;

                case "System.Decimal":
                    strRow += Convert.ToString(dr.GetDecimal(i));
                    break;

                case "System.Double":
                    strRow += Convert.ToString(dr.GetDouble(i));
                    break;

                case "System.Single": // .NET name of float; "System.Float" would never match
                    strRow += Convert.ToString(dr.GetFloat(i));
                    break;

                case "System.Guid":
                    strRow += Convert.ToString(dr.GetGuid(i));
                    break;

                case "System.String":
                    strRow += dr.GetString(i);
                    break;

                case "System.Boolean":
                    strRow += Convert.ToString(dr.GetBoolean(i));
                    break;

                case "System.DateTime":
                    strRow += Convert.ToString(dr.GetDateTime(i));
                    break;
            }

            if (i < dr.FieldCount - 1)
            {
                strRow += this.separator;
            }
        }
        sw.WriteLine(strRow);
    }


    // Closes the text stream and the data reader.
    sw.Close();
    dr.Close();

    // Notifies the user.
    MessageBox.Show("ready");
}

Answer by Jay Sullivan

Rei Sivan's answer is on the right track, as it will scale well with large files, because it avoids reading the entire table into memory. However, the code can be cleaned up.

shamp00's solution requires external libraries.

Here is a simpler table-to-CSV-file exporter that will scale well to large files, and does not require any external libraries:

using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;
using System.IO;
using System.Linq;

public class TableDumper
{
    public void DumpTableToFile(SqlConnection connection, string tableName, string destinationFile)
    {
        using (var command = new SqlCommand("select * from " + tableName, connection))
        using (var reader = command.ExecuteReader())
        using (var outFile = File.CreateText(destinationFile))
        {
            string[] columnNames = GetColumnNames(reader).ToArray();
            int numFields = columnNames.Length;
            outFile.WriteLine(string.Join(",", columnNames));
            if (reader.HasRows)
            {
                while (reader.Read())
                {
                    string[] columnValues = 
                        Enumerable.Range(0, numFields)
                                  .Select(i => reader.GetValue(i).ToString())
                                  .Select(field => string.Concat("\"", field.Replace("\"", "\"\""), "\""))
                                  .ToArray();
                    outFile.WriteLine(string.Join(",", columnValues));
                }
            }
        }
    }
    private IEnumerable<string> GetColumnNames(IDataReader reader)
    {
        foreach (DataRow row in reader.GetSchemaTable().Rows)
        {
            yield return (string)row["ColumnName"];
        }
    }
}

I wrote this code, and declare it CC0 (public domain).
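To see what the quoting rule in the exporter above produces, here is a small sketch that applies the same rule to an in-memory DataTable instead of a SQL Server connection, so it runs without a database. The CsvSketch class, its method names, and the sample rows are assumptions made for illustration only.

```csharp
using System;
using System.Data;
using System.IO;
using System.Linq;

public static class CsvSketch
{
    // Wraps a field in double quotes and doubles any embedded quotes.
    public static string Quote(string field)
    {
        return "\"" + field.Replace("\"", "\"\"") + "\"";
    }

    // Streams the reader to the writer one row at a time, so memory use
    // does not grow with the number of rows.
    public static void Write(IDataReader reader, TextWriter writer)
    {
        var names = Enumerable.Range(0, reader.FieldCount).Select(i => reader.GetName(i));
        writer.WriteLine(string.Join(",", names));
        while (reader.Read())
        {
            var values = Enumerable.Range(0, reader.FieldCount)
                                   .Select(i => Quote(reader.GetValue(i).ToString()));
            writer.WriteLine(string.Join(",", values));
        }
    }

    public static void Main()
    {
        // Illustrative sample data standing in for a real table.
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Note", typeof(string));
        table.Rows.Add(1, "say \"hi\"");
        table.Rows.Add(2, DBNull.Value);

        using (var reader = table.CreateDataReader())
        {
            Write(reader, Console.Out);
        }
        // Prints:
        // Id,Note
        // "1","say ""hi"""
        // "2",""
    }
}
```

A database NULL arrives as DBNull.Value, whose ToString() is an empty string, so it is written as an empty quoted field.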

Answer by user3563277

I combined the two code samples above into the code below, which is what I use. I use VS 2010.

// These are all the namespaces I used (UsbLibrary is from my own project and is not used by the code shown here).

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using UsbLibrary;
using System.Data.SqlClient;
using System.Configuration;
using System.Globalization;

    // Copy this into a button click handler. The handler name is a placeholder.
    private void buttonExport_Click(object sender, EventArgs e)
    {
        SaveFileDialog saveFileDialogCSV = new SaveFileDialog();
        saveFileDialogCSV.InitialDirectory = Application.StartupPath;

        saveFileDialogCSV.Filter = "CSV files (*.csv)|*.csv|All files (*.*)|*.*";
        saveFileDialogCSV.FilterIndex = 1;
        saveFileDialogCSV.RestoreDirectory = true;

        // Only run the export if the user confirmed a file name.
        if (saveFileDialogCSV.ShowDialog() != DialogResult.OK)
        {
            return;
        }

        // dbk is my database name; change it to your own database name.
        using (SqlConnection connection = new SqlConnection("Data Source=.;Initial Catalog=dbk;Integrated Security=True"))
        {
            connection.Open();
            DumpTableToFile(connection, "tbl_trmc", saveFileDialogCSV.FileName);
        }
    }
    // End of the code in the button handler.

    public void DumpTableToFile(SqlConnection connection, string tableName, string destinationFile)
    {
        using (var command = new SqlCommand("select * from " + tableName, connection))
        using (var reader = command.ExecuteReader())
        using (var outFile = System.IO.File.CreateText(destinationFile))
        {
            string[] columnNames = GetColumnNames(reader).ToArray();
            int numFields = columnNames.Length;
            outFile.WriteLine(string.Join(",", columnNames));
            if (reader.HasRows)
            {
                while (reader.Read())
                {
                    string[] columnValues =
                        Enumerable.Range(0, numFields)
                                  .Select(i => reader.GetValue(i).ToString())
                                  .Select(field => string.Concat("\"", field.Replace("\"", "\"\""), "\""))
                                  .ToArray();
                    outFile.WriteLine(string.Join(",", columnValues));
                }
            }
        }
    }
    private IEnumerable<string> GetColumnNames(IDataReader reader)
    {
        foreach (DataRow row in reader.GetSchemaTable().Rows)
        {
            yield return (string)row["ColumnName"];
        }
    }

Answer by chorbs

I really appreciate Jay Sullivan's answer; it was very helpful to me.

Building on that, I observed that in his solution the string formatting of varbinary and datetime data types was not good: varbinary fields would come out as literally "System.Byte[]" or something like that, while datetime fields would be formatted as MM/dd/yyyy hh:mm:ss tt, which is not what I wanted.

Below is my hacked-together solution, which converts each value to a string differently depending on its data type. It uses nested ternary operators, but it works! Note that it writes tab-separated rather than comma-separated output.

I hope it helps someone.

public static void DumpTableToFile(SqlConnection connection, Dictionary<string, string> cArgs)
{
    string query = "SELECT ";
    string z = "";
    if (cArgs.TryGetValue("top_count", out z))
    {
        query += string.Format("TOP {0} ", z);
    }
    query += string.Format("* FROM {0} (NOLOCK) ", cArgs["table"]);
    string lower_bound = "", upper_bound = "", column_name = "";
    if (cArgs.TryGetValue("lower_bound", out lower_bound) && cArgs.TryGetValue("column_name", out column_name))
    {
        query += string.Format("WHERE {0} >= {1} ", column_name, lower_bound);
        if (cArgs.TryGetValue("upper_bound", out upper_bound))
        {
            query += string.Format("AND {0} < {1} ", column_name, upper_bound);
        }
    }
    Console.WriteLine(query);
    Console.WriteLine("");
    using (var command = new SqlCommand(query, connection))
    using (var reader = command.ExecuteReader())
    using (var outFile = File.CreateText(cArgs["out_file"]))
    {
        string[] columnNames = GetColumnNames(reader).ToArray();
        int numFields = columnNames.Length;
        Console.WriteLine(string.Join(",", columnNames));
        Console.WriteLine("");
        if (reader.HasRows)
        {
            string format = "yyyy-MM-dd HH:mm:ss.fff";
            int ii = 0;
            while (reader.Read())
            {
                ii += 1;
                string[] columnValues =
                    Enumerable.Range(0, numFields)
                        .Select(i => reader.GetValue(i))
                        // DateTime -> fixed format, byte[] -> hex string, anything else -> ToString().
                        .Select(value =>
                            (value is DateTime) ? ((DateTime)value).ToString(format)
                            : (value is byte[]) ? string.Concat(Array.ConvertAll((byte[])value, x => x.ToString("X2")))
                            : value.ToString())
                        // .Select(field => string.Concat("\"", field.Replace("\"", "\"\""), "\""))
                        .Select(field => field.Replace("\t", " "))
                        .ToArray();
                outFile.WriteLine(string.Join("\t", columnValues));
                if (ii % 100000 == 0)
                {
                    Console.WriteLine("row {0}", ii);
                }
            }
        }
    }
}
public static IEnumerable<string> GetColumnNames(IDataReader reader)
{
    foreach (DataRow row in reader.GetSchemaTable().Rows)
    {
        yield return (string)row["ColumnName"];
    }
}
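The per-type conversion above can also be pulled out into a small standalone function, which is easier to read and to check in isolation than a nested ternary. This is only a sketch: the FieldFormatSketch class and its Format method are illustrative names, and the use of the invariant culture is an assumption added here so that output does not depend on the machine's regional settings.

```csharp
using System;
using System.Globalization;
using System.Linq;

public static class FieldFormatSketch
{
    // Converts one column value to text: DateTime in a fixed sortable format,
    // byte[] as an uppercase hex string, NULL as empty, everything else via Convert.ToString.
    public static string Format(object value)
    {
        if (value == null || value is DBNull)
        {
            return "";
        }
        if (value is DateTime)
        {
            return ((DateTime)value).ToString("yyyy-MM-dd HH:mm:ss.fff", CultureInfo.InvariantCulture);
        }
        if (value is byte[])
        {
            return string.Concat(((byte[])value).Select(b => b.ToString("X2")));
        }
        return Convert.ToString(value, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        Console.WriteLine(Format(new DateTime(2020, 8, 10, 15, 2, 39))); // 2020-08-10 15:02:39.000
        Console.WriteLine(Format(new byte[] { 0x0A, 0xFF }));            // 0AFF
        Console.WriteLine(Format(1.5));                                  // 1.5
    }
}
```

Inside the export loop, the lambda would then reduce to a call such as Format(reader.GetValue(i)).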