C# 将 n 个数据表合并为一个数据表

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/12278978/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-09 22:44:03  来源:igfitidea点击:

Combining n DataTables into a Single DataTable

c#datatable

提问by MoonKnight

All, there are some question on this, but I can't seem to extract enough information to solve the problem for my case. I extract an unknown number of tables into SQL Server 'Tab1', 'Tab2', 'Tab3', ... , 'TabN'. The columns in these tables are different, but the row definitions are the same. I need to pull all the data in from the Server into N DataTables and then combine these to form a single DataTable. What I do currently is

所有,对此都有一些问题,但我似乎无法提取足够的信息来解决我的案例问题。我将未知数量的表提取到 SQL Server 'Tab1'、'Tab2'、'Tab3'、...、'TabN' 中。这些表中的列不同,但行定义相同。我需要将所有数据从 Server 中提取到 N 中DataTable,然后将它们组合成一个DataTable. 我目前做的是

int nTmpVolTabIdx = 1;
strSqlTmp = String.Empty;
using (DataTable dataTableALL = new DataTable())
{
    while (true)
    {
        string strTmpVolName = String.Format("Tab{0}", nTmpVolTabIdx);
        strSqlTmp = String.Format("SELECT * FROM [{0}];", strTmpVolName);

        // Pull the data from 'VolX' into a local DataTable.
        using (DataTable dataTable = UtilsDB.DTFromDB(conn, strTmpVolName, strSqlTmp, false))
        {
            if (dataTable == null)
                break;
            else
                dataTableALL.Merge(dataTable);
        }
        nTmpVolTabIdx++;
    }
    ...
}

This merges the DataTables but they are miss-aligned (padding blank cells onto the appended data set). I could append the columns of the new DataTablevia a loop; but is there an easier/nicer way to do this (perhaps using LINQ)?

这将合并DataTables 但它们未对齐(将空白单元格填充到附加的数据集上)。我可以DataTable通过循环附加新的列;但是有没有更简单/更好的方法来做到这一点(也许使用 LINQ)?

Thanks for your time.

谢谢你的时间。

Edit. To provide the example data sets.

编辑。提供示例数据集。

What I required is

我需要的是

Full DataTable

完整数据表

The individual tables are

单独的表是

Tabs

标签

After the first Merge operation I have the following

在第一次合并操作后,我有以下内容

After Merge

合并后

Thanks again.

再次感谢。

采纳答案by Tim Schmelter

The table has repeating primary keys after the Mergebecause no primary-key was defined. So either specify the PK or try this method here which i've written from scratch(so it's not really tested):

Merge由于未定义主键,该表在 之后具有重复的主键。所以要么指定 PK 要么在这里尝试我从头开始编写的方法(所以它没有真正测试过)

public static DataTable MergeAll(this IList<DataTable> tables, String primaryKeyColumn)
{
    if (!tables.Any())
        throw new ArgumentException("Tables must not be empty", "tables");
    if(primaryKeyColumn != null)
        foreach(DataTable t in tables)
            if(!t.Columns.Contains(primaryKeyColumn))
                throw new ArgumentException("All tables must have the specified primarykey column " + primaryKeyColumn, "primaryKeyColumn");

    if(tables.Count == 1)
        return tables[0];

    DataTable table = new DataTable("TblUnion");
    table.BeginLoadData(); // Turns off notifications, index maintenance, and constraints while loading data
    foreach (DataTable t in tables)
    {
        table.Merge(t); // same as table.Merge(t, false, MissingSchemaAction.Add);
    }
    table.EndLoadData();

    if (primaryKeyColumn != null)
    {
        // since we might have no real primary keys defined, the rows now might have repeating fields
        // so now we're going to "join" these rows ...
        var pkGroups = table.AsEnumerable()
            .GroupBy(r => r[primaryKeyColumn]);
        var dupGroups = pkGroups.Where(g => g.Count() > 1);
        foreach (var grpDup in dupGroups)
        { 
            // use first row and modify it
            DataRow firstRow = grpDup.First();
            foreach (DataColumn c in table.Columns)
            {
                if (firstRow.IsNull(c))
                {
                    DataRow firstNotNullRow = grpDup.Skip(1).FirstOrDefault(r => !r.IsNull(c));
                    if (firstNotNullRow != null)
                        firstRow[c] = firstNotNullRow[c];
                }
            }
            // remove all but first row
            var rowsToRemove = grpDup.Skip(1);
            foreach(DataRow rowToRemove in rowsToRemove)
                table.Rows.Remove(rowToRemove);
        }
    }

    return table;
}

You can call it in this way:

你可以这样调用它:

var tables = new[] { tblA, tblB, tblC };
DataTable TblUnion = tables.MergeAll("c1");

Used this sample data:

使用此示例数据:

var tblA = new DataTable();
tblA.Columns.Add("c1", typeof(int));
tblA.Columns.Add("c2", typeof(int));
tblA.Columns.Add("c3", typeof(string));
tblA.Columns.Add("c4", typeof(char));

var tblB = new DataTable();
tblB.Columns.Add("c1", typeof(int));
tblB.Columns.Add("c5", typeof(int));
tblB.Columns.Add("c6", typeof(string));
tblB.Columns.Add("c7", typeof(char));

var tblC = new DataTable();
tblC.Columns.Add("c1", typeof(int));
tblC.Columns.Add("c8", typeof(int));
tblC.Columns.Add("c9", typeof(string));
tblC.Columns.Add("c10", typeof(char));

tblA.Rows.Add(1, 8500, "abc", 'A');
tblA.Rows.Add(2, 950, "cde", 'B');
tblA.Rows.Add(3, 150, "efg", 'C');
tblA.Rows.Add(4, 850, "ghi", 'D');
tblA.Rows.Add(5, 50, "ijk", 'E');

tblB.Rows.Add(1, 7500, "klm", 'F');
tblB.Rows.Add(2, 900, "mno", 'G');
tblB.Rows.Add(3, 150, "opq", 'H');
tblB.Rows.Add(4, 850, "qrs", 'I');
tblB.Rows.Add(5, 50, "stu", 'J');

tblC.Rows.Add(1, 7500, "uvw", 'K');
tblC.Rows.Add(2, 900, "wxy", 'L');
tblC.Rows.Add(3, 150, "yza", 'M');
tblC.Rows.Add(4, 850, "ABC", 'N');
tblC.Rows.Add(5, 50, "CDE", 'O');

After DataTable.Mergein MergeAll:

之后DataTable.MergeMergeAll

enter image description here

enter image description here

After some modifications to join the rows in MergeAll:

经过一些修改以加入 中的行MergeAll

enter image description here

enter image description here



Update

更新

Since this question arose in one of the comments, if the only relation between two tables is the index of a DataRowin the table and you want to merge both tables according to the index:

由于这个问题出现在其中一个评论中,如果两个表之间的唯一关系是表中 a 的索引,DataRow并且您想根据索引合并两个表:

public static DataTable MergeTablesByIndex(DataTable t1, DataTable t2)
{
    if (t1 == null || t2 == null) throw new ArgumentNullException("t1 or t2", "Both tables must not be null");

    DataTable t3 = t1.Clone();  // first add columns from table1
    foreach (DataColumn col in t2.Columns)
    {
        string newColumnName = col.ColumnName;
        int colNum = 1;
        while (t3.Columns.Contains(newColumnName))
        {
            newColumnName = string.Format("{0}_{1}", col.ColumnName, ++colNum);
        }
        t3.Columns.Add(newColumnName, col.DataType);
    }
    var mergedRows = t1.AsEnumerable().Zip(t2.AsEnumerable(),
        (r1, r2) => r1.ItemArray.Concat(r2.ItemArray).ToArray());
    foreach (object[] rowFields in mergedRows)
        t3.Rows.Add(rowFields);

    return t3;
}

Sample:

样本:

var dt1 = new DataTable();
dt1.Columns.Add("ID", typeof(int));
dt1.Columns.Add("Name", typeof(string));
dt1.Rows.Add(1, "Jon");
var dt2 = new DataTable();
dt2.Columns.Add("Country", typeof(string));
dt2.Rows.Add("US");

var dtMerged = MergeTablesByIndex(dt1, dt2);

The result table contains three columns ID,Name,Countryand a single row: 1 Jon US

结果表包含三列IDNameCountry和单行:1 Jon US