.net 为什么 DataTable 比 DataReader 快

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/13650443/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-03 16:45:29  来源:igfitidea点击:

Why is DataTable faster than DataReader

.netdata-access-layersqldatareader

提问by Shai Cohen

So we have had a heated debate at work as to which DataAccess route to take: DataTable or DataReader.

因此,我们在工作中就采用哪种 DataAccess 路线进行了激烈的辩论:DataTable 还是 DataReader。

DISCLAIMERI am on the DataReader side and these results have shaken my world.

免责声明我站在 DataReader 一方,这些结果震撼了我的世界。

We ended up writing some benchmarks to test the speed differences. It was generally agreed that a DataReader is faster, but we wanted to see how much faster.

我们最终编写了一些基准测试来测试速度差异。人们普遍认为 DataReader 更快,但我们想看看有多快。

The results surprised us. The DataTable was consistently faster than the DataReader. Approaching twice as fast sometimes.

结果令我们吃惊。DataTable 始终比 DataReader 快。有时接近两倍。

So I turn to you, members of SO. Why, when most of the documentation and even Microsoft, state that a DataReader is faster are our test showing otherwise.

所以我转向你们,SO 的成员。为什么,当大多数文档甚至 Microsoft 都说 DataReader 更快时,我们的测试却显示出相反的情况。

And now for the code:

现在是代码:

The test harness:

测试线束:

    private void button1_Click(object sender, EventArgs e)
    {
        System.Diagnostics.Stopwatch sw = new System.Diagnostics.Stopwatch();
        sw.Start();

        DateTime date = DateTime.Parse("01/01/1900");

        for (int i = 1; i < 1000; i++)
        {

            using (DataTable aDataTable = ArtifactBusinessModel.BusinessLogic.ArtifactBL.RetrieveDTModified(date))
            {
            }
        }
        sw.Stop();
        long dataTableTotalSeconds = sw.ElapsedMilliseconds;

        sw.Restart();


        for (int i = 1; i < 1000; i++)
        {
            List<ArtifactBusinessModel.Entities.ArtifactString> aList = ArtifactBusinessModel.BusinessLogic.ArtifactBL.RetrieveModified(date);

        }

        sw.Stop();

        long listTotalSeconds = sw.ElapsedMilliseconds;

        MessageBox.Show(String.Format("list:{0}, table:{1}", listTotalSeconds, dataTableTotalSeconds));
    }

This is the DAL for the DataReader:

这是 DataReader 的 DAL:

        internal static List<ArtifactString> RetrieveByModifiedDate(DateTime modifiedLast)
        {
            List<ArtifactString> artifactList = new List<ArtifactString>();

            try
            {
                using (SqlConnection conn = SecuredResource.GetSqlConnection("Artifacts"))
                {
                    using (SqlCommand command = new SqlCommand("[cache].[Artifacts_SEL_ByModifiedDate]", conn))
                    {
                        command.CommandType = CommandType.StoredProcedure;
                        command.Parameters.Add(new SqlParameter("@LastModifiedDate", modifiedLast));
                        using (SqlDataReader reader = command.ExecuteReader())
                        {
                            int formNumberOrdinal = reader.GetOrdinal("FormNumber");
                            int formOwnerOrdinal = reader.GetOrdinal("FormOwner");
                            int descriptionOrdinal = reader.GetOrdinal("Description");
                            int descriptionLongOrdinal = reader.GetOrdinal("DescriptionLong");
                            int thumbnailURLOrdinal = reader.GetOrdinal("ThumbnailURL");
                            int onlineSampleURLOrdinal = reader.GetOrdinal("OnlineSampleURL");
                            int lastModifiedMetaDataOrdinal = reader.GetOrdinal("LastModifiedMetaData");
                            int lastModifiedArtifactFileOrdinal = reader.GetOrdinal("LastModifiedArtifactFile");
                            int lastModifiedThumbnailOrdinal = reader.GetOrdinal("LastModifiedThumbnail");
                            int effectiveDateOrdinal = reader.GetOrdinal("EffectiveDate");
                            int viewabilityOrdinal = reader.GetOrdinal("Viewability");
                            int formTypeOrdinal = reader.GetOrdinal("FormType");
                            int inventoryTypeOrdinal = reader.GetOrdinal("InventoryType");
                            int createDateOrdinal = reader.GetOrdinal("CreateDate");

                            while (reader.Read())
                            {
                                ArtifactString artifact = new ArtifactString();
                                ArtifactDAL.Map(formNumberOrdinal, formOwnerOrdinal, descriptionOrdinal, descriptionLongOrdinal, formTypeOrdinal, inventoryTypeOrdinal, createDateOrdinal, thumbnailURLOrdinal, onlineSampleURLOrdinal, lastModifiedMetaDataOrdinal, lastModifiedArtifactFileOrdinal, lastModifiedThumbnailOrdinal, effectiveDateOrdinal, viewabilityOrdinal, reader, artifact);
                                artifactList.Add(artifact);
                            }
                        }
                    }
                }
            }
            catch (ApplicationException)
            {
                throw;
            }
            catch (Exception e)
            {
                string errMsg = String.Format("Error in ArtifactDAL.RetrieveByModifiedDate. Date: {0}", modifiedLast);
                Logging.Log(Severity.Error, errMsg, e);
                throw new ApplicationException(errMsg, e);
            }

            return artifactList;
        }
    internal static void Map(int? formNumberOrdinal, int? formOwnerOrdinal, int? descriptionOrdinal, int? descriptionLongOrdinal, int? formTypeOrdinal, int? inventoryTypeOrdinal, int? createDateOrdinal,
        int? thumbnailURLOrdinal, int? onlineSampleURLOrdinal, int? lastModifiedMetaDataOrdinal, int? lastModifiedArtifactFileOrdinal, int? lastModifiedThumbnailOrdinal,
        int? effectiveDateOrdinal, int? viewabilityOrdinal, IDataReader dr, ArtifactString entity)
    {

            entity.FormNumber = dr[formNumberOrdinal.Value].ToString();
            entity.FormOwner = dr[formOwnerOrdinal.Value].ToString();
            entity.Description = dr[descriptionOrdinal.Value].ToString();
            entity.DescriptionLong = dr[descriptionLongOrdinal.Value].ToString();
            entity.FormType = dr[formTypeOrdinal.Value].ToString();
            entity.InventoryType = dr[inventoryTypeOrdinal.Value].ToString();
            entity.CreateDate = DateTime.Parse(dr[createDateOrdinal.Value].ToString());
            entity.ThumbnailURL = dr[thumbnailURLOrdinal.Value].ToString();
            entity.OnlineSampleURL = dr[onlineSampleURLOrdinal.Value].ToString();
            entity.LastModifiedMetaData = dr[lastModifiedMetaDataOrdinal.Value].ToString();
            entity.LastModifiedArtifactFile = dr[lastModifiedArtifactFileOrdinal.Value].ToString();
            entity.LastModifiedThumbnail = dr[lastModifiedThumbnailOrdinal.Value].ToString();
            entity.EffectiveDate = dr[effectiveDateOrdinal.Value].ToString();
            entity.Viewability = dr[viewabilityOrdinal.Value].ToString();
    }

This is the DAL for the DataTable:

这是 DataTable 的 DAL:

        internal static DataTable RetrieveDTByModifiedDate(DateTime modifiedLast)
        {
            DataTable dt= new DataTable("Artifacts");

            try
            {
                using (SqlConnection conn = SecuredResource.GetSqlConnection("Artifacts"))
                {
                    using (SqlCommand command = new SqlCommand("[cache].[Artifacts_SEL_ByModifiedDate]", conn))
                    {
                        command.CommandType = CommandType.StoredProcedure;
                        command.Parameters.Add(new SqlParameter("@LastModifiedDate", modifiedLast));

                        using (SqlDataAdapter da = new SqlDataAdapter(command))
                        {
                            da.Fill(dt);
                        }
                    }
                }
            }
            catch (ApplicationException)
            {
                throw;
            }
            catch (Exception e)
            {
                string errMsg = String.Format("Error in ArtifactDAL.RetrieveByModifiedDate. Date: {0}", modifiedLast);
                Logging.Log(Severity.Error, errMsg, e);
                throw new ApplicationException(errMsg, e);
            }

            return dt;
        }

The results:

结果:

For 10 iterations within the Test Harness

对于测试框架内的 10 次迭代

For 10 iterations within the test harness

对于测试工具内的 10 次迭代

For 1000 iterations within the Test Harness

对于测试框架内的 1000 次迭代

enter image description here

在此处输入图片说明

These results are the second run, to mitigate the differences due to creating the connection.

这些结果是第二次运行,以减轻由于创建连接而产生的差异。

回答by Joel Coehoorn

I see three issues:

我看到三个问题:

  1. the way you use a DataReader negates it's big single-item-in-memory advantage by converting it to list,
  2. you're running the benchmark in an environment that differs significantly from production in a way that favors the DataTable, and
  3. you're spending time converting DataReader record to Artifact objects that is not duplicated in the DataTable code.
  1. 您使用 DataReader 的方式通过将其转换为列表来否定它在内存中的单项优势,
  2. 您正在以有利于 DataTable 的方式在与生产环境显着不同的环境中运行基准测试,并且
  3. 您花时间将 DataReader 记录转换为在 DataTable 代码中没有重复的 Artifact 对象。

The main advantage of a DataReader is that you don't have to load everything into memory at once. This should be a huge advantage for DataReader in web apps, where memory, rather than cpu, is often the bottleneck, but by adding each row to a generic list you've negated this. That also means that even after you change your code to only use one record at a time, the difference might not show up on your benchmarks because you're running them on a system with lot of free memory, which will favor the DataTable. Also, the DataReader version is spending time parsing the results into Artifact objects that the DataTable has not done yet.

DataReader 的主要优点是您不必一次将所有内容加载到内存中。对于 Web 应用程序中的 DataReader 来说,这应该是一个巨大的优势,其中内存而不是 CPU 通常是瓶颈,但是通过将每一行添加到通用列表中,您已经否定了这一点。这也意味着即使您将代码更改为一次仅使用一条记录,差异也可能不会出现在您的基准测试中,因为您在具有大量可用内存的系统上运行它们,这将有利于 DataTable。此外,DataReader 版本花费时间将结果解析为 DataTable 尚未完成的 Artifact 对象。

To fix the DataReader usage issue, change List<ArtifactString>to IEnumerable<ArtifactString>everywhere, and in your DataReader DAL change this line:

要修复 DataReader 使用问题,请更改List<ArtifactString>IEnumerable<ArtifactString>无处不在,并在您的 DataReader DAL 中更改此行:

artifactList.Add(artifact);

to this:

对此:

yield return artifact;

This means you also need to add code that iterates over the results to your DataReader test harness to keep things fair.

这意味着您还需要将迭代结果的代码添加到 DataReader 测试工具以保持公平。

I'm not sure how to adjust the benchmark to create a more typical scenario that is fair to both DataTable and DataReader, except to build two versions of your page, and serve up each version for an hour under a similar production-level load so that we have real memory pressure... do some real A/B testing. Also, make sure you cover converting the DataTable rows to Artifacts... and if the argument is that you need to do this for a DataReader, but not for a DataTable, that is just plain wrong.

我不知道如何调整基准以创建一个对 DataTable 和 DataReader 都公平的更典型的场景,除了构建页面的两个版本,并在类似的生产级负载下为每个版本提供一个小时,以便我们有真正的内存压力……做一些真正的 A/B 测试。另外,请确保您涵盖了将 DataTable 行转换为 Artifacts... 如果论点是您需要为 DataReader 执行此操作,而不是为 DataTable 执行此操作,那完全是错误的。

回答by Joe

SqlDataAdapter.Fillcalls SqlCommand.ExecuteReader with CommandBehavior.SequentialAccessset. Maybe that's enough to make the difference.

SqlDataAdapter.Fill使用CommandBehavior.SequentialAccessset调用 SqlCommand.ExecuteReader 。也许这足以改变。

As an aside, I see your IDbReaderimplementation caches the ordinals of each field for performance reasons. An alternative to this approach is to use the DbEnumeratorclass.

IDbReader顺便说一句,我看到您的实现出于性能原因缓存了每个字段的序数。这种方法的替代方法是使用DbEnumerator类。

DbEnumeratorcaches a field name -> ordinal dictionary internally, so gives you much of the performance benefit of using ordinals with the simplicity of using field names:

DbEnumerator在内部缓存字段名称 -> 序数字典,因此为您提供了使用序数的许多性能优势以及使用字段名称的简单性:

foreach(IDataRecord record in new DbEnumerator(reader))
{
    artifactList.Add(new ArtifactString() {
        FormNumber = (int) record["FormNumber"],
        FormOwner = (int) record["FormOwner"],
        ...
    });
}

or even:

甚至:

return new DbEnumerator(reader)
    .Select(record => new ArtifactString() {
        FormNumber = (int) record["FormNumber"],
        FormOwner = (int) record["FormOwner"],
        ...
      })
    .ToList();

回答by granadaCoder

2 things could be slowing you down.

有两件事可能会让你慢下来。

First, I wouldn't do a "find ordinal by name" for each column, if you're interested in performance. Note, the "layout" class below to take care of this lookup. And the layout providers later readability, instead of using "0", "1", "2", etc. And it allows me to code to an Interface (IDataReader) instead of the Concrete.

首先,如果您对性能感兴趣,我不会为每列执行“按名称查找序数”。请注意,下面的“布局”类负责此查找。并且布局提供者以后的可读性,而不是使用“0”、“1”、“2”等。它允许我编码到接口(IDataReader)而不是具体。

Second. You're using the ".Value" property. (and I would think this does make a difference)

第二。您正在使用“.Value”属性。(我认为这确实有所作为)

You'll get better results (IMHO) if you use the concrete datatype "getters".

如果您使用具体数据类型“getter”,您将获得更好的结果(恕我直言)。

GetString, GetDateTime, GetInt32, etc,etc.

GetString、GetDateTime、GetInt32 等。

Here is my typical IDataReader to DTO/POCO code.

这是我典型的 IDataReader 到 DTO/POCO 代码。

[Serializable]
public partial class Employee
{
    public int EmployeeKey { get; set; }                   
    public string LastName { get; set; }                   
    public string FirstName { get; set; }   
    public DateTime HireDate  { get; set; }  
}

[Serializable]
public class EmployeeCollection : List<Employee>
{
}   

internal static class EmployeeSearchResultsLayouts
{
    public static readonly int EMPLOYEE_KEY = 0;
    public static readonly int LAST_NAME = 1;
    public static readonly int FIRST_NAME = 2;
    public static readonly int HIRE_DATE = 3;
}


    public EmployeeCollection SerializeEmployeeSearchForCollection(IDataReader dataReader)
    {
        Employee item = new Employee();
        EmployeeCollection returnCollection = new EmployeeCollection();
        try
        {

            int fc = dataReader.FieldCount;//just an FYI value

            int counter = 0;//just an fyi of the number of rows

            while (dataReader.Read())
            {

                if (!(dataReader.IsDBNull(EmployeeSearchResultsLayouts.EMPLOYEE_KEY)))
                {
                    item = new Employee() { EmployeeKey = dataReader.GetInt32(EmployeeSearchResultsLayouts.EMPLOYEE_KEY) };

                    if (!(dataReader.IsDBNull(EmployeeSearchResultsLayouts.LAST_NAME)))
                    {
                        item.LastName = dataReader.GetString(EmployeeSearchResultsLayouts.LAST_NAME);
                    }

                    if (!(dataReader.IsDBNull(EmployeeSearchResultsLayouts.FIRST_NAME)))
                    {
                        item.FirstName = dataReader.GetString(EmployeeSearchResultsLayouts.FIRST_NAME);
                    }

                    if (!(dataReader.IsDBNull(EmployeeSearchResultsLayouts.HIRE_DATE)))
                    {
                        item.HireDate = dataReader.GetDateTime(EmployeeSearchResultsLayouts.HIRE_DATE);
                    }


                    returnCollection.Add(item);
                }

                counter++;
            }

            return returnCollection;

        }
        //no catch here... see  http://blogs.msdn.com/brada/archive/2004/12/03/274718.aspx
        finally
        {
            if (!((dataReader == null)))
            {
                try
                {
                    dataReader.Close();
                }
                catch
                {
                }
            }
        }
    }

回答by Geoff

I don't think it will account for all the difference, but try something like this to eliminate some of the extra variables and function calls:

我不认为它会解释所有的差异,但尝试这样的事情来消除一些额外的变量和函数调用:

using (SqlDataReader reader = command.ExecuteReader())
{
    while (reader.Read())
    {
        artifactList.Add(new ArtifactString
        {
            FormNumber = reader["FormNumber"].ToString(),
            //etc
        });
     }
}