C# 为什么字典比列表快这么多?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/16977694/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-10 08:16:23  来源:igfitidea点击:

Why is dictionary so much faster than list?

c#.netperformance

提问by Unforgiven

I am testing the speed of getting data from Dictionary VS list.
I've used this code to test :

我正在测试从 Dictionary VS 列表中获取数据的速度。
我用这段代码来测试:

    internal class Program
{
    private static void Main(string[] args)
    {
        var stopwatch = new Stopwatch();
        List<Grade> grades = Grade.GetData().ToList();
        List<Student> students = Student.GetStudents().ToList();

        stopwatch.Start();
        foreach (Student student in students)
        {
            student.Grade = grades.Single(x => x.StudentId == student.Id).Value;
        }
        stopwatch.Stop();
        Console.WriteLine("Using list {0}", stopwatch.Elapsed);
        stopwatch.Reset();
        students = Student.GetStudents().ToList();
        stopwatch.Start();
        Dictionary<Guid, string> dic = Grade.GetData().ToDictionary(x => x.StudentId, x => x.Value);
        foreach (Student student in students)
        {
            student.Grade = dic[student.Id];
        }
        stopwatch.Stop();
        Console.WriteLine("Using dictionary {0}", stopwatch.Elapsed);
        Console.ReadKey();
    }
}

public class GuidHelper
{
    public static List<Guid> ListOfIds=new List<Guid>();

    static GuidHelper()
    {
        for (int i = 0; i < 10000; i++)
        {
            ListOfIds.Add(Guid.NewGuid());
        }
    }
}


public class Grade
{
    public Guid StudentId { get; set; }
    public string Value { get; set; }

    public static IEnumerable<Grade> GetData()
    {
        for (int i = 0; i < 10000; i++)
        {
            yield return new Grade
                             {
                                 StudentId = GuidHelper.ListOfIds[i], Value = "Value " + i
                             };
        }
    }
}

public class Student
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public string Grade { get; set; }

    public static IEnumerable<Student> GetStudents()
    {
        for (int i = 0; i < 10000; i++)
        {
            yield return new Student
                             {
                                 Id = GuidHelper.ListOfIds[i],
                                 Name = "Name " + i
                             };
        }
    }
}

There is list of students and grades in memory they have StudentId in common.
In first way I tried to find Grade of a student using LINQ on a list that takes near 7 seconds on my machine and in another way first I converted List into dictionary then finding grades of student from dictionary using key that takes less than a second . enter image description here

记忆中有学生和成绩的列表,他们有共同的 StudentId。
在第一种方式中,我尝试使用 LINQ 在我的机器上需要近 7 秒的列表上查找学生的成绩,而在另一种方式中,首先我将 List 转换为字典,然后使用不到一秒的键从字典中查找学生的成绩。 在此处输入图片说明

采纳答案by Patashu

When you do this:

当你这样做时:

student.Grade = grades.Single(x => x.StudentId == student.Id).Value;

student.Grade = grades.Single(x => x.StudentId == student.Id).Value;

As written it has to enumerate the entire Listuntil it finds the entry in the List that has the correct studentId (does entry 0 match the lambda? No... Does entry 1 match the lambda? No... etc etc). This is O(n). Since you do it once for every student, it is O(n^2).

正如所写的那样,它必须枚举整个List列表,直到它在 List 中找到具有正确 studentId 的条目(条目 0 是否与 lambda 匹配?否...条目 1 是否与 lambda 匹配?否...等)。这是 O(n)。因为你为每个学生做一次,所以它是 O(n^2)。

However when you do this:

但是,当您这样做时:

student.Grade = dic[student.Id];

student.Grade = dic[student.Id];

If you want to find a certain element by key in a dictionary, it can instantly jump to where it is in the dictionary - this is O(1). O(n) for doing it for every student. (If you want to know how this is done - Dictionary runs a mathematical operation on the key, which turns it into a value that is a place inside the dictionary, which is the same place it put it when it was inserted)

如果你想在字典中按键查找某个元素,它可以立即跳转到它在字典中的位置——这是 O(1)。O(n) 为每个学生做这件事。(如果您想知道这是如何完成的 - 字典对键运行数学运算,将其转换为字典内的值,即插入时放置它的位置)

So, dictionary is faster because you used a better algorithm.

因此,字典更快,因为您使用了更好的算法。

回答by Ron.B.I

When using Dictionary you are using a keyto retrieve your information, which enables it to find it more efficiently, with List you are using SingleLinq expression, which since it is a list, has no other option other than to look in entire listfor wanted the item.

使用 Dictionary 时,您使用的是一个来检索您的信息,这使它能够更有效地找到它,而使用 List 您使用的是SingleLinq 表达式,因为它是一个列表,除了在整个列表中查找想要的信息之外别无选择项目。

回答by Quinton Bernhardt

Dictionary uses hashing to search for the data. Each item in the dictionary is stored in buckets of items that contain the same hash. It's a lot quicker.

字典使用散列来搜索数据。字典中的每个项目都存储在包含相同散列的项目桶中。它快了很多。

Try sorting your list, it will be a a bit quicker then.

尝试对您的列表进行排序,这样会更快一些。

回答by JeffRSon

Dictionary is based on a hash table which is a rather efficient algorithm to look up things. In a list you have to go element by element in order to find something.

字典基于哈希表,这是一种相当有效的查找事物的算法。在列表中,您必须逐个元素地查找某些内容。

It's all a matter of data organization...

这完全是数据组织的问题...

回答by Erik Funkenbusch

The reason is because a dictionary is a lookup, while a list is an iteration.

原因是因为字典是查找,而列表是迭代。

Dictionary uses a hash lookup, while your list requires walking through the list until it finds the result from beginning to the result each time.

字典使用哈希查找,而您的列表需要遍历列表,直到每次从头到结果找到结果。

to put it another way. The list will be faster than the dictionary on the first item, because there's nothing to look up. it's the first item, boom.. it's done. but the second time the list has to look through the first item, then the second item. The third time through it has to look through the first item, then the second item, then the third item.. etc..

换一种方式。在第一项上,列表会比字典快,因为没有什么可查的。这是第一个项目,boom.. 完成了。但是第二次列表必须查看第一项,然后是第二项。第三次通过它必须查看第一个项目,然后是第二个项目,然后是第三个项目......等等。

So each iteration the lookup takes more and more time. The larger the list, the longer it takes. While the dictionary is always a more or less fixed lookup time (it also increases as the dictionary gets larger, but at a much slower pace, so by comparison it's almost fixed).

所以每次迭代查找需要越来越多的时间。列表越大,需要的时间越长。虽然字典总是或多或少是固定的查找时间(它也会随着字典变大而增加,但速度要慢得多,因此相比之下,它几乎是固定的)。

回答by aiapatag

When it comes to lookup of data, a keyed collection is always faster than a non-keyed collection. This is because a non-keyed collection will have to enumerate its elements to find what you are looking for. While in a keyed collection you can just access the element directly via the key.

在查找数据时,键控集合总是比非键控集合快。这是因为非键控集合必须枚举其元素才能找到您要查找的内容。在键控集合中,您可以直接通过键访问元素。

These are some nice articles for comparing list to dictionary.

这些是将列表与字典进行比较的一些不错的文章。

Here. And this one.

在这里。而这一个

回答by Nobilis

A dictionary uses a hash table, it is a great data structure as it maps an input to a corresponding output almost instantaneously, it has a complexity of O(1) as already pointed out which means more or less immediate retrieval.

字典使用哈希表,它是一种很棒的数据结构,因为它几乎立即将输入映射到相应的输出,正如已经指出的那样,它的复杂度为 O(1),这意味着或多或少是立即检索。

The cons of it is that for the sake of performance you need lots of space in advance (depending on the implementation be it separate chaining or linear/quadratic probing you may need at least as much as you're planning to store, probably double in the latter case) and you need a good hashing algorithm that maps uniquely your input ("John Smith") to a corresponding output such as a position in an array (hash_array[34521]).

它的缺点是,为了提高性能,您需要提前提供大量空间(取决于实现,无论是单独的链接还是线性/二次探测,您可能需要至少与您计划存储的一样多,可能加倍后一种情况),并且您需要一个良好的散列算法,将您的输入 ( "John Smith")唯一地映射到相应的输出,例如数组 ( hash_array[34521]) 中的位置。

Also listing the entries in a sorted order is a problem. If I may quote Wikipedia:

按排序顺序列出条目也是一个问题。如果我可以引用维基百科:

Listing all n entries in some specific order generally requires a separate sorting step, whose cost is proportional to log(n) per entry.

以某种特定顺序列出所有 n 个条目通常需要一个单独的排序步骤,其成本与每个条目的 log(n) 成正比。

Have a read on linear probingand separate chainingfor some gorier details :)

阅读有关线性探测单独链接的一些更详细的信息:)

回答by howserss

From MSDN - Dictionary mentions close to O(1) but I think it depends on the types involved.

从 MSDN - 字典提到接近 O(1) 但我认为这取决于所涉及的类型。

The Dictionary(TKey,TValue) generic class provides a mapping from a set of keys to a set of values. Each addition to the dictionary consists of a value and its associated key. Retrieving a value by using its key is very fast, close to O(1), because the Dictionary class is implemented as a hash table.

Dictionary(TKey,TValue) 泛型类提供从一组键到一组值的映射。字典中的每个添加项都包含一个值及其关联的键。使用键检索值非常快,接近 O(1),因为 Dictionary 类是作为哈希表实现的。

Note: The speed of retrieval depends on the quality of the hashing algorithm of the type specified for TKey.

注意:检索速度取决于为 TKey 指定的类型的散列算法的质量。

List(TValue) does not implement a hash lookup so it is sequential and the performance is O(n). It also depends on the types involved and boxing/unboxing needs to be considered.

List(TValue) 没有实现哈希查找,所以它是顺序的,性能是 O(n)。它还取决于所涉及的类型,需要考虑装箱/拆箱。