C# 哪个更有效:Dictionary TryGetValue 或 ContainsKey+Item?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/9382681/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-09 07:08:05  来源:igfitidea点击:

What is more efficient: Dictionary TryGetValue or ContainsKey+Item?

c#performancedictionary

提问by Rado

From MSDN's entry on Dictionary.TryGetValue Method:

从 MSDN 上的Dictionary.TryGetValue Method条目:

This method combines the functionality of the ContainsKey method and the Item property.

If the key is not found, then the value parameter gets the appropriate default value for the value type TValue; for example, 0 (zero) for integer types, false for Boolean types, and null for reference types.

Use the TryGetValue method if your code frequently attempts to access keys that are not in the dictionary. Using this method is more efficient than catching the KeyNotFoundException thrown by the Item property.

This method approaches an O(1) operation.

此方法结合了 ContainsKey 方法和 Item 属性的功能。

如果未找到键,则 value 参数为值类型 TValue 获取适当的默认值;例如,整数类型为 0(零),布尔类型为 false,引用类型为 null。

如果您的代码经常尝试访问不在字典中的键,请使用 TryGetValue 方法。使用此方法比捕获 Item 属性抛出的 KeyNotFoundException 更有效。

该方法接近 O(1) 操作。

From the description, it's not clear if it is more efficient or just more convenient than calling ContainsKey and then doing the lookup. Does the implementation of TryGetValuejust call ContainsKey and then Item or is actually more efficient than that by doing a single lookup?

从描述中,不清楚它是否比调用 ContainsKey 然后进行查找更有效或更方便。是TryGetValue只调用 ContainsKey 然后调用 Item 的实现还是实际上比通过单次查找更有效?

In other words, what is more efficient (i.e. which one performs less lookups):

换句话说,什么更有效(即哪个执行更少的查找):

Dictionary<int,int> dict;
//...//
int ival;
if(dict.ContainsKey(ikey))
{
  ival = dict[ikey];
}
else
{
  ival = default(int);
}

or

或者

Dictionary<int,int> dict;
//...//
int ival;
dict.TryGetValue(ikey, out ival);

Note: I am not looking for a benchmark!

注意:我不是在寻找基准!

采纳答案by Reed Copsey

TryGetValuewill be faster.

TryGetValue会更快。

ContainsKeyuses the same check as TryGetValue, which internally refers to the actual entry location. The Itemproperty actually has nearly identical code functionality as TryGetValue, except that it will throw an exception instead of returning false.

ContainsKey使用与 相同的检查TryGetValue,它在内部指的是实际入口位置。该Item属性实际上具有与 几乎相同的代码功能TryGetValue,只是它会抛出异常而不是返回 false。

Using ContainsKeyfollowed by the Itembasically duplicates the lookup functionality, which is the bulk of the computation in this case.

使用ContainsKey后跟Item基本重复查找功能,这是本例中的大部分计算。

回答by CodesInChaos

Why don't you test it?

你为什么不测试一下?

But I'm pretty sure that TryGetValueis faster, because it only does one lookup. Of course this isn't guaranteed, i.e. different implementations might have different performance characteristics.

但我很确定那TryGetValue会更快,因为它只进行一次查找。当然,这并不能保证,即不同的实现可能具有不同的性能特征。

The way I'd implement a dictionary is by creating an internal Findfunction that finds the slot for an item, and then build the rest on top of that.

我实现字典的方法是创建一个内部Find函数来查找项目的插槽,然后在此基础上构建其余部分。

回答by dasblinkenlight

A quick benchmark shows that TryGetValuehas a slight edge:

一个快速的基准测试表明它TryGetValue有一点优势:

    static void Main() {
        var d = new Dictionary<string, string> {{"a", "b"}};
        var start = DateTime.Now;
        for (int i = 0; i != 10000000; i++) {
            string x;
            if (!d.TryGetValue("a", out x)) throw new ApplicationException("Oops");
            if (d.TryGetValue("b", out x)) throw new ApplicationException("Oops");
        }
        Console.WriteLine(DateTime.Now-start);
        start = DateTime.Now;
        for (int i = 0; i != 10000000; i++) {
            string x;
            if (d.ContainsKey("a")) {
                x = d["a"];
            } else {
                x = default(string);
            }
            if (d.ContainsKey("b")) {
                x = d["b"];
            } else {
                x = default(string);
            }
        }
   }

This produces

这产生

00:00:00.7600000
00:00:01.0610000

making the ContainsKey + Itemaccess about 40% slower assuming an even blend of hits and misses.

ContainsKey + Item假设命中和未命中的均匀混合,使访问速度降低约 40%。

Moreover, when I change the program to always miss (i.e. always looking up "b") the two versions become equally fast:

此外,当我将程序更改为始终错过(即始终查找"b")时,这两个版本变得同样快:

00:00:00.2850000
00:00:00.2720000

When I make it "all hits", however, the TryGetValueremains a clear winner:

然而,当我让它“全部成功”时,TryGetValue仍然是一个明显的赢家:

00:00:00.4930000
00:00:00.8110000

回答by davisoa

Making a quick test program, there is definately an improvement using TryGetValue with 1 million items in a dictionary.

制作一个快速测试程序,使用 TryGetValue 和字典中的 100 万个项目肯定会有所改进。

Results:

结果:

ContainsKey + Item for 1000000 hits: 45ms

containsKey + Item for 1000000 点击:45ms

TryGetValue for 1000000 hits: 26ms

1000000 次点击的 TryGetValue:26 毫秒

Here is the test app:

这是测试应用程序:

static void Main(string[] args)
{
    const int size = 1000000;

    var dict = new Dictionary<int, string>();

    for (int i = 0; i < size; i++)
    {
        dict.Add(i, i.ToString());
    }

    var sw = new Stopwatch();
    string result;

    sw.Start();

    for (int i = 0; i < size; i++)
    {
        if (dict.ContainsKey(i))
            result = dict[i];
    }

    sw.Stop();
    Console.WriteLine("ContainsKey + Item for {0} hits: {1}ms", size, sw.ElapsedMilliseconds);

    sw.Reset();
    sw.Start();

    for (int i = 0; i < size; i++)
    {
        dict.TryGetValue(i, out result);
    }

    sw.Stop();
    Console.WriteLine("TryGetValue for {0} hits: {1}ms", size, sw.ElapsedMilliseconds);

}

回答by Rado

Since none of the answers thus far actually answer the question, here is an acceptable answer I found after some research:

由于到目前为止没有一个答案真正回答了这个问题,这是我经过一些研究后发现的一个可以接受的答案:

If you decompile TryGetValue you see that it's doing this:

如果您反编译 TryGetValue,您会看到它正在执行以下操作:

public bool TryGetValue(TKey key, out TValue value)
{
  int index = this.FindEntry(key);
  if (index >= 0)
  {
    value = this.entries[index].value;
    return true;
  }
  value = default(TValue);
  return false;
}

whereas the ContainsKey method is:

而ContainsKey方法是:

public bool ContainsKey(TKey key)
{
  return (this.FindEntry(key) >= 0);
}

so TryGetValue is just ContainsKey plus an array lookup if the item is present.

所以 TryGetValue 只是 ContainsKey 加上数组查找(如果该项目存在)。

Source

来源

It appears that TryGetValue will be almost twice as fast as ContainsKey+Item combination.

看起来 TryGetValue 的速度几乎是 ContainsKey+Item 组合的两倍。

回答by Fwiffo

If you're trying to get out the value from the dictionary, the TryGetValue(key, out value) is the best option, but if you're checking for the presence of the key, for a new insertion, without overwriting old keys, and only with that scope, ContainsKey(key) is the best option, benchmark can confirm this:

如果你想从字典中取出值,TryGetValue(key, out value) 是最好的选择,但如果你正在检查键的存在,新的插入,而不覆盖旧的键,只有在该范围内,ContainsKey(key) 才是最佳选择,基准测试可以确认这一点:

using System;
using System.Threading;
using System.Diagnostics;
using System.Collections.Generic;
using System.Collections;

namespace benchmark
{
class Program
{
    public static Random m_Rand = new Random();
    public static Dictionary<int, int> testdict = new Dictionary<int, int>();
    public static Hashtable testhash = new Hashtable();

    public static void Main(string[] args)
    {
        Console.WriteLine("Adding elements into hashtable...");
        Stopwatch watch = Stopwatch.StartNew();
        for(int i=0; i<1000000; i++)
            testhash[i]=m_Rand.Next();
        watch.Stop();
        Console.WriteLine("Done in {0:F4} -- pause....", watch.Elapsed.TotalSeconds);
        Thread.Sleep(4000);
        Console.WriteLine("Adding elements into dictionary...");
        watch = Stopwatch.StartNew();
        for(int i=0; i<1000000; i++)
            testdict[i]=m_Rand.Next();
        watch.Stop();
        Console.WriteLine("Done in {0:F4} -- pause....", watch.Elapsed.TotalSeconds);
        Thread.Sleep(4000);

        Console.WriteLine("Finding the first free number for insertion");
        Console.WriteLine("First method: ContainsKey");
        watch = Stopwatch.StartNew();
        int intero=0;
        while (testdict.ContainsKey(intero))
        {
            intero++;
        }
        testdict.Add(intero, m_Rand.Next());
        watch.Stop();
        Console.WriteLine("Done in {0:F4} -- added value {1} in dictionary -- pause....", watch.Elapsed.TotalSeconds, intero);
        Thread.Sleep(4000);
        Console.WriteLine("Second method: TryGetValue");
        watch = Stopwatch.StartNew();
        intero=0;
        int result=0;
        while(testdict.TryGetValue(intero, out result))
        {
            intero++;
        }
        testdict.Add(intero, m_Rand.Next());
        watch.Stop();
        Console.WriteLine("Done in {0:F4} -- added value {1} in dictionary -- pause....", watch.Elapsed.TotalSeconds, intero);
        Thread.Sleep(4000);
        Console.WriteLine("Test hashtable");
        watch = Stopwatch.StartNew();
        intero=0;
        while(testhash.Contains(intero))
        {
            intero++;
        }
        testhash.Add(intero, m_Rand.Next());
        watch.Stop();
        Console.WriteLine("Done in {0:F4} -- added value {1} into hashtable -- pause....", watch.Elapsed.TotalSeconds, intero);
        Console.Write("Press any key to continue . . . ");
        Console.ReadKey(true);
    }
}
}

This is a true Example, I have a service that for each "Item" created, it associates a progressive number, this number, each time you create a new item, must be found free, if you delete an Item, the free number becomes free, of course this is not optimized, since I have a static var that caches the current number, but in case you end all the numbers, you can re-begin from 0 to UInt32.MaxValue

这是一个真实的例子,我有一个服务,对于每个创建的“Item”,它关联一个累进编号,这个编号,每次创建新项目时,必须找到空闲,如果删除一个项目,空闲编号变为免费,当然这不是优化的,因为我有一个缓存当前数字的静态变量,但如果你结束所有数字,你可以重新开始从 0 到 UInt32.MaxValue

Test executed:
Adding elements into hashtable...
Done in 0,5908 -- pause....
Adding elements into dictionary...
Done in 0,2679 -- pause....
Finding the first free number for insertion
First method: ContainsKey
Done in 0,0561 -- added value 1000000 in dictionary -- pause....
Second method: TryGetValue
Done in 0,0643 -- added value 1000001 in dictionary -- pause....
Test hashtable
Done in 0,3015 -- added value 1000000 into hashtable -- pause....
Press any key to continue . .

执行的测试:
将元素添加到哈希表中...
在 0,5908 中完成 -- 暂停....
将元素添加到字典中...
在 0,2679 中完成 -- 暂停....
找到第一个插入的空闲数字
第一种方法: ContainsKey
Done in 0,0561 -- 在字典中添加值 1000000 -- 暂停....
第二种方法:TryGetValue
Done in 0,0643 -- 在字典中添加值 1000001 -- 暂停....
测试哈希表
在 0 中完成, 3015 - 将值 1000000 添加到哈希表中 - 暂停....
按任意键继续。.

If some of you may be asking if the ContainsKeys could have an advantage, I've even tried inverting the TryGetValue with Contains key, the result is the same.

如果你们中的一些人可能会问 containsKeys 是否有优势,我什至尝试用 contains 键反转 TryGetValue,结果是一样的。

So, for me, with a final consideration, it all depends on the way the program behaves.

所以,对我来说,最后考虑,这完全取决于程序的行为方式。

回答by AxD

On my machine, with loads of RAM, when run in RELEASE mode (not DEBUG), ContainsKeyequals TryGetValue/try-catchif all entries in the Dictionary<>are found.

在我的机器上,当在 RELEASE 模式(不是 DEBUG)下运行时,有大量 RAM 时,如果找到了中的所有条目,则ContainsKey等于TryGetValue/ 。try-catchDictionary<>

ContainsKeyoutperforms them all by far when there are just a few dictionary entries not found (in my example below, set MAXVALto anything larger than ENTRIESto have some entries missed):

ContainsKey到目前为止,当只有几个字典条目未找到时(在我下面的示例中,设置MAXVAL为大于ENTRIES丢失某些条目的任何值)时,它们的表现都优于它们:

Results:

结果:

Finished evaluation .... Time distribution:
Size: 000010: TryGetValue: 53,24%, ContainsKey: 1,74%, try-catch: 45,01% - Total: 2.006,00
Size: 000020: TryGetValue: 37,66%, ContainsKey: 0,53%, try-catch: 61,81% - Total: 2.443,00
Size: 000040: TryGetValue: 22,02%, ContainsKey: 0,73%, try-catch: 77,25% - Total: 7.147,00
Size: 000080: TryGetValue: 31,46%, ContainsKey: 0,42%, try-catch: 68,12% - Total: 17.793,00
Size: 000160: TryGetValue: 33,66%, ContainsKey: 0,37%, try-catch: 65,97% - Total: 36.840,00
Size: 000320: TryGetValue: 34,53%, ContainsKey: 0,39%, try-catch: 65,09% - Total: 71.059,00
Size: 000640: TryGetValue: 32,91%, ContainsKey: 0,32%, try-catch: 66,77% - Total: 141.789,00
Size: 001280: TryGetValue: 39,02%, ContainsKey: 0,35%, try-catch: 60,64% - Total: 244.657,00
Size: 002560: TryGetValue: 35,48%, ContainsKey: 0,19%, try-catch: 64,33% - Total: 420.121,00
Size: 005120: TryGetValue: 43,41%, ContainsKey: 0,24%, try-catch: 56,34% - Total: 625.969,00
Size: 010240: TryGetValue: 29,64%, ContainsKey: 0,61%, try-catch: 69,75% - Total: 1.197.242,00
Size: 020480: TryGetValue: 35,14%, ContainsKey: 0,53%, try-catch: 64,33% - Total: 2.405.821,00
Size: 040960: TryGetValue: 37,28%, ContainsKey: 0,24%, try-catch: 62,48% - Total: 4.200.839,00
Size: 081920: TryGetValue: 29,68%, ContainsKey: 0,54%, try-catch: 69,77% - Total: 8.980.230,00

Here's my code:

这是我的代码:

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;

    namespace ConsoleApplication1
    {
        class Program
        {
            static void Main(string[] args)
            {
                const int ENTRIES = 10000, MAXVAL = 15000, TRIALS = 100000, MULTIPLIER = 2;
                Dictionary<int, int> values = new Dictionary<int, int>();
                Random r = new Random();
                int[] lookups = new int[TRIALS];
                int val;
                List<Tuple<long, long, long>> durations = new List<Tuple<long, long, long>>(8);

                for (int i = 0;i < ENTRIES;++i) try
                    {
                        values.Add(r.Next(MAXVAL), r.Next());
                    }
                    catch { --i; }

                for (int i = 0;i < TRIALS;++i) lookups[i] = r.Next(MAXVAL);

                Stopwatch sw = new Stopwatch();
                ConsoleColor bu = Console.ForegroundColor;

                for (int size = 10;size <= TRIALS;size *= MULTIPLIER)
                {
                    long a, b, c;

                    Console.ForegroundColor = ConsoleColor.Yellow;
                    Console.WriteLine("Loop size: {0}", size);
                    Console.ForegroundColor = bu;

                    // ---------------------------------------------------------------------
                    sw.Start();
                    for (int i = 0;i < size;++i) values.TryGetValue(lookups[i], out val);
                    sw.Stop();
                    Console.WriteLine("TryGetValue: {0}", a = sw.ElapsedTicks);

                    // ---------------------------------------------------------------------
                    sw.Restart();
                    for (int i = 0;i < size;++i) val = values.ContainsKey(lookups[i]) ? values[lookups[i]] : default(int);
                    sw.Stop();
                    Console.WriteLine("ContainsKey: {0}", b = sw.ElapsedTicks);

                    // ---------------------------------------------------------------------
                    sw.Restart();
                    for (int i = 0;i < size;++i)
                        try { val = values[lookups[i]]; }
                        catch { }
                    sw.Stop();
                    Console.WriteLine("try-catch: {0}", c = sw.ElapsedTicks);

                    // ---------------------------------------------------------------------
                    Console.WriteLine();

                    durations.Add(new Tuple<long, long, long>(a, b, c));
                }

                Console.ForegroundColor = ConsoleColor.Yellow;
                Console.WriteLine("Finished evaluation .... Time distribution:");
                Console.ForegroundColor = bu;

                val = 10;
                foreach (Tuple<long, long, long> d in durations)
                {
                    long sum = d.Item1 + d.Item2 + d.Item3;

                    Console.WriteLine("Size: {0:D6}:", val);
                    Console.WriteLine("TryGetValue: {0:P2}, ContainsKey: {1:P2}, try-catch: {2:P2} - Total: {3:N}", (decimal)d.Item1 / sum, (decimal)d.Item2 / sum, (decimal)d.Item3 / sum, sum);
                    val *= MULTIPLIER;
                }

                Console.WriteLine();
            }
        }
    }

回答by Simon_Weaver

Who cares :-)

谁在乎 :-)

You're probably asking because TryGetValueis a pain to use - so encapsulate it like this with an extension method.

您可能会问,因为TryGetValue使用起来很痛苦 - 所以用扩展方法像这样封装它。

public static class CollectionUtils
{
    // my original method
    // public static V GetValueOrDefault<K, V>(this Dictionary<K, V> dic, K key)
    // {
    //    V ret;
    //    bool found = dic.TryGetValue(key, out ret);
    //    if (found)
    //    {
    //        return ret;
    //    }
    //    return default(V);
    // }


    // EDIT: one of many possible improved versions
    public static TValue GetValueOrDefault<K, V>(this IDictionary<K, V> dictionary, K key)
    {
        // initialized to default value (such as 0 or null depending upon type of TValue)
        TValue value;  

        // attempt to get the value of the key from the dictionary
        dictionary.TryGetValue(key, out value);
        return value;
    }

Then just call :

然后只需调用:

dict.GetValueOrDefault("keyname")

or

或者

(dict.GetValueOrDefault("keyname") ?? fallbackValue) 

回答by debater

All of the answers so far, although good, miss a vital point.

到目前为止,所有答案虽然很好,但都错过了一个关键点。

Methods into the classes of an API (e.g. the .NET framework) form part of an interface definition (not a C# or VB interface, but an interface in the computer science meaning).

进入 API 类(例如 .NET 框架)的方法构成接口定义的一部分(不是 C# 或 VB 接口,而是计算机科学意义上的接口)。

As such, it is usually incorrect to ask whether calling such a method is faster, unless speed is a part of the formal interface definition (which it isn't in this case).

因此,询问调用这样的方法是否更快通常是不正确的,除非速度是正式接口定义的一部分(在这种情况下不是)。

Traditionally this kind of shortcut (combining search and retrieve) is more efficient regardless of language, infrastructure, OS, platform, or machine architecture. It is also more readable, because it expresses your intent explicitly, rather than implying it (from the structure of your code).

传统上,无论语言、基础设施、操作系统、平台或机器架构如何,这种快捷方式(结合搜索和检索)都更有效。它也更具可读性,因为它明确表达了您的意图,而不是暗示它(从您的代码结构中)。

So the answer (from a grizzled old hack) is definitely 'Yes' (TryGetValue is preferable to a combination of ContainsKey and Item [Get] to retrieve a value from a Dictionary).

所以答案(来自一个灰头土脸的老黑客)绝对是“是”(TryGetValue 比从字典中检索值的 ContainsKey 和 Item [Get] 的组合更可取)。

If you think this sounds odd, think of it like this: Even if current implementations of TryGetValue, ContainsKey, and Item [Get] do not yield any speed difference, you can assume it is likely that a future implementation (e.g. .NET v5) will do (TryGetValue will be faster). Think about the lifetime of your software.

如果您觉得这听起来很奇怪,可以这样想:即使 TryGetValue、ContainsKey 和 Item [Get] 的当前实现不会产生任何速度差异,您也可以假设将来的实现(例如 .NET v5)很可能是会做(TryGetValue 会更快)。想想你的软件的生命周期。

As an aside, it is interesting to note that typical modern interface definition technologies still rarely provide any means of formally defining timing constraints. Maybe .NET v5?

顺便说一句,有趣的是,典型的现代接口定义技术仍然很少提供任何正式定义时序约束的方法。也许.NET v5?

回答by Palec

Apart from designing a microbenchmark that will give accurate results in a practical setting, you can inspect the reference source of .NET Framework.

除了设计一个可以在实际环境中给出准确结果的微基准测试之外,您还可以检查 .NET Framework 的参考源。

All of them call the FindEntry(TKey)method that does most of the work and does not memoize its result, so calling TryGetValueis almost twice as fast as ContainsKey+ Item.

它们都调用了FindEntry(TKey)完成大部分工作的方法,并且不会记住其结果,因此调用TryGetValue速度几乎是ContainsKey+ 的两倍Item



The inconvenient interface of TryGetValuecan be adapted using an extension method:

不方便的接口TryGetValue可以使用扩展方法进行调整

using System.Collections.Generic;

namespace Project.Common.Extensions
{
    public static class DictionaryExtensions
    {
        public static TValue GetValueOrDefault<TKey, TValue>(
            this IDictionary<TKey, TValue> dictionary,
            TKey key,
            TValue defaultValue = default(TValue))
        {
            if (dictionary.TryGetValue(key, out TValue value))
            {
                return value;
            }
            return defaultValue;
        }
    }
}

Since C# 7.1, you can replace default(TValue)with plain default. The type is inferred.

从 C# 7.1 开始,您可以替换default(TValue)为普通的default. 类型是推断出来的。

Usage:

用法:

var dict = new Dictionary<string, string>();
string val = dict.GetValueOrDefault("theKey", "value used if theKey is not found in dict");

It returns nullfor reference types whose lookup fails, unless an explicit default value is specified.

它返回null查找失败的引用类型,除非指定了显式默认值。

var dictObj = new Dictionary<string, object>();
object valObj = dictObj.GetValueOrDefault("nonexistent");
Debug.Assert(valObj == null);

val dictInt = new Dictionary<string, int>();
int valInt = dictInt.GetValueOrDefault("nonexistent");
Debug.Assert(valInt == 0);