HashSet<T> versus Dictionary<K, V> with respect to search time to find whether an item exists
Source: http://stackoverflow.com/questions/2728500/
License: this content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on Stack Overflow.
Asked by halivingston
HashSet<T> t = new HashSet<T>();
// add 10 million items
Dictionary<K, V> t = new Dictionary<K, V>();
// add 10 million items.
Whose .Contains method will return quicker?
Just to clarify, my requirement is I have 10 million objects (well, strings really) that I need to check if they exist in the data structure. I will NEVER iterate.
Answered by had
HashSet vs List vs Dictionary performance test, taken from here.
Add 1000000 objects (without checking duplicates)


Contains check for half the objects of a collection of 10000


Remove half the objects of a collection of 10000
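The linked test itself is not reproduced on this page, so the following is only a rough sketch of how the "Contains check for half the objects of a collection of 10000" case might be timed. The key format, variable names, and the use of Stopwatch are assumptions and not the original benchmark. Timings depend on the machine and runtime, so none are quoted here.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Assumed size, matching the "collection of 10000" heading above.
const int itemCount = 10_000;

var list = new List<string>();
var set = new HashSet<string>();
var map = new Dictionary<string, bool>();
for (int i = 0; i < itemCount; i++)
{
    string key = "item" + i;   // hypothetical key format
    list.Add(key);
    set.Add(key);
    map[key] = true;
}

// Probe every second key, i.e. half the objects, built before timing starts.
var probes = new List<string>();
for (int i = 0; i < itemCount; i += 2)
{
    probes.Add("item" + i);
}

var timer = Stopwatch.StartNew();
int listHits = 0;
foreach (string probe in probes)
{
    if (list.Contains(probe)) listHits++;      // linear scan per lookup
}
Console.WriteLine($"List.Contains:          {timer.Elapsed.TotalMilliseconds} ms");

timer.Restart();
int setHits = 0;
foreach (string probe in probes)
{
    if (set.Contains(probe)) setHits++;        // hash lookup
}
Console.WriteLine($"HashSet.Contains:       {timer.Elapsed.TotalMilliseconds} ms");

timer.Restart();
int mapHits = 0;
foreach (string probe in probes)
{
    if (map.ContainsKey(probe)) mapHits++;     // hash lookup
}
Console.WriteLine($"Dictionary.ContainsKey: {timer.Elapsed.TotalMilliseconds} ms");
```

A single Stopwatch pass like this is noisy. For numbers worth trusting, repeat the loops several times or use a dedicated benchmarking harness.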


Answered by Jon Skeet
I assume you mean Dictionary<TKey, TValue> in the second case? Hashtable is a non-generic class.
You should choose the right collection for the job based on your actual requirements. Do you actually want to map each key to a value? If so, use Dictionary<,>. If you only care about it as a set, use HashSet<>.
I would expect HashSet<T>.Contains and Dictionary<TKey, TValue>.ContainsKey (which are the comparable operations, assuming you're using your dictionary sensibly) to basically perform the same - they're using the same algorithm, fundamentally. I guess with the entries in Dictionary<,> being larger you end up with a greater likelihood of blowing the cache with Dictionary<,> than with HashSet<>, but I'd expect that to be insignificant compared with the pain of choosing the wrong data type simply in terms of what you're trying to achieve.
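As a minimal sketch of the two comparable operations, the snippet below checks the same strings against a HashSet<string> and against a Dictionary<string, bool> used only for membership. The sample words and the bool dummy value are assumptions for illustration. The question's real data is 10 million strings.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample data standing in for the 10 million strings.
string[] words = { "alpha", "beta", "gamma" };

// A set stores only the items themselves.
var seen = new HashSet<string>(words);

// A dictionary used purely for membership needs a dummy value per key.
var seenMap = new Dictionary<string, bool>();
foreach (string word in words)
{
    seenMap[word] = true;
}

// The comparable operations: both hash the string and probe the table.
bool inSet = seen.Contains("beta");
bool inMap = seenMap.ContainsKey("beta");
bool missingFromSet = !seen.Contains("delta");
bool missingFromMap = !seenMap.ContainsKey("delta");

Console.WriteLine($"{inSet} {inMap} {missingFromSet} {missingFromMap}");
// prints "True True True True"
```

The dummy value is the giveaway: if nothing meaningful belongs in the value slot, the set is the type that matches the intent.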
Answered by ripvlan
From MSDN documentation for Dictionary<TKey,TValue>
"Retrieving a value by using its key is very fast, close to O(1), because the Dictionary class is implemented as a hash table."
With a note:
"The speed of retrieval depends on the quality of the hashing algorithm of the type specified for TKey"
I know your question/post is old - but while looking for an answer to a similar question I stumbled across this.
Hope this helps. Scroll down to the Remarks section for more details. https://msdn.microsoft.com/en-us/library/xfhwa508(v=vs.110).aspx
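To illustrate the note about hash quality, here is a sketch that counts how many equality comparisons one failed lookup needs under two hash functions. CountingComparer is a hypothetical helper written for this example and is not part of .NET. The constant hash is deliberately terrible, and the collection size of 1000 is arbitrary.

```csharp
using System;
using System.Collections.Generic;

const int itemCount = 1000;   // arbitrary size for the sketch

var good = new CountingComparer(useConstantHash: false);
var bad = new CountingComparer(useConstantHash: true);
var goodSet = new HashSet<string>(good);
var badSet = new HashSet<string>(bad);
for (int i = 0; i < itemCount; i++)
{
    goodSet.Add("item" + i);
    badSet.Add("item" + i);
}

// Count only the comparisons made by a single lookup of an absent key.
good.EqualsCalls = 0;
bad.EqualsCalls = 0;
bool foundGood = goodSet.Contains("missing");
bool foundBad = badSet.Contains("missing");

Console.WriteLine($"string hash:   found={foundGood}, Equals calls={good.EqualsCalls}");
Console.WriteLine($"constant hash: found={foundBad}, Equals calls={bad.EqualsCalls}");

// Hypothetical comparer: counts Equals calls and can force every key
// to share one hash code, which puts all entries in a single chain.
sealed class CountingComparer : IEqualityComparer<string>
{
    private readonly bool _useConstantHash;
    public int EqualsCalls;

    public CountingComparer(bool useConstantHash) => _useConstantHash = useConstantHash;

    public bool Equals(string? x, string? y)
    {
        EqualsCalls++;
        return string.Equals(x, y);
    }

    public int GetHashCode(string obj) => _useConstantHash ? 0 : obj.GetHashCode();
}
```

With the constant hash, the lookup has to compare against every stored entry, so it behaves like a linear search. With the built-in string hash, it should need few or no comparisons. Both sets still return correct answers, and only the cost differs.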
Answered by Andrew Bezzub
These are different data structures. Also, there is no generic version of Hashtable.
HashSet contains values of type T, while Hashtable (or Dictionary) contains key-value pairs. So you should choose the collection based on what data you need to store.
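A small sketch of that difference in stored data, using made-up names (tags, stock) and values purely for illustration:

```csharp
using System;
using System.Collections.Generic;

// A set holds values only; Add reports whether the value was new.
var tags = new HashSet<string>();
bool addedFirst = tags.Add("red");
bool addedAgain = tags.Add("red");   // duplicate, so nothing more is stored

// A dictionary holds key-value pairs; each key maps to a value.
var stock = new Dictionary<string, int> { ["red"] = 3 };
bool hasRed = stock.TryGetValue("red", out int redCount);
bool hasBlue = stock.TryGetValue("blue", out int blueCount);

Console.WriteLine($"{addedFirst} {addedAgain} {tags.Count}");
// prints "True False 1"
Console.WriteLine($"{hasRed} {redCount} {hasBlue} {blueCount}");
// prints "True 3 False 0"
```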

