Java: What are the disadvantages of hashmaps?

Note: this page is a side-by-side Chinese-English translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute the content to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/6924852/

Time: 2020-10-30 17:55:15  Source: igfitidea

What are the disadvantages to hashmaps?

Tags: c#, java, data-structures

Asked by Jean

Whatever language I use, I always aim to use the equivalent of a hashmap. However, I was going through some practice interview questions, and one of them asked: what are the limitations of a hashmap?

The only reason I could think of is limited main memory, but that limitation would apply not only to hashmaps but also to ArrayLists and so on.

Accepted answer by hakon

There's also the potential for collisions. The cost of writing and/or executing the hash function could be high if the requirement for collision avoidance is strict, or if you have a small hash space.
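
To make the collision point concrete, here is a minimal Java sketch. The class and method names (CollisionDemo, collide, storeBoth) are made up for illustration. It relies on the fact that the strings "Aa" and "BB" have the same String.hashCode() value, and shows that a HashMap still stores both entries correctly; the collision costs extra comparison work, not correctness.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; the names are illustrative only.
class CollisionDemo {
    // Returns true when two distinct strings share the same hash code.
    static boolean collide(String a, String b) {
        return !a.equals(b) && a.hashCode() == b.hashCode();
    }

    // Stores both keys and returns how many entries the map holds afterwards.
    static int storeBoth(String a, String b) {
        Map<String, Integer> map = new HashMap<>();
        map.put(a, 1);
        map.put(b, 2);
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(collide("Aa", "BB"));   // true: both strings hash to 2112
        System.out.println(storeBoth("Aa", "BB")); // 2: equals() tells the keys apart
    }
}
```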

Answer by Paul Ruane

  1. Whilst hash tables have constant-time insertion, a hash table will occasionally need to grow its internal structure and re-bucket its entries. This operation has a cost proportional to the current size of the hash table. As a result, insertion time is not always consistent: insertion is normally constant, O(1), but occasionally you will notice a linear delay, O(n), as the table grows. (This behaviour has led some to suggest favouring a tree over a hash table in the default/naïve case.)
  2. You need to make sure the hashing algorithm of the items you are adding is sound. This means that for an arbitrary set of elements, the resultant hash codes are spread well across the range of the hash-code type (in Java and C# this is int). If you have a number of items with the same hash code (zero, anyone?) then your hash table will degrade into an elaborate linked list and performance will decrease dramatically.
  3. You need to ensure that the hash code of your items does not change over time and that the equality method (Java's equals() or .NET's Equals()) is implemented to compare the same set of fields used for the hash code. (Ideally this would mean the objects you add to the table are immutable. Alternatively, you may make sure that any mutable fields have no bearing on the hash-code calculation and the equals method, which is a risky strategy.) With changing hash codes, the table will not be able to find the entries you have already added to it when you later come to retrieve them.
  4. Hash tables do not, generally, preserve ordering, be it natural ordering or order of insertion. (Those that do typically employ a parallel structure to maintain the ordering, or else perform a relatively expensive sort at the time of iteration.)
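
Point 3 can be sketched in Java as follows. The Point class and the MutableKeyDemo wrapper are hypothetical names chosen for this illustration; the sketch assumes a key whose hashCode() depends on mutable fields and shows that the entry becomes unreachable once the key changes while it is inside a java.util.HashMap.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical demo class; the names are illustrative only.
class MutableKeyDemo {
    // Key type whose hash code depends on mutable fields (the risky case).
    static final class Point {
        int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }

        // Uses the same fields as equals(), as the contract requires.
        @Override public int hashCode() { return Objects.hash(x, y); }
    }

    // Returns true if the entry can still be found after its key was mutated.
    static boolean foundAfterMutation() {
        Map<Point, String> map = new HashMap<>();
        Point key = new Point(1, 2);
        map.put(key, "value");
        key.x = 100; // the hash code changes while the key sits in the map
        // Neither the mutated key nor a copy of the original state finds the entry.
        return map.containsKey(key) || map.containsKey(new Point(1, 2));
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation()); // false: the entry is stranded
    }
}
```

The entry is still counted in the map's size, so it also leaks memory until the map itself is discarded.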

Answer by Qwerky

Use the right data structure for the right job. If you don't need access by a key, don't use a Map.

In terms of HashMap limitations, I guess it can suffer if items have a bad hashing algorithm, but that's about it.
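
As a rough illustration of what "a bad hashing algorithm" costs, the Java sketch below counts equals() calls during a single lookup. The Key class, the constantHash flag, and the BadHashDemo wrapper are hypothetical, introduced only for this example. With a constant hash code every key lands in the same bucket, so the lookup has to compare against many stored keys instead of one. Recent Java versions convert very large buckets into trees, which softens the problem but does not remove it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; the names are illustrative only.
class BadHashDemo {
    static int equalsCalls = 0;

    // When constantHash is true, every key reports the same hash code.
    static final class Key {
        final int id;
        final boolean constantHash;
        Key(int id, boolean constantHash) { this.id = id; this.constantHash = constantHash; }

        @Override public boolean equals(Object o) {
            equalsCalls++; // count how much comparison work the map performs
            return o instanceof Key && ((Key) o).id == id;
        }

        @Override public int hashCode() { return constantHash ? 0 : id; }
    }

    // Inserts n keys, then counts the equals() calls needed to look up the last one.
    static int equalsCallsForLookup(int n, boolean constantHash) {
        Map<Key, Integer> map = new HashMap<>();
        for (int i = 0; i < n; i++) {
            map.put(new Key(i, constantHash), i);
        }
        equalsCalls = 0;
        map.get(new Key(n - 1, constantHash)); // fresh but equal instance
        return equalsCalls;
    }

    public static void main(String[] args) {
        System.out.println("well-spread hash: " + equalsCallsForLookup(8, false));
        System.out.println("constant hash:    " + equalsCallsForLookup(8, true));
    }
}
```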

Answer by CloudyMarble

Chained hash tables also inherit the disadvantages of linked lists. When storing small keys and values, the space overhead of the next pointer in each entry record can be significant. An additional disadvantage is that traversing a linked list has poor cache performance, making the processor cache ineffective.

from Wikipedia - Hash Tables

Answer by porges

One (very important) limitation is that you shouldn't use them with types that have unstable (mutable) hashcodes. Here's Eric Lippert on the subject.

Answer by Mulki

Hash map usage is situational.

If your hash key is not chosen well, your hash map will run at a speed equivalent to that of a list, with the added problem of hogging a lot of memory.

In general, hashmaps are a bad choice when you are going to perform iterative tasks on your data.

Answer by Asgeir

Two things I can think of. One is that you can't guarantee ordering (stable or otherwise) when iterating through a hashmap. The other is that they have the possibility of thrashing your cache when you iterate over them.

Answer by Basil Bourque

Map may be persistent

"The only reason I could think of is limited main memory, but then that wouldn't be limited only to hashmaps, but also ArrayLists etc etc."

A map need not be limited to memory.

Some databases provide a persistent key-value store, such as hstore in Postgres or MVStore in H2 Database Engine. That second one uses the same Map interface defined in Java as do the in-memory implementations.

A key-value map may also be distributed across a network of computers, persisting parts of the map. There are several such products available.

Considerations such as concurrency, nulls, and iteration order

Characteristics vary among different implementations of a key-value store, commonly called a map or dictionary. You mentioned HashMap, but that is only one way to do a map. There are skip list maps, and there are maps that track objects by reference (pointer) rather than by the content of the key as a typical hashmap does. In Java, an EnumMap is highly optimized for the case of the keys being based on an Enum subclass, with entries stored internally in an array indexed by the position of each constant defined in the enum, yielding very fast execution and taking very little memory. Some implementations may be more highly concurrent than others, depending on the amount of data, such as ConcurrentSkipListMap in Java.
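
Two of the variations mentioned above can be sketched in Java using standard-library classes: IdentityHashMap, which compares keys by reference, and EnumMap, which iterates in the enum's declaration order. The Level enum and the MapVarietyDemo class are hypothetical names used only for this illustration.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical demo class; the names are illustrative only.
class MapVarietyDemo {
    enum Level { LOW, MEDIUM, HIGH } // hypothetical enum

    // Puts two equal-but-distinct String keys into the map and returns its size.
    static int sizeWithEqualKeys(Map<String, Integer> map) {
        map.put(new String("k"), 1);
        map.put(new String("k"), 2);
        return map.size();
    }

    // Inserts HIGH before LOW and returns the key order the EnumMap reports.
    static String enumOrder() {
        Map<Level, String> m = new EnumMap<>(Level.class);
        m.put(Level.HIGH, "h");
        m.put(Level.LOW, "l");
        return m.keySet().toString();
    }

    public static void main(String[] args) {
        // HashMap compares keys by equals(): the second put overwrites the first.
        System.out.println(sizeWithEqualKeys(new HashMap<String, Integer>()));         // 1
        // IdentityHashMap compares keys by reference: two distinct objects, two entries.
        System.out.println(sizeWithEqualKeys(new IdentityHashMap<String, Integer>())); // 2
        // EnumMap follows the enum's declaration order, not insertion order.
        System.out.println(enumOrder());                                               // [LOW, HIGH]
    }
}
```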

Some maps may accept or forbid nulls in the key and/or the value. This may assist or violate the needs of your business rules.
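
The null-handling differences can be checked with a small Java sketch. The helper name acceptsNullKey and the class NullKeyDemo are made up for illustration; the maps themselves are standard-library classes. HashMap permits one null key, while a TreeMap with natural ordering and ConcurrentHashMap reject it with a NullPointerException.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical demo class; the names are illustrative only.
class NullKeyDemo {
    // Returns true if the map accepts a null key, false if it throws NullPointerException.
    static boolean acceptsNullKey(Map<String, String> map) {
        try {
            map.put(null, "v");
            return true;
        } catch (NullPointerException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(acceptsNullKey(new HashMap<String, String>()));           // true
        System.out.println(acceptsNullKey(new TreeMap<String, String>()));           // false
        System.out.println(acceptsNullKey(new ConcurrentHashMap<String, String>())); // false
    }
}
```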

In some cases you may want to maintain a sort order or an original insertion order among your keys.

Here is a list I made of the 10 Map implementations provided with Java 11. You can compare the various aspects as pros and cons depending on your needs.

[Image: table of map implementations in Java 11, comparing their features]

Answer by AlexR

They mean that the order of elements is not preserved in a HashMap. The next question is "how do you solve this problem?" The answer is: use LinkedHashMap to get elements back in the same order they were inserted, or TreeMap with an appropriate comparator to control the order by any criteria you want.
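
A minimal Java sketch of that suggestion follows. The class OrderingDemo, the helper keysOf, and the sample keys are invented for illustration. It fills each map with the same three entries and prints the resulting key order: insertion order for LinkedHashMap, natural (alphabetical) order for a default TreeMap, and string-length order for a TreeMap given a custom comparator.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; the names are illustrative only.
class OrderingDemo {
    // Inserts the same three keys into the given map and returns its key order.
    static String keysOf(Map<String, Integer> map) {
        map.put("pear", 1);
        map.put("apple", 2);
        map.put("fig", 3);
        return map.keySet().toString();
    }

    public static void main(String[] args) {
        // Insertion order is preserved.
        System.out.println(keysOf(new LinkedHashMap<String, Integer>())); // [pear, apple, fig]
        // Natural ordering of String keys.
        System.out.println(keysOf(new TreeMap<String, Integer>()));       // [apple, fig, pear]
        // Custom comparator: shortest key first.
        Comparator<String> byLength = Comparator.comparingInt(String::length);
        System.out.println(keysOf(new TreeMap<String, Integer>(byLength))); // [fig, pear, apple]
    }
}
```

One caveat of the comparator variant: a TreeMap treats keys that compare as equal as the same key, so two keys of the same length would overwrite each other under byLength.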

Answer by Steve

The typical alternative to hash tables is a binary tree. While hash tables are typically faster, their contents are not in any meaningful order; with binary trees the contents are sorted.