Iteration order of Java HashSet

Note: this page is derived from a popular StackOverflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/2704597/

Date: 2020-08-13 11:02:50  Source: igfitidea

Iteration order of HashSet

Tags: java, algorithm, collections, hashset

Asked by eljenso

If every object added to a java.util.HashSet implements Object.equals() and Object.hashCode() in a deterministic fashion, is the iteration order over the HashSet guaranteed to be identical for every identical set of elements added, irrespective of the order in which they were added?

Bonus question: what if the insertion order is identical as well?

(Assuming Sun JDK6 with same HashSet initialization.)

Edit: My original question was not clear. It is not about the general contract of HashSet, but about what Sun's implementation of HashSet in JDK6 guarantees with respect to determinism. Is it inherently non-deterministic? What influences the order used by its Iterator?

Accepted answer by Michael Borgwardt

Absolutely not.

The insertion order directly influences the iteration order whenever you have a bucket collision:

When two elements end up in the same bucket, the first one that was inserted will also be the first one returned during iteration, at least if the implementation of collision handling and iteration is straightforward (and the one in Sun's java.util.HashMap is).
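
The effect of a bucket collision can be sketched with two strings that happen to share a hash code. This is only an illustration, not part of the original answer: the class name CollisionOrderDemo and the helper iterationOrder are hypothetical, and the claim that the two orders differ assumes a chained-bucket implementation such as the Sun/OpenJDK one. Which element comes first is deliberately not asserted, since that depends on the implementation.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class CollisionOrderDemo {
    // Hypothetical helper: fills a HashSet in the given insertion order
    // and returns the order in which the set iterates over its elements.
    public static List<String> iterationOrder(String... insertionOrder) {
        Set<String> set = new HashSet<String>();
        for (String s : insertionOrder) {
            set.add(s);
        }
        return new ArrayList<String>(set);
    }

    public static void main(String[] args) {
        // "Aa" and "BB" have the same String.hashCode(), so they always
        // land in the same bucket, whatever the table size is.
        System.out.println("Aa".hashCode() == "BB".hashCode());

        List<String> first = iterationOrder("Aa", "BB");
        List<String> second = iterationOrder("BB", "Aa");

        // Same elements, different insertion order.
        System.out.println(first + " vs " + second);
        System.out.println("same iteration order? " + first.equals(second));
    }
}
```

Both sets are equal as sets; only the order seen by the iterator differs.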

Answer by Itay Maman

No, this is not guaranteed.

First, different JVMs may implement the HashSet algorithm differently (as long as they comply with the HashSet specification), so you may get different results on different JVMs.

Second, the algorithm may rely on non-deterministic factors when it builds the different buckets (part of the hash-table algorithm).

Answer by ewernli

As per the javadoc:

This class implements the Set interface, backed by a hash table (actually a HashMap instance). It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time. [...] The iterators returned by this class's iterator method are fail-fast: if the set is modified at any time after the iterator is created

And for the iterator method:

Returns an iterator over the elements in this set. The elements are returned in no particular order.

So I don't think you can make such an assumption.

Answer by Péter Török

There is no "official" guarantee for anything like this. I would say it is most probably true for instances of the same HashSet implementation, initialized the same way. But I have seen cases where the iteration order differed between Java 5 and 6, for example.

Also, it may be different for instances of the same HashSet implementation that are initialized with different sizes, due to rehashing. For example, if you have 100 elements and two sets, one initialized with a size greater than 100 and the other with a much smaller size, the second one will get reallocated and its elements rehashed several times while filling up. This may result in elements mapped to the same bucket being added (and thus iterated over) in a different order.

In Java 1.4 and later, you have LinkedHashSet, which guarantees that the iteration order will be the order in which its elements were inserted.

Answer by Pascal Cuoq

I am sure that the Java developers want you to assume the answer is "no". In particular, for hash tables, why would they make it slower for everyone else who doesn't need this property to guarantee that objects whose hashes clash (identical hashCode % size) are observed in the same order regardless of the order in which they were put in?
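
The "identical hashCode % size" condition can be sketched with a deliberately simplified model of bucket selection. Everything here is an assumption made for illustration: the class BucketIndexSketch and the method bucketOf are hypothetical names, and a real HashMap implementation may mix the hash bits before choosing a bucket.

```java
public class BucketIndexSketch {
    // Simplified model (an assumption, not the actual HashMap code):
    // the bucket is the hash code modulo the table size.
    // Math.floorMod keeps the result non-negative for negative hash codes.
    public static int bucketOf(Object o, int tableSize) {
        return Math.floorMod(o.hashCode(), tableSize);
    }

    public static void main(String[] args) {
        int tableSize = 8;
        // Integer.hashCode() is the int value itself, so with 8 buckets
        // 3 and 11 clash, while 4 ends up in a different bucket.
        System.out.println(bucketOf(3, tableSize));
        System.out.println(bucketOf(11, tableSize));
        System.out.println(bucketOf(4, tableSize));
    }
}
```

In this model, elements that clash share a bucket, so the order in which they are observed is left to however the implementation chains them.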

Answer by ryanprayogo

Such an assumption cannot be made. The javadoc says that:

This class implements the Set interface, backed by a hash table (actually a HashMap instance). It makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time.

The closest you can get is to use a LinkedHashSet, which maintains the insertion order.

Answer by ahe

Never ever make assumptions about the iteration order of anything you put into a HashSet, because its contract explicitly says that you can't count on it in any way. Use LinkedHashSet if you want to maintain insertion order, or TreeSet if you want to maintain a natural sorting order.
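
As a minimal sketch of the two alternatives named above (the class OrderedSetsDemo and its helper methods are hypothetical names chosen for this example):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class OrderedSetsDemo {
    // LinkedHashSet iterates in insertion order; re-adding an element
    // that is already present does not change its position.
    public static List<Integer> insertionOrdered(List<Integer> values) {
        Set<Integer> set = new LinkedHashSet<Integer>(values);
        return new ArrayList<Integer>(set);
    }

    // TreeSet iterates in the natural (ascending) order of its elements.
    public static List<Integer> sorted(List<Integer> values) {
        Set<Integer> set = new TreeSet<Integer>(values);
        return new ArrayList<Integer>(set);
    }

    public static void main(String[] args) {
        List<Integer> values = Arrays.asList(30, 10, 20, 10);
        System.out.println(insertionOrdered(values)); // [30, 10, 20]
        System.out.println(sorted(values));           // [10, 20, 30]
    }
}
```

Unlike with HashSet, both orders are part of the documented contract of these classes, so code may rely on them.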

Answer by Noah

Wanted to confirm / upvote earlier comments. In short, do not rely on HashSet iterating in a consistent order. This can and will introduce bugs in your system.

We just found and fixed a bug where the iteration order was inconsistent in HashSet even with:

  • Identical insertion order.
  • Objects of a class with a valid equals() and hashCode() method.

And fixed it by using LinkedHashSet.

Thanks to the earlier posters :)

Answer by Peter Lawrey

The order in which objects appear will depend on the final number of buckets of the HashSet. By changing the load factor and/or initial capacity, you can change the order the elements end up in.

In the following example, you can see that each of these configurations results in a different order.

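import java.util.HashSet;
import java.util.Set;

public class HashSetOrderDemo {
    public static void main(String... args) {
        // Same elements every time; only initial capacity and load factor vary.
        printOrdersFor(8, 2);
        printOrdersFor(8, 1);
        printOrdersFor(8, 0.5f);
        printOrdersFor(32, 1f);
        printOrdersFor(64, 1f);
        printOrdersFor(128, 1f);
    }

    // Builds a HashSet with the given initial capacity and load factor,
    // adds 0, 10, ..., 100 in ascending order, and prints the iteration order.
    public static void printOrdersFor(int size, float loadFactor) {
        Set<Integer> set = new HashSet<Integer>(size, loadFactor);
        for (int i = 0; i <= 100; i += 10) set.add(i);
        System.out.println("new HashSet<Integer>(" + size + ", " + loadFactor + ") adding 0,10, ... 100 => " + set);
    }
}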

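This prints the following (output from the answerer's run; the exact orders can differ on other Java versions):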

new HashSet<Integer>(8, 2.0) adding 0,10, ... 100 => [0, 50, 100, 70, 40, 10, 80, 20, 90, 60, 30]
new HashSet<Integer>(8, 1.0) adding 0,10, ... 100 => [0, 50, 100, 70, 20, 80, 10, 40, 90, 30, 60]
new HashSet<Integer>(8, 0.5) adding 0,10, ... 100 => [0, 100, 70, 40, 10, 50, 20, 80, 90, 30, 60]
new HashSet<Integer>(32, 1.0) adding 0,10, ... 100 => [0, 100, 70, 40, 10, 50, 80, 20, 90, 60, 30]
new HashSet<Integer>(64, 1.0) adding 0,10, ... 100 => [0, 70, 10, 80, 20, 90, 30, 100, 40, 50, 60]
new HashSet<Integer>(128, 1.0) adding 0,10, ... 100 => [0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100]