Java 默认使用什么散列函数,我们可以覆盖默认行为吗?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/11991589/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-10-31 07:11:39  来源:igfitidea点击:

What hashing function does Java use by default and can we override the default behavior?

javahash

提问by Geek

I am going through the Introduction of Algorithms by Cormen et alvideo and it discusses several hashing functions . I want to know what hashing function does Java use by default?Does the hashing function actually differ for different types of objects that are used as keys? Is there an api in the Collections framework which let us write our own hashing algorithm ?

我正在阅读Cormen 等人的视频介绍算法,它讨论了几个散列函数。我想知道 Java 默认使用什么散列函数?对于用作键的不同类型的对象,散列函数实际上是否不同?Collections 框架中是否有 api 可以让我们编写自己的散列算法?

回答by assylias

Each object in java has a public int hashCode()method that returns a hash. Each object is free to implement it in its own way by overriding that method. If the method is not overriden, the default Object#hashCodemethodis used.

java中的每个对象都有一个public int hashCode()返回散列的方法。每个对象都可以通过覆盖该方法以自己的方式自由地实现它。如果该方法未被覆盖,则使用默认Object#hashCode方法

You can have look at the source code of various objects to see how it is implemented in the JDK. This is String's hashCodefor example (line 1494).

您可以查看各种对象的源代码,了解它是如何在 JDK 中实现的。例如,这是String 的 hashCode(第 1494 行)。

Some collections can add an additional layer of hashing on top of the objects' hashCode methods. For example, HashMap does that to improve performance when an object's hashCode is not well distributed.

一些集合可以在对象的 hashCode 方法之上添加额外的散列层。例如,当对象的 hashCode 分布不佳时,HashMap 会这样做以提高性能。

回答by Shark

You can always override it in any of your classes... Like

你总是可以在你的任何类中覆盖它......就像

 @override
 public int hashCode()
 { 
 //new implementation 
 }

http://mindprod.com/jgloss/hashcode.html

http://mindprod.com/jgloss/hashcode.html

The default hashCode() method uses the 32-bit internal JVM (Java Virtual Machine) address of the Object as its hashCode.

默认的 hashCode() 方法使用对象的 32 位内部 JVM(Java 虚拟机)地址作为其 hashCode。

However, if the Object is moved in memory during garbage collection, the hashCode stays constant. This default hashCode is not very useful, since to look up an Object in a HashMap, you need the exact same key Object by which the key/value pair was originally filed.

但是,如果对象在垃圾回收期间在内存中移动,则 hashCode 保持不变。这个默认的 hashCode 不是很有用,因为要在 HashMap 中查找对象,您需要与最初提交键/值对的键对象完全相同的键对象。

Normally, when you go to look up, you don't have the original key Object itself, just some data for a key. So, unless your key is a String, nearly always you will need to implement a hashCode and equals method on your key class.

通常,当您去查找时,您没有原始键 Object 本身,只有键的一些数据。因此,除非您的键是字符串,否则几乎总是需要在键类上实现 hashCode 和 equals 方法。

Object.hashCode() is a native method.

Object.hashCode() 是本机方法。

回答by matsev

It depends on the kind of object that you use. For any object that you implement in your own classes, you can always override the default hashCode()method.

这取决于您使用的对象类型。对于您在自己的类中实现的任何对象,您始终可以覆盖默认的hashCode()方法。

Note, you should always obey the contract between hashCode()and equals()as mentioned in the hashCode() javadoc:

请注意,您应该始终遵守hashCode() javadoc 中提到的hashCode()和之间的合同equals()

If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.

如果根据 equals(Object) 方法两个对象相等,则对两个对象中的每一个调用 hashCode 方法必须产生相同的整数结果。

For more information read this entry.

有关更多信息,请阅读此条目

回答by Haozhun

Every type in Java has hashCode()method defined, as it's in the Object. hashCode()returns a int. And in HashMapimplementation, it hashes the result again and take only the lower bits to make it in the range of 0 to size-1. Note in Sun JDK, size is always 2x, x being some integer.

Java 中的每种类型都hashCode()定义了方法,就像在Object. hashCode()返回一个int. 在HashMap实现中,它再次对结果进行散列,只取低位使其在 0 到size-1的范围内。请注意,在 Sun JDK 中,大小始终为 2 x, x 是某个整数。

Java library is open source and you probably have a copy on your dev machine.

Java 库是开源的,您的开发机器上可能有一个副本。

In Sun JDK 6, the second hash I mentioned above is

在 Sun JDK 6 中,我上面提到的第二个哈希是

/**
 * Applies a supplemental hash function to a given hashCode, which
 * defends against poor quality hash functions.  This is critical
 * because HashMap uses power-of-two length hash tables, that
 * otherwise encounter collisions for hashCodes that do not differ
 * in lower bits. Note: Null keys always map to hash 0, thus index 0.
 */
static int hash(int h) {
    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

You can find the first hash by looking at the hashCode()function on the class you are interested in.

您可以通过查看hashCode()您感兴趣的类上的函数来找到第一个哈希值。

回答by Jochen

All classes in Java inherit from java.lang.Object, and by doing so, they inherit the method hashCode()that returns an int. The default method returns some (more or less) unique value created by the VM (think of it as the memory address of the object, even though that's not entirely correct). When you implement your own classes you can override that method to do whatever you want. You should however pay attention that your hashCodeand equalsmethods are consistent, and you should be aware that in general hash codes are not unique, so whatever you do, expect collisions between hash codes of different objects.

Java 中的所有类都继承自java.lang.Object,并且通过这样做,它们继承了hashCode()返回int. 默认方法返回一些(或多或少)由 VM 创建的唯一值(将其视为对象的内存地址,即使这并不完全正确)。当您实现自己的类时,您可以覆盖该方法以执行您想要的任何操作。但是,您应该注意您的hashCodeequals方法是一致的,并且您应该意识到通常哈希码不是唯一的,因此无论您做什么,都期望不同对象的哈希码之间发生冲突。

The Collections framework usually uses the hashhCode()method to hash things for Hashtables etc. It is conceivable that other datastructures in other libraries use explicit hash functions, but that does not happen in the Collections framework.

Collections 框架通常使用hashhCode()Hashtables 等的方法对事物进行散列。可以想象,其他库中的其他数据结构使用显式散列函数,但在 Collections 框架中不会发生这种情况。