Java 字符串作为哈希图中的键

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/18029983/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-11 21:53:46  来源:igfitidea点击:

String as key in hashmap

java

提问by rickygrimes

I have read a lot of posts in the past one hour, but I am still not very clear with the concept of using immutable objects as keys in a Hashmap. I have a hashmap that has its key as a String. The value in the hashmap is MyStore, where MyStore represents information regarding stores I own. The String represents address. In my code, the logic I have is, I first look in the map for that key, if present --> get its value, if its not present put it in hashmap. My manager just told me the key will change in the future, that is address of my stores will change in the future. He said in that case, my logic of first checking if the key exists won't work. I don't understand what he means here. I want to understand the below points very clearly -

在过去的一个小时里我读了很多帖子,但我对使用不可变对象作为 Hashmap 中的键的概念仍然不是很清楚。我有一个哈希图,它的键是一个字符串。哈希图中的值是 MyStore,其中 MyStore 表示有关我拥有的商店的信息。字符串代表地址。在我的代码中,我的逻辑是,我首先在映射中查找该键(如果存在)-> 获取其值,如果不存在则将其放入哈希映射中。我的经理刚刚告诉我密钥将来会改变,即我的商店地址将来会改变。他说在那种情况下,我首先检查密钥是否存在的逻辑将不起作用。我不明白他在这里是什么意思。我想非常清楚地了解以下几点 -

  1. Difference between mutable and immutable keys for a hashmap.
  2. What happens if you use a immutable key that can change? - I know this doesn't make sense, but I want to clearly understand what my manager is talking about here.
  3. Some posts talk about Strings if used as keys in a hashmap cache their hashcode -What does this mean?
  4. If lets say I used mutable objects as keys in my hashmap that implemented hashcode and equals, then will it work? I am assuming it will because if the key changes, the contains method will look if the key is present. If it is not present, it will putthe entry so you can getit in the future.
  1. 哈希映射的可变键和不可变键之间的区别。
  2. 如果您使用可以更改的不可变键会发生什么?- 我知道这没有意义,但我想清楚地了解我的经理在这里谈论的是什么。
  3. 一些帖子讨论了字符串如果用作哈希映射缓存中的哈希码的键 - 这是什么意思?
  4. 如果可以说我在我的哈希图中使用可变对象作为键来实现哈希码和等于,那么它会起作用吗?我假设会这样,因为如果密钥更改,则 contains 方法将查看密钥是否存在。如果它不存在,它将放置该条目,以便您将来可以获取它。

I don't mean to create a duplicate post if this has been discussed before. If I missed reading the post that has answers to all my questions, please point me to it. If not, please explain in layman terms the above questions I have so it is useful in the future for other readers :). Feel free to edit my post's subject so in future if anyone has a similar question, they land here directly :)

如果之前已经讨论过,我并不是要创建重复的帖子。如果我错过了回答我所有问题的帖子,请指点我。如果没有,请用通俗的语言解释我的上述问题,以便将来对其他读者有用:)。随意编辑我帖子的主题,所以以后如果有人有类似的问题,他们会直接在这里:)

采纳答案by bowmore

First : how does a HashMap work?

第一:HashMap 是如何工作的?

Basically it has an array and when you put a key-value pair in the map, it is stored in one of the positions in the array. The position in the array is chosen based on the result of the key's hashCode()passed to a hashing method. Why is that? Well, if you request the value for a certain key, the index in the array to find the key and its associated value can simply be recalculated to find the index in the array again. (Some more logic is needed to deal with keys that map to the same index, but I'm just trying to get you to understand the basic mechanism) Then equals()is used to verify if the key at the calculated index is indeed the requested key.

基本上它有一个数组,当你在地图中放置一个键值对时,它被存储在数组中的一个位置。根据hashCode()传递给散列方法的键的结果选择数组中的位置。这是为什么?好吧,如果您请求某个键的值,则可以简单地重新计算数组中查找键及其关联值的索引以再次查找数组中的索引。(需要更多的逻辑来处理映射到同一索引的键,但我只是想让您了解基本机制) Thenequals()用于验证计算索引处的键是否确实是请求的键。

  1. From this it should be a bit more clear why immutable keys are better than mutable keys. An immutable key will always keep the same hashCode()value, and the hashing function will find the correct bucket ( = index in the hashMap's array) again.

    That doesn't mean mutable keys cannot work. A mutable key will work if mutations on the key do not affect the hash code or if the keys are simply not mutated as long as the hashMap is used.

  2. How can an immutable key change? Well, the key itself may not be able to change, but the key-value mapping can change in business logic. If you create a map, using the address as a key, you rely on the fact that a store's address will not change. If a store's address changes, you'll not find it in the Map using its new address as a key. Your manager has a valid point.

  3. the speed of finding a key in a Map highly depends on the speed of calculating the hashCode. For a String this calculation loops over all the characters in the String. If you use long Strings as keys and have a lot of Map access this may lead to a performance bottle neck. Java's String implementation therefore caches the hash value, so it will be calculated only once. However you'll only avoid calculating the hash code if you use the same Stringinstance again (new instances will not have the cached value). You may intern()the keys you use, but consider this only if it can be shown that there really is a performance bottle neck, as Stringinterning does come with its own overhead.

  4. as explained in 1 : mutable keys can work if their hash code is not affected by mutations. e.g. using a Customer as key, where the hashCode()is based only on the customer's name, then a Customer implementation that only does not allow the name to change, yet allows other values to change, is a reliable key.

  1. 由此应该更清楚为什么不可变键比可变键更好。不可变键将始终保持相同的hashCode()值,并且哈希函数将再次找到正确的桶(= hashMap 数组中的索引)。

    这并不意味着可变键不能工作。如果键上的突变不影响哈希码,或者只要使用 hashMap 键就不会发生突变,则可变键将起作用。

  2. 不可变的密钥如何改变?好吧,键本身可能无法更改,但是键值映射可以在业务逻辑中更改。如果您使用地址作为键创建地图,则您依赖于商店地址不会更改的事实。如果商店的地址发生变化,您将无法使用其新地址作为关键字在地图中找到它。你的经理有一个有效的观点。

  3. 在 Map 中查找键的速度很大程度上取决于计算 hashCode 的速度。对于字符串,此计算循环遍历字符串中的所有字符。如果您使用长字符串作为键并且有很多 Map 访问权限,这可能会导致性能瓶颈。因此,Java 的 String 实现缓存了哈希值,因此它只会被计算一次。但是,如果您String再次使用相同的实例(新实例将不具有缓存值),则只会避免计算哈希码。您可以intern()使用您使用的键,但只有在可以证明确实存在性能瓶颈时才考虑这一点,因为String实习确实有其自身的开销。

  4. 如 1 中所述:如果它们的哈希码不受突变影响,可变键可以工作。例如,使用 Customer 作为键,其中hashCode()仅基于客户的名称,那么只允许名称更改但允许其他值更改的 Customer 实现是一个可靠的键。

回答by Vadim

In general, keys in hashmaps should be immutable .

一般来说,哈希图中的键应该是不可变的。

See this

看到这个

Note: great care must be exercised if mutable objects are used as map keys. The behavior of a map is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is a key in the map.

注意:如果将可变对象用作映射键,则必须非常小心。如果对象的值以影响等于比较的方式更改,而对象是映射中的键,则不会指定映射的行为。

The hash of your key is calculated once during insert, the hashmap stores the it and it will notget automatically updated once your key is modified. That's why there's an a assumption that keys will be immutable.

您的键的哈希在插入期间计算一次,哈希映射存储它,一旦您的键被修改,它就不会自动更新。这就是为什么假设键是不可变的。

Your options are: 1. Don't use mutable objects as keys. Try to find another key, or use an immutable part of your former key object 2. Don't change your mutable objects while they are used as keys

您的选择是: 1. 不要使用可变对象作为键。尝试找到另一个键,或使用您以前的键对象的不可变部分 2. 不要更改用作键的可变对象

回答by Tala

  1. There might be a problem if you modify you mutable object used as key. map.containsKey(modifiedKey)might return falseeven if key is there, You'll have to iterate over keys to find it. So try to use immutable or don't modify mutable while it is a key.

  2. Immutable object never changes. There are methods that look like they're changing the object but instead a new copy is created. Example:

    String a = "A";

    String b = a.substring(0); // substring created a copy of "A" with a not being modified at all.

    a = a + b; // a+b create a new String "AA" with no modification of previous ones.

  3. This might help caching-hashes-in-java-collectionsalso this is great why-are-immutable-objects-in-hashmaps-so-effective

  4. String has already implementation of equalsand hashcode, no need to invent another class to use instead of it unless you're absolutely sure you need it.

    As mention in point 1. you can do that but you'll have to be careful and not modify your mutable objects. That's not a very good practice though.

  1. 如果您修改用作键的可变对象,则可能会出现问题。即使键存在,map.containsKey(modifiedKey)也可能返回false,您必须遍历键才能找到它。所以尽量使用 immutable 或者不要在 mutable 是 key 的情况下修改 mutable。

  2. 不可变对象永远不会改变。有些方法看起来像是在更改对象,但会创建一个新副本。例子:

    字符串 a = "A";

    字符串 b = a.substring(0); // substring 创建了一个“A”的副本,根本没有被修改。

    a = a + b; // a+b 创建一个新的字符串“AA”,不修改之前的字符串。

  3. 这可能有助于缓存-hashes-in-java-collections这也是为什么-不可变对象-in-hashmaps-如此有效的原因

  4. String 已经实现了equalsand hashcode,除非您绝对确定需要它,否则无需发明另一个类来代替它。

    如第 1 点所述,您可以这样做,但您必须小心,不要修改您的可变对象。但这不是一个很好的做法。

回答by bsd

  1. Immutable keys cannot change. Consequently, the hashcode which is computed at the time of insertion cannot change. So when you try to getan element from the map, the hashcode of the object to get is computed against known hashcodes. Had your key changed from outside(it was mutable), the new key's hashcode would be different from the one which you inserted.

  2. Let's see an example. for(2and 4)

    public class RandomPair {
        int p;
        int q;
    
        public RandomPair(int p, int q) {
            this.p = p;
            this.q = q;
        }
        @Override
        public int hashCode() {
            return 31 * p + q;
        }
    
        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof RandomPair)) {
                return false;
            }
            if (obj == this) {
               return true;
            }
    
            RandomPair other = (RandomPair) obj;
            if (p != other.p)
                return false;
            if (q != other.q)
                return false;
            return true;
        }
    
        public static void main(String[] args) {
            RandomPair pair = new RandomPair(10, 10);
            Map<RandomPair, Integer> map = new HashMap<RandomPair, Integer>();
    
            map.put(pair, 1);
            System.out.println(map.get(pair)); //returns 1
    
            //someone somewhere just changed the value of pair
            pair.p = 20;
            //the object was the same, someone somewhere just changed value of pair and now you can't 
            //find it in the map
            System.out.println(map.get(pair));
    
            //had you made p and q final, this sort of modification wouldn't be possible
           //Strings are immutable and thus prevent this modification
        }
    }
    
  3. Since strings are immutable, the hashcode value once computed can be reused again. The hashcode is lazily computed. ie on the first call to hashcode and then the value of hashcode is cached.

  1. 不可变的键不能改变。因此,插入时计算的哈希码不能改变。因此,当您尝试从地图中获取元素时,要获取的对象的哈希码将根据已知的哈希码进行计算。如果您的密钥从外部更改(它是可变的),新密钥的哈希码将与您插入的不同。

  2. 让我们看一个例子。对于(24

    public class RandomPair {
        int p;
        int q;
    
        public RandomPair(int p, int q) {
            this.p = p;
            this.q = q;
        }
        @Override
        public int hashCode() {
            return 31 * p + q;
        }
    
        @Override
        public boolean equals(Object obj) {
            if (!(obj instanceof RandomPair)) {
                return false;
            }
            if (obj == this) {
               return true;
            }
    
            RandomPair other = (RandomPair) obj;
            if (p != other.p)
                return false;
            if (q != other.q)
                return false;
            return true;
        }
    
        public static void main(String[] args) {
            RandomPair pair = new RandomPair(10, 10);
            Map<RandomPair, Integer> map = new HashMap<RandomPair, Integer>();
    
            map.put(pair, 1);
            System.out.println(map.get(pair)); //returns 1
    
            //someone somewhere just changed the value of pair
            pair.p = 20;
            //the object was the same, someone somewhere just changed value of pair and now you can't 
            //find it in the map
            System.out.println(map.get(pair));
    
            //had you made p and q final, this sort of modification wouldn't be possible
           //Strings are immutable and thus prevent this modification
        }
    }
    
  3. 由于字符串是不可变的,因此可以再次重用计算过的哈希码值。哈希码是惰性计算的。即在第一次调用 hashcode 然后缓存 hashcode 的值。

回答by Lokesh

  1. A mutable key or object means that you can modify the object [by modify i mean you can change values represented by object]. This will impact its storage in HashMapif the logic written in equals and hashcode uses these modifiable value.

  2. Immutability ideally means that object once initialized cannot be changed after that. But if we talk specifically in terms of HashMapthen all the variables which are used inside equals and hashcode method, if they can be modified then that object should not be used as key else it can be used as key [but still not recommended].

  3. Its not just about String, any about would cache its hashcode. Hashcode is generated again and again for almost all objects [There is a reason why i say almost as in some cases it can change]. Hashcode is cached in Object header.

  4. If you want to use mutable object as key then you should go for IdentityHashMap. Just read about them, they can be useful in such cases.

  1. 可变键或对象意味着您可以修改对象 [通过修改我的意思是您可以更改对象表示的值]。HashMap如果写入的逻辑 equals 和 hashcode 使用这些可修改值,这将影响其存储。

  2. 理想情况下,不变性意味着对象一旦初始化就不能在此之后更改。但是,如果我们专门讨论HashMap在 equals 和 hashcode 方法中使用的所有变量,如果它们可以被修改,那么该对象不应该用作键,否则它可以用作键 [但仍然不推荐]。

  3. 它不仅仅是 about String,任何 about 都会缓存它的哈希码。几乎所有对象都会一次又一次地生成哈希码 [我说几乎在某些情况下它可以改变是有原因的]。哈希码缓存在对象头中。

  4. 如果您想使用可变对象作为键,那么您应该选择IdentityHashMap. 只需阅读它们,它们在这种情况下会很有用。