java 为什么 HashMap containsKey 比 Sun JDK 中的 get 慢?(sun-jdk-1.6.0.17)

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/7999546/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-10-30 22:37:51  来源:igfitidea点击:

Why is HashMap containsKey slower than get in Sun JDK? (sun-jdk-1.6.0.17)

javahashmapjdk1.6

提问by Stefan

Why is calling containsKeyon a HashMap slower then get?

为什么调用containsKeyHashMap 会变慢get呢?

Test: http://ideone.com/QsWXF(>15% difference, run on sun-jdk-1.6.0.17)

测试:http: //ideone.com/QsWXF(>15% 差异,在 sun-jdk-1.6.0.17 上运行)

回答by

Because it does [ever so slightly] more work, see the OpenJDK 7 source.

因为它做了 [稍微] 更多的工作,请参阅OpenJDK 7 源代码



Note that containsKeycalls getEntrywhile getdirectly "does the magic lookup". I do not know why it was done this way, and am further puzzled by the use/not use of getForNullKey:See John B's and Ted Hopps's comments as to why this is done.

请注意,containsKey调用getEntrywhileget直接“进行魔术查找”。我不知道为什么要这样做,并且对使用/不使用以下内容感到更加困惑getForNullKey请参阅 John B 和 Ted Hopps 关于为什么要这样做的评论。

gethas an early code split for a null-key (note that getwill return nullif the entry doesn't exist or existed with a null value stored):

get有一个空键的早期代码拆分(请注意,如果条目不存在或存在空值,get则将返回null):

315           if (key == null)
316               return getForNullKey();
...
322               if (e.hash == hash &&
                      ((k = e.key) == key || key.equals(k)))
323                   return e.value;

While getEntry, called from containsKey, does not split to getForNullKeyand there is additional work here to check for the null-key case (for each Entry scanned in the chain):

While getEntry,从 调用containsKey,不会拆分为getForNullKey,这里有额外的工作来检查空键情况(对于链中扫描的每个条目):

366               if (e.hash == hash &&
367                   ((k = e.key) == key || (key != null && key.equals(k))))
368                   return e;

Also, containsKeyhas the additional conditional and method call (note that getEntrywill return an Entry object, if said key exists, even if the stored value is null):

此外,containsKey还有额外的条件和方法调用(请注意getEntry,如果所述键存在,则将返回一个 Entry 对象,即使存储的值为null):

352           return getEntry(key) != null;

I suppose it could be argued that containsKeywould benefit - in terms of "performance" - from having a specialized form (at the expense of less-DRY code), or that getEntrycould follow the lead of getwith an early null-key check .. on the other-hand, it might be argued that getshould be written in terms of getEntry;-)

我想可以争论说,containsKey在“性能”方面会受益于具有专门的形式(以减少 DRY 代码为代价),或者getEntry可以效仿get早期的空键检查 ..另一方面,可能有人认为get应该用getEntry;-)

Happy coding.

快乐编码。

回答by Ted Hopp

I haven't tried to reproduce your results yet, but my first guess is that getsimply returns the value (if any) that was found, while containsKey(which is what you tested, not contains) needs to test whether the key exists (with either a null or non-null value) and then return a boolean. Just a little more overhead involved.

我还没有尝试重现您的结果,但我的第一个猜测是get只返回找到的值(如果有),而containsKey(这是您测试的,不是contains)需要测试密钥是否存在(使用null 或非 null 值),然后返回一个布尔值。只是涉及更多的开销。

回答by user541686

Let's see the source code:

让我们看看源代码:

public V get(Object key) {
    if (key == null)
        return getForNullKey();
    int hash = hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
            return e.value;
    }
    return null;
}


public boolean containsKey(Object key) {
    return getEntry(key) != null;
}

final Entry<K,V> getEntry(Object key) {
    int hash = (key == null) ? 0 : hash(key.hashCode());
    for (Entry<K,V> e = table[indexFor(hash, table.length)];
         e != null;
         e = e.next) {
        Object k;
        if (e.hash == hash &&
            ((k = e.key) == key || (key != null && key.equals(k))))
            return e;
    }
    return null;
}

Maybe it's because of the extra method call or because of the repeated check for key != null?

也许是因为额外的方法调用或因为重复检查key != null?

回答by Peter Lawrey

Testing different sizes of hashmaps, if there is a bias in performance, its very small.

测试不同大小的hashmap,如果性能有偏差,则很小。

Running on Java 7 update 1 with a 4.6 GHz i7 2600K.

在带有 4.6 GHz i7 2600K 的 Java 7 update 1 上运行。

public class HashMapPerfMain {
    public static void main(String... args) {
        Integer[] keys = generateKeys(2 * 1000 * 1000);

        Map<Integer, Boolean> map = new HashMap<Integer, Boolean>();
        for (int j = 0; j < keys.length; j += 2)
            map.put(keys[j], true);

        for (int t = 0; t < 5; t++) {
            long start = System.nanoTime();
            int count = countContainsKey(map, keys);
            long time = System.nanoTime() - start;
            assert count == keys.length / 2;

            long start2 = System.nanoTime();
            int count2 = countGetIsNull(map, keys);
            long time2 = System.nanoTime() - start2;
            assert count2 == keys.length / 2;
            System.out.printf("Map.containsKey avg %.1f ns, ", (double) time / keys.length);
            System.out.printf("Map.get() == null avg %.1f ns, ", (double) time2 / keys.length);
            System.out.printf("Ratio was %.2f%n", (double) time2/ time);
        }
    }

    private static int countContainsKey(Map<Integer, Boolean> map, Integer[] keys) {
        int count = 0;
        for (Integer i : keys) {
            if (map.containsKey(i)) count++;
        }
        return count;
    }

    private static int countGetIsNull(Map<Integer, Boolean> map, Integer[] keys) {
        int count = 0;
        for (Integer i : keys) {
            if (map.get(i) == null) count++;
        }
        return count;
    }

    private static Integer[] generateKeys(int size) {
        Integer[] keys = new Integer[size];
        Random random = new Random();
        for (int i = 0; i < keys.length; i++)
            keys[i] = random.nextInt();
        return keys;
    }
}

prints for half million keys

打印 50 万把钥匙

Map.containsKey avg 27.1 ns, Map.get() == null avg 26.4 ns, Ratio was 0.97
Map.containsKey avg 19.6 ns, Map.get() == null avg 19.6 ns, Ratio was 1.00
Map.containsKey avg 18.3 ns, Map.get() == null avg 19.0 ns, Ratio was 1.04
Map.containsKey avg 18.2 ns, Map.get() == null avg 19.1 ns, Ratio was 1.05
Map.containsKey avg 18.3 ns, Map.get() == null avg 19.0 ns, Ratio was 1.04

prints for one million keys

打印一百万把钥匙

Map.containsKey avg 30.9 ns, Map.get() == null avg 30.9 ns, Ratio was 1.00
Map.containsKey avg 26.0 ns, Map.get() == null avg 25.5 ns, Ratio was 0.98
Map.containsKey avg 25.0 ns, Map.get() == null avg 24.9 ns, Ratio was 1.00
Map.containsKey avg 25.0 ns, Map.get() == null avg 24.9 ns, Ratio was 1.00
Map.containsKey avg 24.8 ns, Map.get() == null avg 25.0 ns, Ratio was 1.01

however for two million keys

但是对于两百万把钥匙

Map.containsKey avg 36.5 ns, Map.get() == null avg 36.7 ns, Ratio was 1.00
Map.containsKey avg 34.3 ns, Map.get() == null avg 35.1 ns, Ratio was 1.02
Map.containsKey avg 36.7 ns, Map.get() == null avg 35.1 ns, Ratio was 0.96
Map.containsKey avg 36.3 ns, Map.get() == null avg 35.1 ns, Ratio was 0.97
Map.containsKey avg 36.7 ns, Map.get() == null avg 35.2 ns, Ratio was 0.96

for five million keys

五百万把钥匙

Map.containsKey avg 40.1 ns, Map.get() == null avg 40.9 ns, Ratio was 1.02
Map.containsKey avg 38.6 ns, Map.get() == null avg 40.4 ns, Ratio was 1.04
Map.containsKey avg 39.3 ns, Map.get() == null avg 38.3 ns, Ratio was 0.97
Map.containsKey avg 39.3 ns, Map.get() == null avg 38.3 ns, Ratio was 0.98
Map.containsKey avg 39.3 ns, Map.get() == null avg 38.8 ns, Ratio was 0.99

BTW: The time complexity for get() and containsKey is O(1) (on an idealized machine), but you can see that for a real machine, the cost increases with the size of the Map.

BTW:get() 和 containsKey 的时间复杂度为 O(1)(在理想化机器上),但您可以看到,对于真实机器,成本随着 Map 的大小而增加。