java 为什么 HashMap containsKey 比 Sun JDK 中的 get 慢?(sun-jdk-1.6.0.17)
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/7999546/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Why is HashMap containsKey slower than get in Sun JDK? (sun-jdk-1.6.0.17)
提问by Stefan
Why is calling containsKey
on a HashMap slower then get
?
为什么调用containsKey
HashMap 会变慢get
呢?
Test: http://ideone.com/QsWXF(>15% difference, run on sun-jdk-1.6.0.17)
测试:http: //ideone.com/QsWXF(>15% 差异,在 sun-jdk-1.6.0.17 上运行)
回答by
Because it does [ever so slightly] more work, see the OpenJDK 7 source.
因为它做了 [稍微] 更多的工作,请参阅OpenJDK 7 源代码。
Note that containsKey
calls getEntry
while get
directly "does the magic lookup". I do not know why it was done this way, and am further puzzled by the use/not use of See John B's and Ted Hopps's comments as to why this is done.getForNullKey
:
请注意,containsKey
调用getEntry
whileget
直接“进行魔术查找”。我不知道为什么要这样做,并且对使用/不使用以下内容感到更加困惑请参阅 John B 和 Ted Hopps 关于为什么要这样做的评论。getForNullKey
:
get
has an early code split for a null-key (note that get
will return null
if the entry doesn't exist or existed with a null value stored):
get
有一个空键的早期代码拆分(请注意,如果条目不存在或存在空值,get
则将返回null
):
315 if (key == null)
316 return getForNullKey();
...
322 if (e.hash == hash &&
((k = e.key) == key || key.equals(k)))
323 return e.value;
While getEntry
, called from containsKey
, does not split to getForNullKey
and there is additional work here to check for the null-key case (for each Entry scanned in the chain):
While getEntry
,从 调用containsKey
,不会拆分为getForNullKey
,这里有额外的工作来检查空键情况(对于链中扫描的每个条目):
366 if (e.hash == hash &&
367 ((k = e.key) == key || (key != null && key.equals(k))))
368 return e;
Also, containsKey
has the additional conditional and method call (note that getEntry
will return an Entry object, if said key exists, even if the stored value is null
):
此外,containsKey
还有额外的条件和方法调用(请注意getEntry
,如果所述键存在,则将返回一个 Entry 对象,即使存储的值为null
):
352 return getEntry(key) != null;
I suppose it could be argued that containsKey
would benefit - in terms of "performance" - from having a specialized form (at the expense of less-DRY code), or that getEntry
could follow the lead of get
with an early null-key check .. on the other-hand, it might be argued that get
should be written in terms of getEntry
;-)
我想可以争论说,containsKey
在“性能”方面会受益于具有专门的形式(以减少 DRY 代码为代价),或者getEntry
可以效仿get
早期的空键检查 ..另一方面,可能有人认为get
应该用getEntry
;-)
Happy coding.
快乐编码。
回答by Ted Hopp
I haven't tried to reproduce your results yet, but my first guess is that get
simply returns the value (if any) that was found, while containsKey
(which is what you tested, not contains
) needs to test whether the key exists (with either a null or non-null value) and then return a boolean. Just a little more overhead involved.
我还没有尝试重现您的结果,但我的第一个猜测是get
只返回找到的值(如果有),而containsKey
(这是您测试的,不是contains
)需要测试密钥是否存在(使用null 或非 null 值),然后返回一个布尔值。只是涉及更多的开销。
回答by user541686
Let's see the source code:
让我们看看源代码:
public V get(Object key) {
if (key == null)
return getForNullKey();
int hash = hash(key.hashCode());
for (Entry<K,V> e = table[indexFor(hash, table.length)];
e != null;
e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
}
return null;
}
public boolean containsKey(Object key) {
return getEntry(key) != null;
}
final Entry<K,V> getEntry(Object key) {
int hash = (key == null) ? 0 : hash(key.hashCode());
for (Entry<K,V> e = table[indexFor(hash, table.length)];
e != null;
e = e.next) {
Object k;
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
return e;
}
return null;
}
Maybe it's because of the extra method call or because of the repeated check for key != null
?
也许是因为额外的方法调用或因为重复检查key != null
?
回答by Peter Lawrey
Testing different sizes of hashmaps, if there is a bias in performance, its very small.
测试不同大小的hashmap,如果性能有偏差,则很小。
Running on Java 7 update 1 with a 4.6 GHz i7 2600K.
在带有 4.6 GHz i7 2600K 的 Java 7 update 1 上运行。
public class HashMapPerfMain {
public static void main(String... args) {
Integer[] keys = generateKeys(2 * 1000 * 1000);
Map<Integer, Boolean> map = new HashMap<Integer, Boolean>();
for (int j = 0; j < keys.length; j += 2)
map.put(keys[j], true);
for (int t = 0; t < 5; t++) {
long start = System.nanoTime();
int count = countContainsKey(map, keys);
long time = System.nanoTime() - start;
assert count == keys.length / 2;
long start2 = System.nanoTime();
int count2 = countGetIsNull(map, keys);
long time2 = System.nanoTime() - start2;
assert count2 == keys.length / 2;
System.out.printf("Map.containsKey avg %.1f ns, ", (double) time / keys.length);
System.out.printf("Map.get() == null avg %.1f ns, ", (double) time2 / keys.length);
System.out.printf("Ratio was %.2f%n", (double) time2/ time);
}
}
private static int countContainsKey(Map<Integer, Boolean> map, Integer[] keys) {
int count = 0;
for (Integer i : keys) {
if (map.containsKey(i)) count++;
}
return count;
}
private static int countGetIsNull(Map<Integer, Boolean> map, Integer[] keys) {
int count = 0;
for (Integer i : keys) {
if (map.get(i) == null) count++;
}
return count;
}
private static Integer[] generateKeys(int size) {
Integer[] keys = new Integer[size];
Random random = new Random();
for (int i = 0; i < keys.length; i++)
keys[i] = random.nextInt();
return keys;
}
}
prints for half million keys
打印 50 万把钥匙
Map.containsKey avg 27.1 ns, Map.get() == null avg 26.4 ns, Ratio was 0.97
Map.containsKey avg 19.6 ns, Map.get() == null avg 19.6 ns, Ratio was 1.00
Map.containsKey avg 18.3 ns, Map.get() == null avg 19.0 ns, Ratio was 1.04
Map.containsKey avg 18.2 ns, Map.get() == null avg 19.1 ns, Ratio was 1.05
Map.containsKey avg 18.3 ns, Map.get() == null avg 19.0 ns, Ratio was 1.04
prints for one million keys
打印一百万把钥匙
Map.containsKey avg 30.9 ns, Map.get() == null avg 30.9 ns, Ratio was 1.00
Map.containsKey avg 26.0 ns, Map.get() == null avg 25.5 ns, Ratio was 0.98
Map.containsKey avg 25.0 ns, Map.get() == null avg 24.9 ns, Ratio was 1.00
Map.containsKey avg 25.0 ns, Map.get() == null avg 24.9 ns, Ratio was 1.00
Map.containsKey avg 24.8 ns, Map.get() == null avg 25.0 ns, Ratio was 1.01
however for two million keys
但是对于两百万把钥匙
Map.containsKey avg 36.5 ns, Map.get() == null avg 36.7 ns, Ratio was 1.00
Map.containsKey avg 34.3 ns, Map.get() == null avg 35.1 ns, Ratio was 1.02
Map.containsKey avg 36.7 ns, Map.get() == null avg 35.1 ns, Ratio was 0.96
Map.containsKey avg 36.3 ns, Map.get() == null avg 35.1 ns, Ratio was 0.97
Map.containsKey avg 36.7 ns, Map.get() == null avg 35.2 ns, Ratio was 0.96
for five million keys
五百万把钥匙
Map.containsKey avg 40.1 ns, Map.get() == null avg 40.9 ns, Ratio was 1.02
Map.containsKey avg 38.6 ns, Map.get() == null avg 40.4 ns, Ratio was 1.04
Map.containsKey avg 39.3 ns, Map.get() == null avg 38.3 ns, Ratio was 0.97
Map.containsKey avg 39.3 ns, Map.get() == null avg 38.3 ns, Ratio was 0.98
Map.containsKey avg 39.3 ns, Map.get() == null avg 38.8 ns, Ratio was 0.99
BTW: The time complexity for get() and containsKey is O(1) (on an idealized machine), but you can see that for a real machine, the cost increases with the size of the Map.
BTW:get() 和 containsKey 的时间复杂度为 O(1)(在理想化机器上),但您可以看到,对于真实机器,成本随着 Map 的大小而增加。