How would you implement an LRU cache in Java?

Warning: this page mirrors a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share this content, but you must attribute it to the original authors (not me). Original question: http://stackoverflow.com/questions/221525/

Tags: java, caching, data-structures, lru

Asked by Hank Gay

Please don't say EHCache or OSCache, etc. Assume for purposes of this question that I want to implement my own using just the SDK (learning by doing). Given that the cache will be used in a multithreaded environment, which data structures would you use? I've already implemented one using LinkedHashMap and Collections#synchronizedMap, but I'm curious if any of the new concurrent collections would be better candidates.

UPDATE: I was just reading through Yegge's latest when I found this nugget:

If you need constant-time access and want to maintain the insertion order, you can't do better than a LinkedHashMap, a truly wonderful data structure. The only way it could possibly be more wonderful is if there were a concurrent version. But alas.

I was thinking almost exactly the same thing before I went with the LinkedHashMap + Collections#synchronizedMap implementation I mentioned above. Nice to know I hadn't just overlooked something.

Based on the answers so far, it sounds like my best bet for a highly concurrent LRU would be to extend ConcurrentHashMap using some of the same logic that LinkedHashMap uses.

Accepted answer by Hank Gay

If I were doing this again from scratch today, I'd use Guava's CacheBuilder.

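For context, here is a minimal sketch of what that looks like with Guava's CacheBuilder; the maximum size of 100 and the ten-minute expiry are arbitrary example values:

import java.util.concurrent.TimeUnit;

import com.google.common.cache.Cache;
import com.google.common.cache.CacheBuilder;

Cache<String, String> cache = CacheBuilder.newBuilder()
        .maximumSize(100)                        // size-based eviction, approximately LRU
        .expireAfterAccess(10, TimeUnit.MINUTES) // optional time-based expiry
        .build();

cache.put("key", "value");
String v = cache.getIfPresent("key"); // null if absent or already evicted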

Answered by Steve McLeod

I would consider using java.util.concurrent.PriorityBlockingQueue, with priority determined by a "numberOfUses" counter in each element. I would be very, very careful to get all my synchronisation correct, as the "numberOfUses" counter implies that the element can't be immutable.

The element object would be a wrapper for the objects in the cache:

class CacheElement {
    private final Object obj;
    private int numberOfUses = 0;

    CacheElement(Object obj) {
        this.obj = obj;
    }

    // ... accessors, a synchronized increment for numberOfUses, etc.
}

Answered by madlep

Have a look at ConcurrentSkipListMap. It should give you log(n) time for testing and removing an element if it is already contained in the cache, and constant time for re-adding it.

You'd just need some counter and a wrapper element to maintain the LRU ordering, and to ensure that the least-recently-used entries are discarded when the cache is full.

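To make this concrete, here is a minimal, hypothetical sketch of that idea: an atomic counter supplies access stamps, a ConcurrentSkipListMap keyed by stamp provides O(log n) eviction of the least-recently-used entry, and a ConcurrentHashMap holds the values. All names are made up, and the sketch is not fully race-free under concurrent mutation; it only illustrates the layout:

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

class SkipListLruCache<K, V> {
    private final int maxSize;
    private final AtomicLong clock = new AtomicLong();
    private final ConcurrentHashMap<K, V> values = new ConcurrentHashMap<K, V>();
    private final ConcurrentHashMap<K, Long> stamps = new ConcurrentHashMap<K, Long>();
    private final ConcurrentSkipListMap<Long, K> order = new ConcurrentSkipListMap<Long, K>();

    SkipListLruCache(int maxSize) {
        this.maxSize = maxSize;
    }

    public V get(K key) {
        V v = values.get(key);
        if (v != null) {
            touch(key); // re-stamp the key as most recently used
        }
        return v;
    }

    public void put(K key, V value) {
        values.put(key, value);
        touch(key);
        while (values.size() > maxSize) {
            Map.Entry<Long, K> eldest = order.pollFirstEntry(); // smallest stamp = least recently used
            if (eldest != null && stamps.remove(eldest.getValue(), eldest.getKey())) {
                values.remove(eldest.getValue()); // evict only if that stamp is still current
            }
        }
    }

    private void touch(K key) {
        Long stamp = clock.incrementAndGet();
        Long old = stamps.put(key, stamp);
        if (old != null) {
            order.remove(old); // drop the key's stale position in the ordering
        }
        order.put(stamp, key);
    }
}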

Answered by luke

Well, for a cache you will generally be looking up some piece of data via a proxy object (a URL, a String, ...), so interface-wise you are going to want a map. But to kick things out you want a queue-like structure. Internally I would maintain two data structures: a HashMap and a queue (here, a hand-rolled doubly linked list). Here's an implementation that should be able to do everything in O(1) time.

Here's a class I whipped up pretty quickly:

import java.util.HashMap;
import java.util.Map;
public class LRUCache<K, V>
{
    int maxSize;
    int currentSize = 0;

    Map<K, ValueHolder<K, V>> map;
    LinkedList<K> queue;

    public LRUCache(int maxSize)
    {
        this.maxSize = maxSize;
        map = new HashMap<K, ValueHolder<K, V>>();
        queue = new LinkedList<K>();
    }

    private void freeSpace()
    {
        K k = queue.remove();
        map.remove(k);
        currentSize--;
    }

    public void put(K key, V val)
    {
        if(map.containsKey(key))
        {//just heat up that item and store the new value
            map.get(key).value = val;
            get(key);
            return;
        }
        while(currentSize >= maxSize)
        {
            freeSpace();
        }
        ListNode<K> ln = queue.add(key);
        ValueHolder<K, V> rv = new ValueHolder<K, V>(val, ln);
        map.put(key, rv);
        currentSize++;
    }

    public V get(K key)
    {
        ValueHolder<K, V> rv = map.get(key);
        if(rv == null) return null;
        queue.remove(rv.queueLocation);
        rv.queueLocation = queue.add(key);//this ensures that each item has only one copy of the key in the queue
        return rv.value;
    }
}

class ListNode<K>
{
    ListNode<K> prev;
    ListNode<K> next;
    K value;
    public ListNode(K v)
    {
        value = v;
        prev = null;
        next = null;
    }
}

class ValueHolder<K,V>
{
    V value;
    ListNode<K> queueLocation;
    public ValueHolder(V value, ListNode<K> ql)
    {
        this.value = value;
        this.queueLocation = ql;
    }
}

class LinkedList<K>
{
    ListNode<K> head = null;
    ListNode<K> tail = null;

    public ListNode<K> add(K v)
    {
        if(head == null)
        {
            assert(tail == null);
            head = tail = new ListNode<K>(v);
        }
        else
        {
            // append at the tail
            tail.next = new ListNode<K>(v);
            tail.next.prev = tail;
            tail = tail.next;
        }
        return tail;
    }

    public K remove()
    {
        if(head == null)
            return null;
        K val = head.value;
        if(head.next == null)
        {
            head = null;
            tail = null;
        }
        else
        {
            head = head.next;
            head.prev = null;
        }
        return val;
    }

    public void remove(ListNode<K> ln)
    {
        ListNode<K> prev = ln.prev;
        ListNode<K> next = ln.next;
        if(prev == null)
        {
            head = next;
        }
        else
        {
            prev.next = next;
        }
        if(next == null)
        {
            tail = prev;
        }
        else
        {
            next.prev = prev;
        }       
    }
}

Here's how it works. Keys are stored in a linked list with the oldest keys at the front of the list (new keys go to the back), so when you need to 'eject' something you just pop it off the front of the queue and then use the key to remove the value from the map. When an item gets referenced, you grab the ValueHolder from the map and then use its queueLocation variable to remove the key from its current location in the queue, then put it at the back of the queue (it's now the most recently used). Adding things is pretty much the same.

I'm sure there are a ton of errors here, and I haven't implemented any synchronization. But this class will provide O(1) addition to the cache, O(1) removal of old items, and O(1) retrieval of cache items. Even trivial synchronization (just synchronizing every public method) would still have little lock contention because each operation runs for so little time. If anyone has any clever synchronization tricks, I would be very interested. Also, I'm sure there are some additional optimizations that you could implement using the maxSize variable with respect to the map.

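For illustration, here is a quick usage sketch of the class above, with arbitrary keys and values:

LRUCache<String, Integer> cache = new LRUCache<String, Integer>(2);
cache.put("a", 1);
cache.put("b", 2);
cache.get("a");    // "a" is now the most recently used
cache.put("c", 3); // evicts "b", the least recently used
cache.get("b");    // returns null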

Answered by luke

LinkedHashMap is O(1), but requires synchronization. No need to reinvent the wheel there.

Two options for increasing concurrency:

1. Create multiple LinkedHashMaps and hash into them. For example: LinkedHashMap[4], with indices 0, 1, 2, 3. On the key, do key % 4 (or a bitwise AND: key & 3) to pick which map to use for a put/get/remove. (A sketch of this sharding approach appears at the end of this answer.)

2. You could do an 'almost' LRU by extending ConcurrentHashMap, and having a linked-hash-map-like structure in each of the regions inside of it. Locking would occur more granularly than in a synchronized LinkedHashMap. On a put or putIfAbsent, only a lock on the head and tail of the list is needed (per region). On a remove or get, the whole region needs to be locked. I'm curious if atomic linked lists of some sort might help here -- probably so for the head of the list. Maybe for more.

The structure would not keep the total order, but only the order per region. As long as the number of entries is much larger than the number of regions, this is good enough for most caches. Each region would have to have its own entry count, and this count, rather than a global one, would be used for the eviction trigger. The default number of regions in a ConcurrentHashMap is 16, which is plenty for most servers today.

  1. Option 1 would be easier to write and faster under moderate concurrency.

  2. Option 2 would be more difficult to write, but would scale much better at very high concurrency. It would be slower for normal access (just as ConcurrentHashMap is slower than HashMap where there is no concurrency).

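Here is a minimal, hypothetical sketch of option 1 (the class and parameter names are made up): each shard is an access-ordered LinkedHashMap behind Collections.synchronizedMap, and the key's hash picks the shard. Note that LRU order is kept per shard, not globally:

import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

class ShardedLruCache<K, V> {
    private final Map<K, V>[] shards;

    @SuppressWarnings("unchecked")
    ShardedLruCache(int shardCount, final int maxEntriesPerShard) {
        shards = new Map[shardCount];
        for (int i = 0; i < shardCount; i++) {
            // each shard is an access-ordered LinkedHashMap behind its own lock
            shards[i] = Collections.synchronizedMap(
                    new LinkedHashMap<K, V>(maxEntriesPerShard + 1, 1.0f, true) {
                        @Override
                        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                            return size() > maxEntriesPerShard;
                        }
                    });
        }
    }

    private Map<K, V> shardFor(K key) {
        // force a non-negative hash, then reduce it to a shard index
        // (equivalent to key & 3 when shardCount is 4)
        return shards[(key.hashCode() & 0x7fffffff) % shards.length];
    }

    public V get(K key) { return shardFor(key).get(key); }
    public void put(K key, V value) { shardFor(key).put(key, value); }
    public V remove(K key) { return shardFor(key).remove(key); }
}

With four shards this is exactly the key % 4 scheme described above, except that it hashes the key object first so arbitrary key types work.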

Answered by Hank Gay

I like lots of these suggestions, but for now I think I'll stick with LinkedHashMap + Collections.synchronizedMap. If I do revisit this in the future, I'll probably work on extending ConcurrentHashMap in the same way LinkedHashMap extends HashMap.

UPDATE:

By request, here's the gist of my current implementation.

private class LruCache<A, B> extends LinkedHashMap<A, B> {
    private final int maxEntries;

    public LruCache(final int maxEntries) {
        super(maxEntries + 1, 1.0f, true);
        this.maxEntries = maxEntries;
    }

    /**
     * Returns <tt>true</tt> if this <code>LruCache</code> has more entries than the maximum specified when it was
     * created.
     *
     * <p>
     * This method <em>does not</em> modify the underlying <code>Map</code>; it relies on the implementation of
     * <code>LinkedHashMap</code> to do that, but that behavior is documented in the JavaDoc for
     * <code>LinkedHashMap</code>.
     * </p>
     *
     * @param eldest
     *            the <code>Entry</code> in question; this implementation doesn't care what it is, since the
     *            implementation is only dependent on the size of the cache
     * @return <tt>true</tt> if the eldest entry should be removed
     * @see java.util.LinkedHashMap#removeEldestEntry(Map.Entry)
     */
    @Override
    protected boolean removeEldestEntry(final Map.Entry<A, B> eldest) {
        return super.size() > maxEntries;
    }
}

Map<String, String> example = Collections.synchronizedMap(new LruCache<String, String>(CACHE_SIZE));

Answered by Ron

There are two open source implementations.

Apache Solr has ConcurrentLRUCache: https://lucene.apache.org/solr/3_6_1/org/apache/solr/util/ConcurrentLRUCache.html

There's an open source project for a ConcurrentLinkedHashMap: http://code.google.com/p/concurrentlinkedhashmap/

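For reference, here is a minimal usage sketch of the ConcurrentLinkedHashMap builder API, assuming the project's com.googlecode.concurrentlinkedhashmap package name; the capacity of 100 is an arbitrary example:

import java.util.concurrent.ConcurrentMap;

import com.googlecode.concurrentlinkedhashmap.ConcurrentLinkedHashMap;

ConcurrentMap<String, String> cache =
        new ConcurrentLinkedHashMap.Builder<String, String>()
                .maximumWeightedCapacity(100) // entries beyond this are evicted in LRU order
                .build();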

Answered by Raj Pandian

I'm looking for a better LRU cache using Java code. Is it possible for you to share your Java LRU cache code using LinkedHashMap and Collections#synchronizedMap? Currently I'm using an LRUMap that implements Map, and the code works fine, but I'm getting an ArrayIndexOutOfBoundsException on load testing with 500 users using the method below. The method moves the recent object to the front of the queue.

private void moveToFront(int index) {
    if (listHead != index) {
        // unlink the node at `index` from its current position
        int thisNext = nextElement[index];
        int thisPrev = prevElement[index];
        nextElement[thisPrev] = thisNext;
        if (thisNext >= 0) {
            prevElement[thisNext] = thisPrev;
        } else {
            listTail = thisPrev;
        }
        // relink it at the head of the list; -1 marks "no previous element"
        prevElement[index] = -1;
        nextElement[index] = listHead;
        prevElement[listHead] = index;
        listHead = index;
    }
}

The get(Object key) and put(Object key, Object value) methods call the above moveToFront method.

get(Object key)put(Object key, Object value)方法调用上述moveToFront方法。

Answered by Zoltan Boda

Here is my short implementation; please criticize or improve it!

package util.collection;

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

/**
 * Limited size concurrent cache map implementation.<br/>
 * LRU: Least Recently Used.<br/>
 * If you add a new key-value pair to this cache after the maximum size has been exceeded,
 * the oldest key-value pair will be removed before adding.
 */
public class ConcurrentLRUCache<Key, Value> {

    private final int maxSize;
    private int currentSize = 0;

    private ConcurrentHashMap<Key, Value> map;
    private ConcurrentLinkedQueue<Key> queue;

    public ConcurrentLRUCache(final int maxSize) {
        this.maxSize = maxSize;
        map = new ConcurrentHashMap<Key, Value>(maxSize);
        queue = new ConcurrentLinkedQueue<Key>();
    }

    private synchronized void freeSpace() {
        Key key = queue.poll();
        if (null != key) {
            map.remove(key);
            currentSize = map.size();
        }
    }

    public void put(Key key, Value val) {
        if (map.containsKey(key)) { // just heat up that item: move its key to the back of the queue
            synchronized (this) {
                queue.remove(key);
                queue.add(key);
                map.put(key, val);
            }
            return;
        }
        while (currentSize >= maxSize) {
            freeSpace();
        }
        synchronized (this) {
            queue.add(key);
            map.put(key, val);
            currentSize++;
        }
    }

    public Value get(Key key) {
        return map.get(key);
    }
}

Answered by Zoltan Boda

Here is my tested, best-performing concurrent LRU cache implementation without any synchronized block:

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

public class ConcurrentLRUCache<Key, Value> {

    private final int maxSize;

    private ConcurrentHashMap<Key, Value> map;
    private ConcurrentLinkedQueue<Key> queue;

    public ConcurrentLRUCache(final int maxSize) {
        this.maxSize = maxSize;
        map = new ConcurrentHashMap<Key, Value>(maxSize);
        queue = new ConcurrentLinkedQueue<Key>();
    }

    /**
     * @param key - may not be null!
     * @param value - may not be null!
     */
    public void put(final Key key, final Value value) {
        if (map.containsKey(key)) {
            queue.remove(key); // remove the key from the FIFO queue
        }

        while (queue.size() >= maxSize) {
            Key oldestKey = queue.poll();
            if (null != oldestKey) {
                map.remove(oldestKey);
            }
        }
        queue.add(key);
        map.put(key, value);
    }

    /**
     * @param key - may not be null!
     * @return the value associated to the given key or null
     */
    public Value get(final Key key) {
        return map.get(key);
    }
}

Answered by Daimon

Here's my own implementation for this problem.

simplelrucache provides thread-safe, very simple, non-distributed LRU caching with TTL support. It provides two implementations:

  • Concurrent, based on ConcurrentLinkedHashMap
  • Synchronized, based on LinkedHashMap

You can find it here: http://code.google.com/p/simplelrucache/
