Efficient data structure for fast random access, search, insertion and deletion
Original URL: http://stackoverflow.com/questions/890357/
Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it under the same license, but you must cite the original URL and attribute it to the original authors (not me): StackOverflow
Asked by Leonel
I'm looking for a data structure (or structures) that would allow me to keep an ordered list of integers, with no duplicates, and with indexes and values in the same range.
I need four main operations to be efficient, in rough order of importance:
1. taking the value from a given index
2. finding the index of a given value
3. inserting a value at a given index
4. deleting a value at a given index
With an array, operation 1 is O(1), but operation 2 is O(N), and insertions and deletions are expensive (O(N) as well, I believe).
A linked list has O(1) insertion and deletion (once you have the node), but 1 and 2 are O(N), thus negating the gains.
I tried keeping two arrays, a[index]=value and b[value]=index, which turn 1 and 2 into O(1) but make 3 and 4 even more costly.
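To make the cost of the two-array idea concrete, here is a minimal Python sketch. The helper names (make, insert_at, delete_at) and the ABSENT marker are assumptions made for illustration and are not from the question. Lookups in either direction are O(1), but every insert or delete has to shift the array and then rewrite the reverse index for every shifted value.

```python
ABSENT = -1  # hypothetical marker for "value not stored"

def make(capacity):
    """Return (a, b): a[index] = value, b[value] = index (ABSENT if missing)."""
    return [], [ABSENT] * capacity

def insert_at(a, b, index, value):
    """Insert value at index while keeping both arrays consistent."""
    a.insert(index, value)            # O(N): later elements move one slot right
    for i in range(index, len(a)):    # O(N): each moved value gets a new index
        b[a[i]] = i

def delete_at(a, b, index):
    """Delete the value stored at index while keeping both arrays consistent."""
    value = a.pop(index)              # O(N): later elements move one slot left
    b[value] = ABSENT
    for i in range(index, len(a)):    # O(N): each moved value gets a new index
        b[a[i]] = i
    return value

a, b = make(8)
insert_at(a, b, 0, 5)
insert_at(a, b, 0, 3)
insert_at(a, b, 1, 7)
print(a)      # value by index
print(b[5])   # index by value
```

Operations 3 and 4 touch both arrays in full, which is why they end up costlier than with a single array.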
Is there a data structure better suited for this?
Answered by Ayman Hourieh
I would use a red-black tree to map keys to values. This gives you O(log(n)) for 1, 3, 4. It also maintains the keys in sorted order.
For 2, I would use a hash table to map values to keys, which gives you O(1) performance. It also adds O(1) overhead for keeping the hash table updated when adding and deleting keys in the red-black tree.
Answered by lothar
How about using a sorted array with binary search?
Insertion and deletion are slow, but given that the data are plain integers, they could be optimized with calls to memcpy() if you are using C or C++ (or rather memmove(), since the source and destination overlap when shifting elements within one array). If you know the maximum size of the array, you can even avoid any memory allocations while using it, as you can preallocate it to the maximum size.
The "best" approach depends on how many items you need to store and how often you will need to insert/delete compared to finding. If you rarely insert or delete, a sorted array with O(1) access to the values is certainly better, but if you insert and delete things frequently, a binary tree can be better than the array. For a small enough n, the array most likely beats the tree in any case.
If storage size is a concern, the array is better than the trees, too. Trees also need to allocate memory for every item they store, and the overhead of the memory allocation can be significant as you only store small values (integers).
You may want to profile which is faster: copying the integers when you insert into or delete from the sorted array, or the tree with its memory (de)allocations.
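The sorted-array approach might look like the following in Python, using the standard bisect module for the binary search. This is only a sketch: the function names are made up for illustration, and it assumes "ordered" means sorted by value, as this answer does.

```python
import bisect

def index_of(arr, value):
    """Binary search, O(log N): index of value in sorted arr, or -1."""
    i = bisect.bisect_left(arr, value)
    return i if i < len(arr) and arr[i] == value else -1

def insert_sorted(arr, value):
    """Insert value, keeping arr sorted and duplicate-free."""
    i = bisect.bisect_left(arr, value)     # O(log N) to find the slot
    if i == len(arr) or arr[i] != value:
        arr.insert(i, value)               # O(N): shifts the later elements

def delete_value(arr, value):
    """Remove value if present; returns True when something was removed."""
    i = index_of(arr, value)
    if i == -1:
        return False
    del arr[i]                             # O(N): shifts the later elements
    return True

arr = []
for v in (30, 10, 20, 10):
    insert_sorted(arr, v)
print(arr)                 # sorted, duplicates dropped
print(arr[1])              # value at a given index, O(1)
print(index_of(arr, 30))   # index of a given value, O(log N)
```

Whether the O(N) shift actually hurts depends on n and on the insert/delete rate, which is the reason for the profiling advice above.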
Answered by Rob Hruska
I don't know what language you're using, but if it's Java you can leverage LinkedHashMap or a similar Collection. It's got all of the benefits of a List and a Map, provides constant time for most operations, and has the memory footprint of an elephant. :)
If you're not using Java, the idea behind a LinkedHashMap is probably still a suitable basis for a data structure that solves your problem.
Answered by EvilTeach
Use a vector for the array access.
Use a map as a search index from each value to its subscript in the vector.
- given a subscript, fetch the value from the vector: O(1)
- given a key, use the map to find the subscript of the value: O(ln N)
- insert a value: push back on the vector, O(1) amortized, and insert the subscript into the map, O(ln N)
- delete a value: delete from the map, O(ln N)
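A rough Python sketch of the vector-plus-map idea follows. Several things here are assumptions rather than part of the answer: the class and method names are invented, a dict (a hash table) stands in for the ordered map because Python has no built-in tree map, and a deleted slot is left in the list as a None tombstone, since the answer only says to delete from the map.

```python
class IndexedValues:
    """Hypothetical wrapper: a list as the "vector", a dict as the "map"."""

    def __init__(self):
        self.values = []      # subscript -> value
        self.subscript = {}   # value -> subscript

    def get(self, i):
        """Fetch the value at subscript i (None if it was deleted)."""
        return self.values[i]

    def find(self, value):
        """Return the subscript of value, or -1 if it is not stored."""
        return self.subscript.get(value, -1)

    def insert(self, value):
        """Append value to the vector and record its subscript in the map."""
        if value not in self.subscript:        # keep the set duplicate-free
            self.subscript[value] = len(self.values)
            self.values.append(value)          # O(1) amortized

    def delete(self, value):
        """Drop value from the map; its slot in the vector becomes a tombstone."""
        i = self.subscript.pop(value, None)
        if i is not None:
            self.values[i] = None              # assumption: no compaction

s = IndexedValues()
for v in (40, 10, 30):
    s.insert(v)
print(s.get(1), s.find(30))
```

Note that this variant only appends, so it does not cover inserting at an arbitrary index, and the subscripts reflect insertion order rather than sorted order.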
Answered by Jarek Czekalski
How to achieve 2 with RB-trees? We can make the nodes count their children on every insert/delete operation. This doesn't make these operations last significantly longer. Then descending the tree to find the i-th element is possible in log n time. But I see no implementation of this method in Java or the STL.
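The subtree-count idea can be sketched in Python as below. To keep it short, this is a plain unbalanced binary search tree, not a red-black tree, so the log n bound only holds if rebalancing is added on top; the names Node, size, insert, select and rank are chosen for illustration.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1  # number of nodes in the subtree rooted here

def size(node):
    return node.size if node else 0

def insert(node, key):
    """Insert key (duplicates are ignored) and return the subtree root."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    node.size = 1 + size(node.left) + size(node.right)  # refresh the count
    return node

def select(node, i):
    """Return the i-th smallest key (0-based) by following the counts."""
    while node is not None:
        left = size(node.left)
        if i < left:
            node = node.left
        elif i == left:
            return node.key
        else:
            i -= left + 1
            node = node.right
    raise IndexError("index out of range")

def rank(node, key):
    """Return the 0-based position of key in sorted order, or -1 if absent."""
    skipped = 0  # keys known to be smaller than the one we are looking for
    while node is not None:
        if key < node.key:
            node = node.left
        elif key > node.key:
            skipped += size(node.left) + 1
            node = node.right
        else:
            return skipped + size(node.left)
    return -1

root = None
for k in (50, 20, 70, 10, 30, 60):
    root = insert(root, k)
print(select(root, 3), rank(root, 60))
```

Here select covers "value at a given index" and rank covers "index of a given value", both in time proportional to the tree height.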
Answered by CookieOfFortune
How about a Treemap? log(n) for the operations described.
Answered by Zifre
I like balanced binary trees a lot. They are sometimes slower than hash tables or other structures, but they are much more predictable; they are generally O(log n) for all operations. I would suggest using a red-black tree or an AVL tree.
Answered by BenAlabaster
If you're working in .NET, then according to the MS docs (http://msdn.microsoft.com/en-us/library/f7fta44c.aspx):
- SortedDictionary and SortedList both have O(log n) for retrieval
- SortedDictionary has O(log n) for insert and delete operations, whereas SortedList has O(n).
The two differ by memory usage and speed of insertion/removal. SortedList uses less memory than SortedDictionary. If the SortedList is populated all at once from sorted data, it's faster than SortedDictionary. So it depends on the situation as to which is really the best for you.
Also, your argument for the linked list is not really valid: the insert itself might be O(1), but you have to traverse the list to find the insertion point, so overall it is not.