C++: How do you insert a value into a sorted vector?

Disclaimer: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/15843525/

Date: 2020-08-27 19:48:41  Source: igfitidea

How do you insert a value into a sorted vector?

Tags: c++, sorting, vector, stl, insertion-sort

Asked by Igor

All,

This question is a continuation of this one. I think the STL lacks this functionality, but that is just my opinion.

Now, to the question.

Consider the following code:

#include <algorithm>
#include <string>
#include <vector>

class Foo
{
public:
    Foo();
    int paramA, paramB;
    std::string name;
};

struct Sorter
{
    // Compares by the field selected through paramSorter:
    // 1 = paramA, 2 = paramB, anything else = name.
    bool operator()(const Foo &foo1, const Foo &foo2) const
    {
        switch( paramSorter )
        {
            case 1:
                return foo1.paramA < foo2.paramA;
            case 2:
                return foo1.paramB < foo2.paramB;
            default:
                return foo1.name < foo2.name;
        }
    }

    int paramSorter;
};

int main()
{
    std::vector<Foo> foo;
    Sorter sorter;
    sorter.paramSorter = 0;
    // fill the vector
    std::sort( foo.begin(), foo.end(), sorter );
}

At any given moment the vector can be re-sorted. The class also has getter methods, which are used in the sorter structure.

What would be the most efficient way to insert a new element in the vector?

My situation is:

I have a grid (spreadsheet) that uses a sorted vector of a class. At any given time the vector can be re-sorted, and the grid will display the sorted data accordingly.

Now I need to insert a new element into the vector/grid. I can insert, then re-sort, and then re-display the whole grid, but this is very inefficient, especially for a big grid.

Any help would be appreciated.

Answered by CashCow

The simple answer to the question:

template< typename T >
typename std::vector<T>::iterator 
   insert_sorted( std::vector<T> & vec, T const& item )
{
    return vec.insert
        ( 
            std::upper_bound( vec.begin(), vec.end(), item ),
            item 
        );
}

Version with a predicate.

template< typename T, typename Pred >
typename std::vector<T>::iterator
    insert_sorted( std::vector<T> & vec, T const& item, Pred pred )
{
    return vec.insert
        ( 
           std::upper_bound( vec.begin(), vec.end(), item, pred ),
           item 
        );
}

Here Pred is a predicate that defines a strict weak ordering on type T.

For this to work the input vector must already be sorted on this predicate.
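
As a rough usage sketch of the predicate version, the block below inserts into a vector that is already sorted by one field. `Item`, `ItemSorter` and `demo_insert_by_paramA` are hypothetical names made up for illustration (a plain-data stand-in for the question's `Foo` and `Sorter`), and `insert_sorted` is repeated only so the block compiles on its own.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for the question's Foo, reduced to plain data.
struct Item {
    int paramA;
    int paramB;
    std::string name;
};

// Same idea as the question's Sorter: paramSorter selects the key.
struct ItemSorter {
    int paramSorter;
    bool operator()(const Item& lhs, const Item& rhs) const {
        switch (paramSorter) {
            case 1:  return lhs.paramA < rhs.paramA;
            case 2:  return lhs.paramB < rhs.paramB;
            default: return lhs.name < rhs.name;
        }
    }
};

// The predicate version from the answer, repeated so this block stands alone.
template <typename T, typename Pred>
typename std::vector<T>::iterator
insert_sorted(std::vector<T>& vec, const T& item, Pred pred) {
    return vec.insert(std::upper_bound(vec.begin(), vec.end(), item, pred), item);
}

// Builds a small vector sorted by paramA, then inserts one more element.
std::vector<Item> demo_insert_by_paramA() {
    ItemSorter by_a{1};
    std::vector<Item> items{{1, 9, "x"}, {4, 2, "y"}, {8, 5, "z"}};
    insert_sorted(items, Item{5, 0, "new"}, by_a);
    return items;
}
```

The predicate passed to `insert_sorted` has to be the same one the vector is currently sorted with; if the grid is re-sorted by another field, later inserts need the new predicate.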

The complexity of doing this is O(log N) for the upper_bound search (finding where to insert) but up to O(N) for the insert itself.

For a better complexity you could use std::set<T> if there are not going to be any duplicates, or std::multiset<T> if there may be duplicates. These will retain a sorted order for you automatically, and you can specify your own predicate on these too.
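
A minimal sketch of the std::multiset alternative with a custom predicate might look like this. `Row`, `ByKey`, `keys_in_order` and `demo_rows` are hypothetical names used only for illustration.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical element type: only `key` takes part in the ordering.
struct Row {
    int key;
    std::string label;
};

struct ByKey {
    bool operator()(const Row& lhs, const Row& rhs) const { return lhs.key < rhs.key; }
};

// A multiset keeps itself ordered by ByKey and allows equivalent keys.
using RowSet = std::multiset<Row, ByKey>;

// Collects the keys in iteration order, which is the sorted order.
std::vector<int> keys_in_order(const RowSet& rows) {
    std::vector<int> keys;
    for (const Row& row : rows) keys.push_back(row.key);
    return keys;
}

RowSet demo_rows() {
    RowSet rows;
    rows.insert(Row{5, "e"});
    rows.insert(Row{1, "a"});
    rows.insert(Row{5, "dup"});  // duplicates are kept in a multiset
    rows.insert(Row{3, "c"});
    return rows;
}
```

One trade-off to keep in mind: std::multiset only offers bidirectional iterators, so jumping to the n-th row (as a grid may want to) is no longer a constant-time operation.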

There are various other things you could do which are more complex, e.g. manage a vector and a set / multiset / sorted vector of newly added items, then merge these in when there are enough of them. Any kind of iterating through your collection will need to run through both collections.

Using a second vector has the advantage of keeping your data compact. Here your "newly added" items vector will be relatively small, so the insertion time will be O(M), where M is the size of this vector, which might be more feasible than the O(N) of inserting into the big vector every time. The merge would be O(N+M), which is better than the O(NM) of inserting one at a time, so in total it would be O(N+M) + O(M^2) to insert M elements and then merge.

You would probably also keep the insertion vector's capacity reserved, so that as it grows you will not be doing any reallocations, just moving elements.
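
The two-vector idea could be sketched roughly as follows, assuming `int` elements and a fixed buffer limit. `BufferedSortedInts` and its members are hypothetical names, and the flush policy (merge once the buffer reaches its limit) is just one possible choice.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical wrapper: a large sorted vector plus a small sorted buffer of recent inserts.
class BufferedSortedInts {
public:
    explicit BufferedSortedInts(std::size_t buffer_limit) : limit_(buffer_limit) {
        pending_.reserve(limit_);  // reserve once so the buffer need not reallocate
    }

    void insert(int value) {
        // O(log M) search plus up to O(M) element moves in the small buffer.
        pending_.insert(std::upper_bound(pending_.begin(), pending_.end(), value), value);
        if (pending_.size() >= limit_) flush();
    }

    // Merges the buffer into the main vector in O(N + M).
    void flush() {
        std::vector<int> merged;
        merged.reserve(main_.size() + pending_.size());
        std::merge(main_.begin(), main_.end(), pending_.begin(), pending_.end(),
                   std::back_inserter(merged));
        main_.swap(merged);
        pending_.clear();  // clear() leaves the capacity unchanged
    }

    // A full traversal has to cover both collections; here they are merged into a copy.
    std::vector<int> sorted_copy() const {
        std::vector<int> out;
        out.reserve(main_.size() + pending_.size());
        std::merge(main_.begin(), main_.end(), pending_.begin(), pending_.end(),
                   std::back_inserter(out));
        return out;
    }

    std::size_t pending_size() const { return pending_.size(); }

private:
    std::size_t limit_;
    std::vector<int> main_;     // large, sorted, only rewritten on flush
    std::vector<int> pending_;  // small, sorted, absorbs individual inserts
};
```

Choosing the buffer limit is a tuning question: a larger limit means fewer O(N+M) merges but more expensive individual inserts into the buffer.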

Answered by Andy Prowl

If you need to keep the vector sorted all the time, first you might consider whether using std::set or std::multiset won't simplify your code.

If you really need a sorted vector and want to quickly insert an element into it, but do not want to enforce a sorting criterion to be satisfied all the time, then you can first use std::lower_bound() to find, in logarithmic time, the position in a sorted range where the element should be inserted, then use the insert() member function of vector to insert the element at that position.
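
As a small sketch of that two-step approach (the helper name `insert_at_lower_bound` is made up here, and it assumes the vector is already sorted with `operator<`):

```cpp
#include <algorithm>
#include <vector>

// Finds the first position whose element is not less than `item`
// (logarithmic number of comparisons), then inserts there.
template <typename T>
typename std::vector<T>::iterator
insert_at_lower_bound(std::vector<T>& vec, const T& item) {
    return vec.insert(std::lower_bound(vec.begin(), vec.end(), item), item);
}
```

The returned iterator points at the newly inserted element, which can be handy for telling a grid which row changed.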

If performance is an issue, consider benchmarking std::list vs std::vector. For small items, std::vector is known to be faster because of a higher cache hit rate, but the insert() operation itself is computationally faster on lists (no need to move elements around).
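
A very rough starting point for such a benchmark is sketched below. The helper names are hypothetical, and a serious measurement would need realistic element types, sizes, repetitions and an optimized build; this only shows the two insertion strategies side by side with a simple timer.

```cpp
#include <algorithm>
#include <chrono>
#include <list>
#include <vector>

// Sorted insert into a vector: binary search, then elements are shifted.
void insert_sorted_vector(std::vector<int>& v, int value) {
    v.insert(std::upper_bound(v.begin(), v.end(), value), value);
}

// Sorted insert into a list: linear walk to find the spot, then an O(1) link-in.
void insert_sorted_list(std::list<int>& l, int value) {
    auto pos = std::find_if(l.begin(), l.end(), [value](int x) { return x > value; });
    l.insert(pos, value);
}

// Times how long it takes to insert all `values` with the given strategy.
template <typename Container, typename Insert>
double time_inserts_ms(Container& c, const std::vector<int>& values, Insert insert) {
    auto start = std::chrono::steady_clock::now();
    for (int v : values) insert(c, v);
    std::chrono::duration<double, std::milli> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count();
}
```

Calling `time_inserts_ms(vec, values, insert_sorted_vector)` and `time_inserts_ms(lst, values, insert_sorted_list)` on the same input gives two numbers to compare; which one wins depends on the data and the machine.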

Answered by Brian Rodriguez

Just a note: you can use upper_bound as well, depending on your needs. upper_bound ensures that new entries equivalent to others appear at the end of their sequence; lower_bound ensures that new entries equivalent to others appear at the beginning of their sequence. This can be useful for certain implementations (maybe classes that can share a "position" but not all of their details!).

Both will ensure that the vector remains sorted according to the result of comparing elements with <, although inserting at the lower_bound position will mean moving more elements.

Example:

insert 7 @ lower_bound of { 5, 7, 7, 9 } => { 5, *7*, 7, 7, 9 }
insert 7 @ upper_bound of { 5, 7, 7, 9 } => { 5, 7, 7, *7*, 9 }
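
The same example can be reproduced in code when the elements carry extra data that is not part of the comparison. In this sketch `Entry`, `key_less` and `tags_after_insert` are hypothetical names; only the integer key is compared, and the string tag shows where the new element ended up.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

using Entry = std::pair<int, std::string>;  // (key, tag); only the key is compared

inline bool key_less(const Entry& lhs, const Entry& rhs) { return lhs.first < rhs.first; }

// Inserts (7, "new") into {5, 7, 7, 9} and returns the tags in their final order.
std::vector<std::string> tags_after_insert(bool use_upper) {
    std::vector<Entry> v{{5, "a"}, {7, "b"}, {7, "c"}, {9, "d"}};
    Entry fresh{7, "new"};
    auto pos = use_upper ? std::upper_bound(v.begin(), v.end(), fresh, key_less)
                         : std::lower_bound(v.begin(), v.end(), fresh, key_less);
    v.insert(pos, fresh);

    std::vector<std::string> tags;
    for (const Entry& e : v) tags.push_back(e.second);
    return tags;
}
```

With `use_upper == false` the new entry lands before the existing 7s, and with `use_upper == true` it lands after them, so upper_bound preserves insertion order among equivalent keys.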

Answered by Brian Rodriguez

Instead of inserting and then sorting, you should do a find and then insert.

Keep the vector sorted (sort once). When you have to insert:

  1. Find the first element that compares greater than the one you are going to insert.

  2. Do an insert just before that position.

This way the vector stays sorted.

Here is an example of how it goes.

start {} empty vector

insert 1 -> find first greater returns end() = 1 -> insert at 1 -> {1}
insert 5 -> find first greater returns end() = 2 -> insert at 2 -> {1,5}
insert 3 -> find first greater returns 2 -> insert at 2 -> {1,3,5}
insert 4 -> find first greater returns 3 -> insert at 3 -> {1,3,4,5}
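
The trace above can be replayed with a few lines of code. This is a sketch with a made-up helper name, `build_sorted`, and it uses a plain linear `std::find_if` to mirror the "find first greater" wording; `std::upper_bound` would locate the same position with fewer comparisons.

```cpp
#include <algorithm>
#include <vector>

// Inserts each input value just before the first element greater than it,
// so the result stays sorted after every step.
std::vector<int> build_sorted(const std::vector<int>& inputs) {
    std::vector<int> sorted;
    for (int value : inputs) {
        auto first_greater = std::find_if(sorted.begin(), sorted.end(),
                                          [value](int x) { return x > value; });
        sorted.insert(first_greater, value);  // end() when nothing is greater
    }
    return sorted;
}
```

Feeding it the values 1, 5, 3, 4 in that order goes through the same intermediate states as the trace.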

Answered by Sebastian

When you want to switch between sort orders, you can use multiple index data structures, each of which you keep in sorted order (probably some kind of balanced tree, like std::map, which maps sort keys to vector indices, or std::set to store pointers to your objects, but with different comparison functions).

Here's a library which does this: http://www.boost.org/doc/libs/1_53_0/libs/multi_index/doc/index.html

For every change (insert of new elements or update of keys) you must update all index data structures, or flag them as invalid.
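
To make the multi-index idea concrete, here is a small hand-rolled sketch. It is not Boost.MultiIndex, and for brevity it swaps the balanced trees mentioned above for sorted vectors of ids; `Record`, `IndexedRecords` and their members are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record type with two sortable fields.
struct Record {
    int a;
    std::string name;
};

class IndexedRecords {
public:
    void insert(const Record& record) {
        const std::size_t id = records_.size();
        records_.push_back(record);  // storage is append-only, so ids stay valid

        // Every index has to be updated on every insert.
        auto a_less = [this](std::size_t lhs, std::size_t rhs) {
            return records_[lhs].a < records_[rhs].a;
        };
        by_a_.insert(std::upper_bound(by_a_.begin(), by_a_.end(), id, a_less), id);

        auto name_less = [this](std::size_t lhs, std::size_t rhs) {
            return records_[lhs].name < records_[rhs].name;
        };
        by_name_.insert(std::upper_bound(by_name_.begin(), by_name_.end(), id, name_less), id);
    }

    // Switching the sort order only means reading a different index; nothing is re-sorted.
    std::vector<std::string> names_by_a() const { return names_in(by_a_); }
    std::vector<std::string> names_by_name() const { return names_in(by_name_); }

private:
    std::vector<std::string> names_in(const std::vector<std::size_t>& index) const {
        std::vector<std::string> out;
        for (std::size_t id : index) out.push_back(records_[id].name);
        return out;
    }

    std::vector<Record> records_;
    std::vector<std::size_t> by_a_;     // ids ordered by Record::a
    std::vector<std::size_t> by_name_;  // ids ordered by Record::name
};
```

The cost described in the answer is visible here: each additional index adds another search and insert to every `insert` call, plus its own memory.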

This works if there are not "too many" sort orders and not "too many" updates of your data structure. Otherwise, bad luck: you have to re-sort every time you want to change the order.

In other words: The more indices you need (to speed up lookup operations), the more time you need for update operations. And every index needs memory, of course.

To keep the count of indices small, you could use some query engine which combines the indices of several fields to support more complex sort orders over several fields. Like an SQL query optimizer. But that may be overkill...

Example: If you have two fields, a and b, you can support 4 sort orders:

  1. a
  2. b
  3. first a then b
  4. first b then a

with 2 indices (3. and 4.). With more fields, the number of possible sort orders gets big fast. But you can still use an index which sorts "almost as you want it" and, during the query, sort the remaining fields you couldn't catch with that index, as needed. For sorted output of the whole data, this doesn't help much. But if you only want to look up some elements, the first "narrowing down" can help a lot.
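
The reason two indices cover four sort orders is that data ordered by "a then b" is automatically ordered by a alone. A small sketch of that property, with hypothetical names (`Fields`, `a_then_b`, `a_only`, `sorted_a_then_b`):

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Hypothetical two-field record.
struct Fields {
    int a;
    int b;
};

// Composite order: compare a first, use b only to break ties.
inline bool a_then_b(const Fields& lhs, const Fields& rhs) {
    return std::tie(lhs.a, lhs.b) < std::tie(rhs.a, rhs.b);
}

// Single-field order on a.
inline bool a_only(const Fields& lhs, const Fields& rhs) { return lhs.a < rhs.a; }

// Returns a copy of the rows ordered by the composite key.
std::vector<Fields> sorted_a_then_b(std::vector<Fields> rows) {
    std::sort(rows.begin(), rows.end(), a_then_b);
    return rows;
}
```

Anything produced by `sorted_a_then_b` also satisfies `std::is_sorted(..., a_only)`, so a separate index on a alone would be redundant; the same argument applies to "b then a" and b.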

Answered by Sebastian

Assuming you really want to use a vector, and the sort criterion or keys don't change (so the order of already inserted elements always stays the same): insert the element at the end, then move it toward the front one step at a time, until the preceding element isn't bigger.
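
That procedure is essentially one step of insertion sort. A minimal sketch, assuming the elements are ordered with `operator<` (the helper name `push_back_and_sift` is made up for illustration):

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Appends `item`, then swaps it toward the front while the preceding
// element is bigger. Returns the index where the item ends up.
template <typename T>
std::size_t push_back_and_sift(std::vector<T>& vec, const T& item) {
    vec.push_back(item);
    std::size_t i = vec.size() - 1;
    while (i > 0 && vec[i] < vec[i - 1]) {
        std::swap(vec[i], vec[i - 1]);
        --i;
    }
    return i;
}
```

Unlike the binary-search versions, this uses a linear number of comparisons as well as a linear number of moves, which matters when comparing elements is expensive.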

It can't be done faster (regarding asymptotic complexity, or "big O notation"), because you must move all bigger elements. And that's the reason why STL doesn't provide this - because it's inefficient on vectors, and you shouldn't use them if you need it.

Edit: Another assumption: Comparing the elements is not much more expensive than moving them. See comments.

Edit 2: As my first assumption doesn't hold (you want to change the sort criterion), scrap this answer and see my other one: https://stackoverflow.com/a/15843955/1413374