Note: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/1577475/

Time: 2020-08-27 20:30:32  Source: igfitidea

C++ sorting and keeping track of indexes

c++ sorting stl indexing

Asked by Mingus

Using C++, and hopefully the standard library, I want to sort a sequence of samples in ascending order, but I also want to remember the original indexes of the newly sorted samples.

For example, I have a set, vector, or matrix of samples A : [5, 2, 1, 4, 3]. I want to sort these to get B : [1, 2, 3, 4, 5], but I also want to remember the original indexes of the values, so that I can get another set C : [2, 1, 4, 3, 0], which gives, for each element of 'B', its index in the original 'A'.

For example, in Matlab you can do:

 [a,b]=sort([5, 8, 7])
 a = 5 7 8
 b = 1 3 2

Can anyone see a good way to do this?

Answered by Łukasz Wiklendt

Using C++11 lambdas:

#include <iostream>
#include <vector>
#include <numeric>      // std::iota
#include <algorithm>    // std::sort, std::stable_sort

using namespace std;

template <typename T>
vector<size_t> sort_indexes(const vector<T> &v) {

  // initialize original index locations
  vector<size_t> idx(v.size());
  iota(idx.begin(), idx.end(), 0);

  // sort indexes based on comparing values in v
  // using std::stable_sort instead of std::sort
  // to avoid unnecessary index re-orderings
  // when v contains elements of equal values 
  stable_sort(idx.begin(), idx.end(),
       [&v](size_t i1, size_t i2) {return v[i1] < v[i2];});

  return idx;
}

Now you can use the returned index vector in iterations such as

for (auto i: sort_indexes(v)) {
  cout << v[i] << endl;
}

You can also choose to supply your original index vector, sort function, comparator, or automatically reorder v in the sort_indexes function using an extra vector.
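
The comparator option mentioned here could look roughly like the sketch below. The names sort_indexes_by and apply_indexes are hypothetical (they do not appear in the answer), and the defaulted comparator is just one way to arrange it. The second helper shows the "extra vector" idea: it builds the reordered copy of v from the index vector.

```cpp
#include <algorithm>   // std::stable_sort
#include <cstddef>     // std::size_t
#include <functional>  // std::less, std::greater
#include <numeric>     // std::iota
#include <vector>

// Hypothetical variant of sort_indexes that also accepts a comparator.
// Returns idx such that v[idx[0]], v[idx[1]], ... is ordered by comp.
template <typename T, typename Compare = std::less<T>>
std::vector<std::size_t> sort_indexes_by(const std::vector<T>& v,
                                         Compare comp = Compare()) {
  std::vector<std::size_t> idx(v.size());
  std::iota(idx.begin(), idx.end(), std::size_t{0});  // 0, 1, ..., n-1

  // stable_sort keeps equal values in their original relative order
  std::stable_sort(idx.begin(), idx.end(),
                   [&v, &comp](std::size_t i1, std::size_t i2) {
                     return comp(v[i1], v[i2]);
                   });
  return idx;
}

// Hypothetical helper: copies v into a new vector in the order given by idx.
template <typename T>
std::vector<T> apply_indexes(const std::vector<T>& v,
                             const std::vector<std::size_t>& idx) {
  std::vector<T> out;
  out.reserve(idx.size());
  for (std::size_t i : idx) out.push_back(v[i]);
  return out;
}
```

With the data from the question, sort_indexes_by(A) gives C = [2, 1, 4, 3, 0] and apply_indexes(A, C) gives B = [1, 2, 3, 4, 5]; passing std::greater<int>() as the second argument sorts in descending order instead.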

Answered by RAC

You could sort std::pair values instead of plain ints: the first int is the original data, the second int is the original index. Then supply a comparator that sorts only on the first int. Example:

Your problem instance: v = [5 8 7]
New problem instance: v_prime = [<5,0>, <8,1>, <7,2>]

Sort the new problem instance using a comparator like:

typedef std::pair<int,int> mypair;
bool comparator ( const mypair& l, const mypair& r)
   { return l.first < r.first; }
// forgetting the syntax here but intent is clear enough

The result of std::sort on v_prime, using that comparator, should be:

v_prime = [<5,0>, <7,2>, <8,1>]

You can peel out the indices by walking the vector, grabbing .second from each std::pair.
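
Putting the pieces of this answer together, a minimal sketch might look as follows. The function names sort_with_index and peel_indices are hypothetical, and the sketch uses std::stable_sort rather than std::sort (a swap from the answer) so that equal values keep their original relative order.

```cpp
#include <algorithm>  // std::stable_sort
#include <cstddef>    // std::size_t
#include <utility>    // std::pair
#include <vector>

typedef std::pair<int, std::size_t> mypair;  // <value, original index>

// Hypothetical helper: builds v_prime from v and sorts it on the value only.
std::vector<mypair> sort_with_index(const std::vector<int>& v) {
  std::vector<mypair> v_prime;
  v_prime.reserve(v.size());
  for (std::size_t i = 0; i < v.size(); ++i) {
    v_prime.push_back(mypair(v[i], i));
  }
  std::stable_sort(v_prime.begin(), v_prime.end(),
                   [](const mypair& l, const mypair& r) {
                     return l.first < r.first;  // compare values, ignore indices
                   });
  return v_prime;
}

// Hypothetical helper: walks the sorted pairs and collects .second from each.
std::vector<std::size_t> peel_indices(const std::vector<mypair>& v_prime) {
  std::vector<std::size_t> indices;
  indices.reserve(v_prime.size());
  for (const mypair& p : v_prime) indices.push_back(p.second);
  return indices;
}
```

For the input [5, 8, 7] this produces the pairs <5,0>, <7,2>, <8,1>, and the peeled indices are 0, 2, 1.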

Answered by MysticForce

Suppose the given vector is

A=[2,4,3]

Create a new vector

V=[0,1,2] // indicating positions

Sort V, but while sorting, compare the corresponding elements of A instead of the elements of V:

 // Assume A is a given vector with N elements
 vector<int> V(N);
 std::iota(V.begin(), V.end(), 0); // initialize V to [0, 1, ..., N-1]
 sort(V.begin(), V.end(), [&](int i, int j) { return A[i] < A[j]; });

Answered by Aditya Aswal

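// assumes n (the number of elements) has already been set
vector<pair<int,int> > a;
int k;

for (int i = 0; i < n; i++) {
    // filling the original array
    cin >> k;
    a.push_back(make_pair(k, i)); // k = value, i = original index
}

// default pair ordering: by value first, then by original index
sort(a.begin(), a.end());

for (int i = 0; i < n; i++) {
    cout << a[i].first << " " << a[i].second << "\n";
}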

Now a contains both our values and their original indices, in sorted order.

a[i].first = the value at the i'th sorted position.

a[i].second = its idx in the initial array.

Answered by hkyi

I wrote a generic version of index sort.

template <class RAIter, class Compare>
void argsort(RAIter iterBegin, RAIter iterEnd, Compare comp, 
    std::vector<size_t>& indexes) {

    std::vector< std::pair<size_t,RAIter> > pv ;
    pv.reserve(iterEnd - iterBegin) ;

    RAIter iter ;
    size_t k ;
    for (iter = iterBegin, k = 0 ; iter != iterEnd ; iter++, k++) {
        pv.push_back( std::pair<size_t,RAIter>(k,iter) ) ;
    }

    std::sort(pv.begin(), pv.end(), 
        [&comp](const std::pair<size_t,RAIter>& a, const std::pair<size_t,RAIter>& b) -> bool 
        { return comp(*a.second, *b.second) ; }) ;

    indexes.resize(pv.size()) ;
    std::transform(pv.begin(), pv.end(), indexes.begin(), 
        [](const std::pair<size_t,RAIter>& a) -> size_t { return a.first ; }) ;
}

Usage is the same as that of std::sort, except for an extra index container that receives the sorted indexes. Testing:

int a[] = { 3, 1, 0, 4 } ;
std::vector<size_t> indexes ;
argsort(a, a + sizeof(a) / sizeof(a[0]), std::less<int>(), indexes) ;
for (size_t i : indexes) printf("%d\n", int(i)) ;

You should get 2 1 0 3. For compilers without C++0x support, replace the lambda expression with a class template:

template <class RAIter, class Compare> 
class PairComp {
public:
  Compare comp ;
  PairComp(Compare comp_) : comp(comp_) {}
  bool operator() (const std::pair<size_t,RAIter>& a, 
    const std::pair<size_t,RAIter>& b) const { return comp(*a.second, *b.second) ; }        
} ;

and rewrite the std::sort call as

std::sort(pv.begin(), pv.end(), PairComp<RAIter, Compare>(comp)) ;

Answered by behzad.nouri

I came across this question and figured out that sorting the iterators directly is a way to sort the values and keep track of the indices. There is no need to define an extra container of ( value, index ) pairs, which is helpful when the values are large objects. The iterators provide access to both the value and the index:

/*
 * a function object that allows to compare
 * the iterators by the value they point to
 */
template < class RAIter, class Compare >
class IterSortComp
{
    public:
        IterSortComp ( Compare comp ): m_comp ( comp ) { }
        inline bool operator( ) ( const RAIter & i, const RAIter & j ) const
        {
            return m_comp ( * i, * j );
        }
    private:
        const Compare m_comp;
};

template <class INIter, class RAIter, class Compare>
void itersort ( INIter first, INIter last, std::vector < RAIter > & idx, Compare comp )
{ 
    idx.resize ( std::distance ( first, last ) );
    for ( typename std::vector < RAIter >::iterator j = idx.begin( ); first != last; ++ j, ++ first )
        * j = first;

    std::sort ( idx.begin( ), idx.end( ), IterSortComp< RAIter, Compare > ( comp ) );
}

A usage example:

std::vector < int > A ( n );

// populate A with some random values
std::generate ( A.begin( ), A.end( ), rand );

std::vector < std::vector < int >::const_iterator > idx;
itersort ( A.begin( ), A.end( ), idx, std::less < int > ( ) );

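Now, for example, the element at position 5 of the sorted order (counting from 0) has value *idx[ 5 ], and its index in the original vector is idx[ 5 ] - A.begin( ), or equivalently std::distance( A.cbegin( ), idx[ 5 ] ). Since idx holds const_iterators, std::distance needs A.cbegin( ) here so that both arguments have the same type.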
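
To make the dereferencing concrete, here is a small self-contained sketch of the same idea, written with a C++11 lambda instead of the IterSortComp class. The names sorted_iterators and original_index are hypothetical. Each idx[k] is an iterator into A, so *idx[k] is the value and the distance from the start of A is the original index.

```cpp
#include <algorithm>  // std::sort
#include <cstddef>    // std::size_t
#include <iterator>   // std::distance
#include <vector>

typedef std::vector<int>::const_iterator IntIter;

// Hypothetical helper: returns iterators into A, ordered by the values
// they point to. A must outlive the returned iterators.
std::vector<IntIter> sorted_iterators(const std::vector<int>& A) {
  std::vector<IntIter> idx;
  idx.reserve(A.size());
  for (IntIter it = A.cbegin(); it != A.cend(); ++it) idx.push_back(it);

  std::sort(idx.begin(), idx.end(),
            [](IntIter i, IntIter j) { return *i < *j; });
  return idx;
}

// Hypothetical helper: position of `it` within A.
std::size_t original_index(const std::vector<int>& A, IntIter it) {
  return static_cast<std::size_t>(std::distance(A.cbegin(), it));
}
```

For A = [5, 2, 1, 4, 3], the first iterator returned points at the value 1, and original_index reports 2 for it.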

Answered by Ulrich Eckhardt

There is another way to solve this, using a map:

vector<double> v = {...}; // input data
map<double, unsigned> m; // mapping from value to its index
for (auto it = v.begin(); it != v.end(); ++it)
    m[*it] = it - v.begin();

This will eradicate non-unique elements though. If that's not acceptable, use a multimap:

vector<double> v = {...}; // input data
multimap<double, unsigned> m; // mapping from value to its index
for (auto it = v.begin(); it != v.end(); ++it)
    m.insert(make_pair(*it, it - v.begin()));

In order to output the indices, iterate over the map or multimap:

for (auto it = m.begin(); it != m.end(); ++it)
    cout << it->second << endl;
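
The difference between the two containers shows up as soon as the input has duplicates. The sketch below wraps both variants in functions so they can be compared; the names indices_via_map and indices_via_multimap are hypothetical. With std::map a later duplicate overwrites the index stored for an earlier one, while std::multimap keeps every entry, and since C++11 elements with equal keys stay in insertion order.

```cpp
#include <cstddef>  // std::size_t
#include <map>      // std::map, std::multimap
#include <utility>  // std::make_pair
#include <vector>

// Hypothetical helper: indices in value order, duplicates collapsed.
std::vector<std::size_t> indices_via_map(const std::vector<double>& v) {
  std::map<double, std::size_t> m;
  for (std::size_t i = 0; i < v.size(); ++i) {
    m[v[i]] = i;  // a repeated value overwrites the earlier index
  }
  std::vector<std::size_t> out;
  for (const auto& kv : m) out.push_back(kv.second);
  return out;
}

// Hypothetical helper: indices in value order, duplicates preserved.
std::vector<std::size_t> indices_via_multimap(const std::vector<double>& v) {
  std::multimap<double, std::size_t> m;
  for (std::size_t i = 0; i < v.size(); ++i) {
    m.insert(std::make_pair(v[i], i));  // every element gets its own entry
  }
  std::vector<std::size_t> out;
  for (const auto& kv : m) out.push_back(kv.second);
  return out;
}
```

For the input [3.0, 1.0, 3.0], the map version returns only two indices (1, 2), whereas the multimap version returns all three (1, 0, 2).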

Answered by sigvaldm

Beautiful solution by @Lukasz Wiklendt! Although in my case I needed something more generic so I modified it a bit:

template <class RAIter, class Compare>
vector<size_t> argSort(RAIter first, RAIter last, Compare comp) {

  vector<size_t> idx(last-first);
  iota(idx.begin(), idx.end(), 0);

  auto idxComp = [&first,comp](size_t i1, size_t i2) {
      return comp(first[i1], first[i2]);
  };

  sort(idx.begin(), idx.end(), idxComp);

  return idx;
}

Example: Find indices sorting a vector of strings by length, except for the first element which is a dummy.

vector<string> test = {"dummy", "a", "abc", "ab"};

auto comp = [](const string &a, const string& b) {
    return a.length() > b.length();
};

const auto& beginIt = test.begin() + 1;
vector<size_t> ind = argSort(beginIt, test.end(), comp);

for(auto i : ind)
    cout << beginIt[i] << endl;

prints:

abc
ab
a

Answered by aafulei

Consider using std::multimap, as suggested by @Ulrich Eckhardt. The code can be made even simpler, though.

Given

std::vector<int> a = {5, 2, 1, 4, 3};  // a: 5 2 1 4 3

To sort at the time of insertion

std::multimap<int, std::size_t> mm;
for (std::size_t i = 0; i != a.size(); ++i)
    mm.insert({a[i], i});

To retrieve values and original indices

std::vector<int> b;
std::vector<std::size_t> c;
for (const auto & kv : mm) {
    b.push_back(kv.first);             // b: 1 2 3 4 5
    c.push_back(kv.second);            // c: 2 1 4 3 0
}

The reason to prefer a std::multimap to a std::map is to allow equal values in the original vector. Also note that, unlike for std::map, operator[] is not defined for std::multimap.

Answered by LxL

Make a std::pair inside a function, then sort the pairs:

Generic version:

template< class RandomAccessIterator,class Compare >
auto sort2(RandomAccessIterator begin,RandomAccessIterator end,Compare cmp) ->
   std::vector<std::pair<std::uint32_t,RandomAccessIterator>>
{
    using valueType=typename std::iterator_traits<RandomAccessIterator>::value_type;
    using Pair=std::pair<std::uint32_t,RandomAccessIterator>;

    std::vector<Pair> index_pair;
    index_pair.reserve(std::distance(begin,end));

    for(std::uint32_t idx=0;begin!=end;++begin,++idx){
        index_pair.push_back(Pair(idx,begin));
    }

    std::sort( index_pair.begin(),index_pair.end(),[&](const Pair& lhs,const Pair& rhs){
          return cmp(*lhs.second,*rhs.second);
    });

    return index_pair;
}
