C++: Fastest way to sort a list of numbers and their indexes

Question

Asked by Vincent

I have a question that could seem very basic, but it is in a context where "every CPU tick counts" (this is a part of a larger algorithm that will be used on supercomputers).

The problem is quite simple: what is the fastest way to sort a list of unsigned long long int numbers together with their original indexes? (At the beginning, the unsigned long long int numbers are in a completely random order.)

Example :
Before
Numbers: 32 91 11 72
Indexes: 0 1 2 3
After
Numbers: 11 32 72 91
Indexes: 2 0 3 1

By "fastest way", I mean: which algorithm to use (std::sort, C qsort, or another sorting algorithm available on the web)? Which container to use (C array, std::vector, std::map...)? How to sort the indexes at the same time (structures, std::pair, std::map...)?

How many elements are there to sort? Typically 4 GB of numbers.

Answer 1

Answered by Jerry Coffin

The obvious starting point would be a structure with a comparison defined for it (shown here as a separate function object rather than a member operator<):

struct data { 
    unsigned long long int number;
    size_t index;
};

struct by_number { 
    bool operator()(data const &left, data const &right) const { 
        return left.number < right.number;
    }
};

...and an std::vector to hold the data:

 std::vector<data> items;

and std::sort to do the sorting:

 std::sort(items.begin(), items.end(), by_number());

The simple fact is that the normal containers (and such) are sufficiently efficient that using them doesn't make your code substantially less efficient. You might be able to do better by writing some part in a different way, but you might about as easily do worse. Start from solid and readable, and test; don't (attempt to) optimize prematurely.
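
One way to follow the "test first" advice is to time the straightforward version before trying anything cleverer. The sketch below is only an illustration: make_random_items and time_sort_ms are hypothetical helper names, and the pseudo-random input is an assumption standing in for the real data.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <random>
#include <vector>

struct data {
    unsigned long long number;
    std::size_t index;
};

struct by_number {
    bool operator()(data const& left, data const& right) const {
        return left.number < right.number;
    }
};

// Hypothetical helper: n records with pseudo-random numbers;
// index stores each record's original position.
std::vector<data> make_random_items(std::size_t n, unsigned seed) {
    std::mt19937_64 gen(seed);
    std::vector<data> items(n);
    for (std::size_t i = 0; i < n; ++i) {
        items[i].number = gen();
        items[i].index = i;
    }
    return items;
}

// Hypothetical helper: sorts items by number and returns the
// elapsed wall-clock time in milliseconds.
double time_sort_ms(std::vector<data>& items) {
    auto start = std::chrono::steady_clock::now();
    std::sort(items.begin(), items.end(), by_number());
    auto stop = std::chrono::steady_clock::now();
    // Debug-only sanity check, outside the timed region.
    assert(std::is_sorted(items.begin(), items.end(), by_number()));
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling time_sort_ms(items) on a vector built by make_random_items gives a baseline number; any alternative (different container, different algorithm) can then be timed on the same input and compared against it.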

Edit: of course in C++11, you can use a lambda expression instead:

std::sort(items.begin(), items.end(), 
          [](data const &a, data const &b) { return a.number < b.number; });

This is generally a little more convenient to write. Readability depends: for something simple like this, I'd say sort ... by_number is pretty readable, but that depends (heavily) on the name you give to the comparison operator. The lambda makes the actual sorting criteria easier to find, so you don't need to choose a name carefully for the code to be readable.

Answer 2

Answered by dasblinkenlight

std::pair and std::sort fit your requirements ideally: if you put the value into pair.first and the index into pair.second, you can simply call sort on a vector of pairs, like this:

#include <algorithm>
#include <cstddef>
#include <iostream>
#include <utility>
#include <vector>
using namespace std;

int main() {
    // This is your original data. It does not need to be in a vector
    vector<long> orig;
    orig.push_back(10);
    orig.push_back(3);
    orig.push_back(6);
    orig.push_back(11);
    orig.push_back(2);
    orig.push_back(19);
    orig.push_back(7);
    // This is a vector of {value,index} pairs
    vector<pair<long,size_t> > vp;
    vp.reserve(orig.size());
    for (size_t i = 0 ; i != orig.size() ; i++) {
        vp.push_back(make_pair(orig[i], i));
    }
    // Sorting will put lower values ahead of larger ones,
    // resolving ties using the original index
    sort(vp.begin(), vp.end());
    for (size_t i = 0 ; i != vp.size() ; i++) {
        cout << vp[i].first << " " << vp[i].second << endl;
    }
    return 0;
}

Answer 3

Answered by Mark Ransom

std::sort has proven to be faster than the old qsort because of the lack of indirection and the possibility of inlining critical operations.
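
To make the indirection concrete, here is a small sketch that sorts the same records both ways. The struct layout and the function names (compare_data, sort_with_qsort, sort_with_std_sort) are assumptions chosen for illustration. qsort receives the comparison as a function pointer taking untyped pointers, while std::sort receives a callable whose type is known at compile time, which gives the compiler the chance to inline it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

struct data {
    unsigned long long number;
    std::size_t index;
};

// qsort calls this through a function pointer and passes untyped pointers.
int compare_data(const void* a, const void* b) {
    const data* left = static_cast<const data*>(a);
    const data* right = static_cast<const data*>(b);
    // Compare instead of subtracting: a difference of unsigned 64-bit
    // values does not fit in an int.
    return (left->number > right->number) - (left->number < right->number);
}

void sort_with_qsort(std::vector<data>& items) {
    if (items.empty()) return;
    std::qsort(items.data(), items.size(), sizeof(data), compare_data);
    // Debug-only check that the comparator produced ascending order.
    assert(std::is_sorted(items.begin(), items.end(),
                          [](data const& a, data const& b) { return a.number < b.number; }));
}

void sort_with_std_sort(std::vector<data>& items) {
    // The lambda's type is part of the std::sort instantiation.
    std::sort(items.begin(), items.end(),
              [](data const& a, data const& b) { return a.number < b.number; });
}
```

Both functions leave the vector ordered by number; which one is faster on a given machine is something to measure rather than assume.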

The implementations of std::sort are likely to be highly optimized and hard to beat, but not impossible. If your data is fixed length and short you might find radix sort to be faster. Timsort is relatively new and has delivered good results for Python.
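
As an illustration of the radix sort idea for fixed-length keys, the following is a minimal least-significant-digit sketch, not a tuned implementation. It assumes unsigned long long is exactly 64 bits (checked with a static_assert), processes one byte per pass, and needs a second buffer as large as the input, which matters for multi-gigabyte data. The names data, by_number and radix_sort_by_number are chosen here for illustration.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

struct data {
    unsigned long long number;
    std::size_t index;
};

struct by_number {
    bool operator()(data const& a, data const& b) const { return a.number < b.number; }
};

static_assert(sizeof(unsigned long long) == 8, "sketch assumes 64-bit keys");

// LSD radix sort: 8 passes, one per byte of the key, least significant
// byte first. Each pass is a stable counting sort, so the order produced
// by earlier passes is preserved among equal bytes.
void radix_sort_by_number(std::vector<data>& items) {
    std::vector<data> buffer(items.size());
    for (unsigned shift = 0; shift < 64; shift += 8) {
        // offsets[b + 1] first counts how many keys have byte value b.
        std::array<std::size_t, 257> offsets{};
        for (data const& d : items)
            ++offsets[((d.number >> shift) & 0xFF) + 1];
        // Prefix sums turn counts into the start position of each bucket.
        for (std::size_t b = 0; b < 256; ++b)
            offsets[b + 1] += offsets[b];
        // Scatter each record to the next free slot of its bucket.
        for (data const& d : items)
            buffer[offsets[(d.number >> shift) & 0xFF]++] = d;
        items.swap(buffer);
    }
    // Debug-only check of the final order.
    assert(std::is_sorted(items.begin(), items.end(), by_number()));
}
```

Because every pass is stable, records with equal numbers keep their original relative order. Whether this beats std::sort depends on the data size and the memory system, so it should be benchmarked on realistic input.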

You might keep the index array separate from the value array, but I think the extra level of indirection will prove to be a speed killer. Better to keep them together in a struct or std::pair.

As always with any speed-critical application, you must try some actual implementations and compare them to know for sure which is fastest.

Answer 4

Answered by Branko Dimitrijevic

It might be worth separating numbers and indexes and then just sorting indexes, like this:

#include <vector>
#include <algorithm>
#include <iostream>
#include <cstdlib> // EXIT_SUCCESS

void PrintElements(const std::vector<unsigned long long>& numbers, const std::vector<size_t>& indexes) {

    std::cout << "\tNumbers:";
    for (auto i = indexes.begin(); i != indexes.end(); ++i)
        std::cout << '\t' << numbers[*i];
    std::cout << std::endl;

    std::cout << "\tIndexes:";
    for (auto i = indexes.begin(); i != indexes.end(); ++i)
        std::cout << '\t' << *i;
    std::cout << std::endl;

}

int main() {

    std::vector<unsigned long long> numbers;
    std::vector<size_t> indexes;

    numbers.reserve(4); // An overkill for this few elements, but important for billions.
    numbers.push_back(32);
    numbers.push_back(91);
    numbers.push_back(11);
    numbers.push_back(72);

    indexes.reserve(numbers.capacity());
    indexes.push_back(0);
    indexes.push_back(1);
    indexes.push_back(2);
    indexes.push_back(3);

    std::cout << "BEFORE:" << std::endl;
    PrintElements(numbers, indexes);

    std::sort(
        indexes.begin(),
        indexes.end(),
        [&numbers](size_t i1, size_t i2) {
            return numbers[i1] < numbers[i2];
        }
    );

    std::cout << "AFTER:" << std::endl;
    PrintElements(numbers, indexes);

    return EXIT_SUCCESS;

}

This prints:

BEFORE:
        Numbers:        32      91      11      72
        Indexes:        0       1       2       3
AFTER:
        Numbers:        11      32      72      91
        Indexes:        2       0       3       1

The idea is that the elements being sorted are small and thus fast to move around during the sort. On modern CPUs however, the effects of indirect access to numbers on caching could spoil these gains, so I recommend benchmarking on realistic amounts of data before making a final decision to use it.
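
If the sorted values are needed as a contiguous array afterwards, they can be gathered through the sorted indexes in one pass. The sketch below is an assumption about how that might look; sorted_indexes and gather are hypothetical names, not part of the answer's code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns the permutation that sorts numbers; numbers itself is not modified.
std::vector<std::size_t> sorted_indexes(const std::vector<unsigned long long>& numbers) {
    std::vector<std::size_t> indexes(numbers.size());
    // Fill with 0, 1, 2, ... using a size_t counter to avoid int overflow.
    std::iota(indexes.begin(), indexes.end(), std::size_t{0});
    std::sort(indexes.begin(), indexes.end(),
              [&numbers](std::size_t i1, std::size_t i2) {
                  return numbers[i1] < numbers[i2];
              });
    return indexes;
}

// Copies numbers into a new vector in the order given by indexes.
std::vector<unsigned long long> gather(const std::vector<unsigned long long>& numbers,
                                       const std::vector<std::size_t>& indexes) {
    std::vector<unsigned long long> out;
    out.reserve(indexes.size());
    for (std::size_t i : indexes) {
        assert(i < numbers.size());  // debug-only bounds check
        out.push_back(numbers[i]);
    }
    return out;
}
```

The gather step reads numbers in a scattered order once, instead of on every comparison, so it is a separate cost to include when benchmarking this approach against sorting structs directly.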

Answer 5

Answered by andre

Use std::vector and std::sort. That should provide the fastest sort method. To find the original index, create a struct.

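struct A {
    unsigned long long num; // the value being sorted (unsigned long long, as in the question)
    size_t index;           // original position of the value
};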

Then write your own comparison predicate for sort that compares num in the struct.

struct Predicate {
    // Take the arguments by const reference to avoid copying each element.
    bool operator()(const A& first, const A& second) const {
        return first.num < second.num;
    }
};

std::sort(vec.begin(), vec.end(), Predicate());

Answer 6

Answered by clanmjc

#include <algorithm>
#include <cstddef>
#include <vector>

struct SomeValue
{
    unsigned long long val;
    size_t index;
    bool operator<(const SomeValue& rhs) const
    {
        return val < rhs.val;
    }
};

std::vector<SomeValue> somevec;
// fill it...
std::sort(somevec.begin(), somevec.end());

Answer 7

Answered by btilly

This will be used on supercomputers?

In that case you may want to look into parallel sorting algorithms. That only makes sense for large data sets, but if you need it, the win is substantial.

Answer 8

Answered by Carl

You might find this to be an interesting read. I would start with STL's sort and only then try to improve on it if I could. I'm not sure if you have access to a C++11 compiler (like gcc 4.7) on this supercomputer, but I would suggest that std::sort with std::future and std::thread would get you quite a bit of the way there with regard to parallelizing the problem in a maintainable way.
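
A minimal sketch of that suggestion, under stated assumptions: it uses std::async (which returns a std::future) to sort one half of the data on a second thread, sorts the other half on the calling thread, and then merges. The struct layout and the name parallel_sort_two_halves are hypothetical, only two threads are used, and some toolchains need a threading flag such as -pthread for this to run.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <future>
#include <vector>

struct data {
    unsigned long long number;
    std::size_t index;
};

struct by_number {
    bool operator()(data const& a, data const& b) const { return a.number < b.number; }
};

// Sorts the two halves of items concurrently, then merges them.
void parallel_sort_two_halves(std::vector<data>& items) {
    auto first = items.begin();
    auto middle = first + static_cast<std::ptrdiff_t>(items.size() / 2);
    auto last = items.end();

    // The lower half is sorted on another thread; the ranges do not overlap.
    std::future<void> lower = std::async(std::launch::async, [first, middle] {
        std::sort(first, middle, by_number());
    });
    // The upper half is sorted on the calling thread in the meantime.
    std::sort(middle, last, by_number());
    lower.get();  // wait for the lower half (rethrows if the task threw)

    // Combine the two sorted halves into one sorted range.
    std::inplace_merge(first, middle, last, by_number());
    assert(std::is_sorted(first, last, by_number()));  // debug-only check
}
```

The same pattern extends to more chunks by splitting recursively, at the price of more merging. On a real supercomputer a distributed or library-provided parallel sort may be more appropriate, so this should be read as a starting point for experiments only.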

Here is another question that compares std::sort with qsort.

Finally, there is this article in Dr. Dobb's that compares the performance of parallel algorithms.
