C++ 跟踪插入顺序的 std::map ？

Question

提问by polyglot

I currently have a std::map<std::string,int>that stores an integer value to an unique string identifier, and I do look up with the string. It does mostly what I want, except for that it does not keep track of the insertion order. So when I iterate the the map to print out the values, they are sorted according to the string; but I want them to be sorted according to the order of (first) insertion.

我目前有一个std::map<std::string,int>将整数值存储到唯一字符串标识符的对象，并且我确实查找了该字符串。它主要做我想要的，除了它不跟踪插入顺序。因此，当我迭代地图以打印出值时，它们会根据字符串进行排序；但我希望它们按照（第一次）插入的顺序进行排序。

I thought about using a vector<pair<string,int>>instead, but I need to look up the string and increment the integer values about 10,000,000 times, so I don't know whether a std::vectorwill be significantly slower.

我想使用 avector<pair<string,int>>代替，但我需要查找字符串并将整数值增加大约 10,000,000 次，所以我不知道 a 是否std::vector会明显变慢。

Is there a way to use std::mapor is there another stdcontainer that better suits my need?

有没有办法使用std::map或有其他std容器更适合我的需要？

[I'm on GCC 3.4, and I have probably no more than 50 pairs of values in my std::map].

[我使用的是 GCC 3.4，我的std::map] 中的值可能不超过 50 对。

Thanks.

谢谢。

Answer 1

回答by Kirill V. Lyadvinsky

If you have only 50 values in std::map you could copy them to std::vector before printing out and sort via std::sort using appropriate functor.

如果 std::map 中只有 50 个值，则可以在打印之前将它们复制到 std::vector 并使用适当的函子通过 std::sort 进行排序。

Or you could use boost::multi_index. It allows to use several indexes. In your case it could look like the following:

或者你可以使用boost::multi_index。它允许使用多个索引。在您的情况下，它可能如下所示：

struct value_t {
      string s;
      int    i;
};
struct string_tag {};
typedef multi_index_container<
    value_t,
    indexed_by<
        random_access<>, // this index represents insertion order
        hashed_unique< tag<string_tag>, member<value_t, string, &value_t::s> >
    >
> values_t;

Answer 2

回答by Michael Kristofik

You might combine a std::vectorwith a std::tr1::unordered_map(a hash table). Here's a link to Boost's documentationfor unordered_map. You can use the vector to keep track of the insertion order and the hash table to do the frequent lookups. If you're doing hundreds of thousands of lookups, the difference between O(log n) lookup for std::mapand O(1) for a hash table might be significant.

您可以将 astd::vector与 a std::tr1::unordered_map（哈希表）结合使用。这里有一个链接Boost的文档进行unordered_map。您可以使用向量来跟踪插入顺序和哈希表来进行频繁查找。如果您要进行数十万次查找，则std::map哈希表的O(log n) 查找和 O(1)查找之间的差异可能很大。

std::vector<std::string> insertOrder;
std::tr1::unordered_map<std::string, long> myTable;

// Initialize the hash table and record insert order.
myTable["foo"] = 0;
insertOrder.push_back("foo");
myTable["bar"] = 0;
insertOrder.push_back("bar");
myTable["baz"] = 0;
insertOrder.push_back("baz");

/* Increment things in myTable 100000 times */

// Print the final results.
for (int i = 0; i < insertOrder.size(); ++i)
{
    const std::string &s = insertOrder[i];
    std::cout << s << ' ' << myTable[s] << '\n';
}

Answer 3

回答by bobobobo

Keep a parallel list<string> insertionOrder.

保持平行list<string> insertionOrder。

When it is time to print, iterate on the listand do lookups into the map.

当需要打印时，对列表进行迭代并在地图中进行查找。

each element in insertionOrder  // walks in insertionOrder..
    print map[ element ].second // but lookup is in map

Answer 4

回答by aggsol

Tessil has a very nice implementaion of ordered map (and set) which is MIT license. You can find it here: ordered-map

Tessil 有一个非常好的有序映射（和集合）实现，这是 MIT 许可。你可以在这里找到它：ordered-map

Map example

地图示例

#include <iostream>
#include <string>
#include <cstdlib>
#include "ordered_map.h"

int main() {
tsl::ordered_map<char, int> map = {{'d', 1}, {'a', 2}, {'g', 3}};
map.insert({'b', 4});
map['h'] = 5;
map['e'] = 6;

map.erase('a');


// {d, 1} {g, 3} {b, 4} {h, 5} {e, 6}
for(const auto& key_value : map) {
    std::cout << "{" << key_value.first << ", " << key_value.second << "}" << std::endl;
}


map.unordered_erase('b');

// Break order: {d, 1} {g, 3} {e, 6} {h, 5}
for(const auto& key_value : map) {
    std::cout << "{" << key_value.first << ", " << key_value.second << "}" << std::endl;
}
}

Answer 5

回答by xtofl

If you need both lookup strategies, you will end up with two containers. You may use a vectorwith your actual values (ints), and put a map< string, vector< T >::difference_type> next to it, returning the index into the vector.

如果您需要两种查找策略，您最终会得到两个容器。您可以将 avector与您的实际值 ( ints) 一起使用，并在其map< string, vector< T >::difference_type> 旁边放置 a ，将索引返回到向量中。

To complete all that, you may encapsulate both in one class.

要完成所有这些，您可以将两者都封装在一个类中。

But I believe boost has a containerwith multiple indices.

但我相信boost 有一个包含多个索引的容器。

Answer 6

回答by CubicleSoft

What you want (without resorting to Boost) is what I call an "ordered hash", which is essentially a mashup of a hash and a linked list with string or integer keys (or both at the same time). An ordered hash maintains the order of the elements during iteration with the absolute performance of a hash.

你想要的（不诉诸Boost）是我所说的“有序散列”，它本质上是散列和带有字符串或整数键（或同时两者）的链表的混搭。有序散列在迭代期间以散列的绝对性能保持元素的顺序。

I've been putting together a relatively new C++ snippet library that fills in what I view as holes in the C++ language for C++ library developers. Go here:

我一直在组合一个相对较新的 C++ 代码片段库，它填补了我认为 C++ 语言中 C++ 库开发人员的漏洞。到这里：

https://github.com/cubiclesoft/cross-platform-cpp

Grab:

抓住：

templates/detachable_ordered_hash.cpp
templates/detachable_ordered_hash.h
templates/detachable_ordered_hash_util.h

If user-controlled data will be placed into the hash, you might also want:

如果将用户控制的数据放入散列中，您可能还需要：

security/security_csprng.cpp
security/security_csprng.h

Invoke it:

调用它：

#include "templates/detachable_ordered_hash.h"
...
// The 47 is the nearest prime to a power of two
// that is close to your data size.
//
// If your brain hurts, just use the lookup table
// in 'detachable_ordered_hash.cpp'.
//
// If you don't care about some minimal memory thrashing,
// just use a value of 3.  It'll auto-resize itself.
int y;
CubicleSoft::OrderedHash<int> TempHash(47);
// If you need a secure hash (many hashes are vulnerable
// to DoS attacks), pass in two randomly selected 64-bit
// integer keys.  Construct with CSPRNG.
// CubicleSoft::OrderedHash<int> TempHash(47, Key1, Key2);
CubicleSoft::OrderedHashNode<int> *Node;
...
// Push() for string keys takes a pointer to the string,
// its length, and the value to store.  The new node is
// pushed onto the end of the linked list and wherever it
// goes in the hash.
y = 80;
TempHash.Push("key1", 5, y++);
TempHash.Push("key22", 6, y++);
TempHash.Push("key3", 5, y++);
// Adding an integer key into the same hash just for kicks.
TempHash.Push(12345, y++);
...
// Finding a node and modifying its value.
Node = TempHash.Find("key1", 5);
Node->Value = y++;
...
Node = TempHash.FirstList();
while (Node != NULL)
{
  if (Node->GetStrKey())  printf("%s => %d\n", Node->GetStrKey(), Node->Value);
  else  printf("%d => %d\n", (int)Node->GetIntKey(), Node->Value);

  Node = Node->NextList();
}

I ran into this SO thread during my research phase to see if anything like OrderedHash already existed without requiring me to drop in a massive library. I was disappointed. So I wrote my own. And now I've shared it.

我在研究阶段遇到了这个 SO 线程，以查看是否已经存在类似 OrderedHash 的东西，而无需我进入一个庞大的库。我很失望。所以我写了我自己的。现在我已经分享了它。

Answer 7

回答by Faisal Vali

You cannot do that with a map, but you could use two separate structures - the map and the vector and keep them synchronized - that is when you delete from the map, find and delete the element from the vector. Or you could create a map<string, pair<int,int>>- and in your pair store the size() of the map upon insertion to record position, along with the value of the int, and then when you print, use the position member to sort.

你不能用地图做到这一点，但你可以使用两个单独的结构 - 地图和向量并保持它们同步 - 即当你从地图中删除，从向量中查找并删除元素时。或者您可以创建一个map<string, pair<int,int>>- 并在您的配对中存储插入时地图的 size() 以记录位置，以及 int 的值，然后在打印时，使用位置成员进行排序。

Answer 8

回答by Polaris878

This is somewhat related to Faisals answer. You can just create a wrapper class around a map and vector and easily keep them synchronized. Proper encapsulation will let you control the access method and hence which container to use... the vector or the map. This avoids using Boost or anything like that.

这与 Faisals 的答案有些相关。您可以围绕地图和矢量创建一个包装类，并轻松地使它们保持同步。适当的封装将让您控制访问方法，从而控制使用哪个容器......向量或地图。这避免了使用 Boost 或类似的东西。

Answer 9

回答by Tom

Another way to implement this is with a mapinstead of a vector. I will show you this approach and discuss the differences:

实现这一点的另一种方法是使用 amap而不是 a vector。我将向您展示这种方法并讨论差异：

Just create a class that has two maps behind the scenes.

只需创建一个在幕后有两张地图的类。

#include <map>
#include <string>

using namespace std;

class SpecialMap {
  // usual stuff...

 private:
  int counter_;
  map<int, string> insertion_order_;
  map<string, int> data_;
};

You can then expose an iterator to iterator over data_in the proper order. The way you do that is iterate through insertion_order_, and for each element you get from that iteration, do a lookup in the data_with the value from insertion_order_

然后，您可以data_按正确的顺序将迭代器公开给迭代器。你这样做的方法是迭代通过insertion_order_，对于你从迭代中获得的每个元素，在中data_使用来自的值进行查找insertion_order_

You can use the more efficient hash_mapfor insertion_order since you don't care about directly iterating through insertion_order_.

您可以将更高效的hash_map用于插入顺序，因为您不关心直接遍历insertion_order_。

To do inserts, you can have a method like this:

要进行插入，您可以使用这样的方法：

void SpecialMap::Insert(const string& key, int value) {
  // This may be an over simplification... You ought to check
  // if you are overwriting a value in data_ so that you can update
  // insertion_order_ accordingly
  insertion_order_[counter_++] = key;
  data_[key] = value;
}

There are a lot of ways you can make the design better and worry about performance, but this is a good skeleton to get you started on implementing this functionality on your own. You can make it templated, and you might actually store pairs as values in data_ so that you can easily reference the entry in insertion_order_. But I leave these design issues as an exercise :-).

有很多方法可以使设计更好，同时担心性能，但这是一个很好的框架，可以帮助您开始自己实现此功能。您可以将其模板化，并且您实际上可以将成对存储为 data_ 中的值，以便您可以轻松地引用 insert_order_ 中的条目。但我将这些设计问题留作练习:-)。

Update: I suppose I should say something about efficiency of using map vs. vector for insertion_order_

更新：我想我应该说一下使用 map 与 vector 进行插入排序的效率

lookups directly into data, in both cases are O(1)
inserts in the vector approach are O(1), inserts in the map approach are O(logn)
deletes in the vector approach are O(n) because you have to scan for the item to remove. With the map approach they are O(logn).

直接查找数据，在这两种情况下都是 O(1)
矢量方法中的插入为 O(1)，映射方法中的插入为 O(logn)
向量方法中的删除是 O(n)，因为您必须扫描要删除的项目。使用地图方法，它们是 O(logn)。

Maybe if you are not going to use deletes as much, you should use the vector approach. The map approach would be better if you were supporting a different ordering (like priority) instead of insertion order.

也许如果您不打算使用删除那么多，您应该使用向量方法。如果您支持不同的排序（如优先级）而不是插入顺序，那么映射方法会更好。

Answer 10

回答by Himanshu Pandey

Here is solution that requires only standard template library without using boost's multiindex:
You could use std::map<std::string,int>;and vector <data>;where in map you store the index of the location of data in vector and vector stores data in insertion order. Here access to data has O(log n) complexity. displaying data in insertion order has O(n) complexity. insertion of data has O(log n) complexity.

这是只需要标准模板库而不使用 boost 的多索引的解决方案：
您可以使用std::map<std::string,int>;和vector <data>;在地图中的何处存储向量中数据位置的索引，向量按插入顺序存储数据。这里对数据的访问具有 O(log n) 复杂度。按插入顺序显示数据的复杂度为 O(n)。数据插入的复杂度为 O(log n)。

For Example:

例如：

#include<iostream>
#include<map>
#include<vector>

struct data{
int value;
std::string s;
}

typedef std::map<std::string,int> MapIndex;//this map stores the index of data stored 
                                           //in VectorData mapped to a string              
typedef std::vector<data> VectorData;//stores the data in insertion order

void display_data_according_insertion_order(VectorData vectorData){
    for(std::vector<data>::iterator it=vectorData.begin();it!=vectorData.end();it++){
        std::cout<<it->value<<it->s<<std::endl;
    }
}
int lookup_string(std::string s,MapIndex mapIndex){
    std::MapIndex::iterator pt=mapIndex.find(s)
    if (pt!=mapIndex.end())return it->second;
    else return -1;//it signifies that key does not exist in map
}
int insert_value(data d,mapIndex,vectorData){
    if(mapIndex.find(d.s)==mapIndex.end()){
        mapIndex.insert(std::make_pair(d.s,vectorData.size()));//as the data is to be
                                                               //inserted at back 
                                                               //therefore index is
                                                               //size of vector before
                                                               //insertion
        vectorData.push_back(d);
        return 1;
    }
    else return 0;//it signifies that insertion of data is failed due to the presence
                  //string in the map and map stores unique keys
}

C++ 跟踪插入顺序的 std::map ？

提问by polyglot

回答by Kirill V. Lyadvinsky

回答by Michael Kristofik

回答by bobobobo

回答by aggsol

回答by xtofl

回答by CubicleSoft

回答by Faisal Vali

回答by Polaris878

回答by Tom

回答by Himanshu Pandey

相关推荐

最近更新

标签

C++ 跟踪插入顺序的 std::map ？

提问by polyglot

回答by Kirill V. Lyadvinsky

回答by Michael Kristofik

回答by bobobobo

回答by aggsol

回答by xtofl

回答by CubicleSoft

回答by Faisal Vali

回答by Polaris878

回答by Tom

回答by Himanshu Pandey

相关推荐

C++ 到 system() 或 fork()/exec()？

C++ 保存结构指针的 STL 列表

C++ 使用 openCV 阅读视频

C/C++ 编译器如何工作？

相关推荐

最近更新

标签