STL Map with a Vector for the Key (C++)
Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): Stack Overflow
Original: http://stackoverflow.com/questions/8903737/
Asked by jthecie
I'm working with some binary data that I have stored in arbitrarily long arrays of unsigned ints. I've found that I have some duplication of data, and am looking to ignore duplicates in the short term and remove whatever bug is causing them in the long term.
I'm looking at inserting each dataset into a map before storing it, but only if it was not found in the map to start with. My initial thought was to have a map of strings and use memcpy as a hammer to force the ints into a character array, and then copy that into a string and store the string. This failed because a good deal of my data contains multiple bytes of 0 (aka NULL) at the front of the relevant data, so a majority of very real data got thrown out.
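The data loss described above is consistent with building the string from a plain char pointer, which stops at the first zero byte. That is an assumption, since the original code is not shown. A minimal sketch of the difference, using the hypothetical helpers packTruncating and packWithLength:

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: copies the raw bytes of the ints into a char buffer and
// builds the string from a C-string pointer. Construction stops at the first
// zero byte, so everything after it is lost.
std::string packTruncating(const std::vector<unsigned int>& data) {
    const std::size_t bytes = data.size() * sizeof(unsigned int);
    std::vector<char> buf(bytes + 1, '\0');  // extra byte guarantees termination
    if (!data.empty()) std::memcpy(buf.data(), data.data(), bytes);
    return std::string(buf.data());
}

// Hypothetical helper: sizes the string up front and copies every byte,
// so zero bytes are kept as ordinary characters.
std::string packWithLength(const std::vector<unsigned int>& data) {
    std::string out(data.size() * sizeof(unsigned int), '\0');
    if (!data.empty()) std::memcpy(&out[0], data.data(), out.size());
    return out;
}
```

With a dataset whose first int is 0, packTruncating returns an empty string, while packWithLength keeps all the bytes, so two datasets that differ only after the leading zeros still compare as different.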
My next attempt is planned to be std::map<std::vector<unsigned char>,int>, but I'm realizing that I don't know if the map insert function will work.
Is this doable, even if ill advised, or is there a better way to approach this problem?
Edit
So it's been remarked that I didn't make clear what I'm doing, so here's a hopefully better description.
I'm working on generating a minimum spanning tree, given that I have a number of trees containing the actual end nodes I'm working with. The goal is to come up with the selection of trees that has the shortest length and that covers all of the end nodes, where the chosen trees share at most one node with each other and are all connected. I'm basing my approach off of a binary decision tree, but making a few changes to hopefully allow for greater parallelism.
Rather than taking the binary tree approach I've opted to make a bit vector out of unsigned integers for each dataset, where a 1 in a bit position indicates the inclusion of the corresponding tree.
For example if just tree 0 were included in a 5 tree dataset I would start with
00001
From here I can generate:
00011
00101
01001
10001
Each of these can then be processed in parallel, since none of them depend on each other. I do this for all of the single trees (00010, 00100, etc.) and should (I haven't taken the time to formally prove it) be able to generate all values in the range (0,2^n) once and only once.
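The asker's generator is not shown, so the following is only a sketch of one scheme that reproduces the 00001 example above: each child sets exactly one extra bit above the highest bit already set. The names children and visitCounts are hypothetical, and a single 32-bit word stands in for the arbitrarily long arrays described in the question.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the children of `mask` in an n-tree dataset: each child sets exactly
// one additional bit above the highest bit already set in `mask`.
// Assumes 0 < n < 32 and mask != 0.
std::vector<std::uint32_t> children(std::uint32_t mask, unsigned n) {
    unsigned highest = 0;
    for (unsigned bit = 0; bit < n; ++bit) {
        if (mask & (std::uint32_t(1) << bit)) highest = bit;
    }
    std::vector<std::uint32_t> out;
    for (unsigned bit = highest + 1; bit < n; ++bit) {
        out.push_back(mask | (std::uint32_t(1) << bit));
    }
    return out;
}

// Expands every single-tree start (00001, 00010, ...) and counts how many
// times each bit pattern is produced. Assumes 0 < n < 32.
std::vector<int> visitCounts(unsigned n) {
    std::vector<int> counts(std::size_t(1) << n, 0);
    std::vector<std::uint32_t> pending;
    for (unsigned bit = 0; bit < n; ++bit) {
        pending.push_back(std::uint32_t(1) << bit);
    }
    while (!pending.empty()) {
        std::uint32_t mask = pending.back();
        pending.pop_back();
        counts[mask] += 1;
        std::vector<std::uint32_t> next = children(mask, n);
        pending.insert(pending.end(), next.begin(), next.end());
    }
    return counts;
}
```

Under this scheme every non-zero pattern is reached along exactly one path, because bits are only ever added in increasing order. Any count above 1 in a real run would point at the kind of duplication the asker is seeing.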
I started to notice that many datasets were taking far longer to complete than I thought they should, and enabled a debugging output to look at all of the generated results, and a quick Perl script later it was confirmed that I had multiple processes generating the same output. Since then I've been trying to resolve where the duplicates are coming from with very little success, and I'm hoping that this will work well enough to let me verify the results that are being generated without the, sometimes, 3 day wait on computations.
Answered by Renan Greinert
You will not have problems with that, as std::vector provides you the "==", "<" and ">" operators:
http://en.cppreference.com/w/cpp/container/vector/operator_cmp
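As a small illustration of the point above: std::map orders its keys with operator<, which for std::vector is a lexicographic comparison that treats zero bytes like any other value. The function name countDistinct and the Key typedef are hypothetical.

```cpp
#include <cstddef>
#include <map>
#include <vector>

typedef std::vector<unsigned char> Key;

// Counts how many distinct byte sequences appear in `datasets` by using each
// sequence as a std::map key. Leading zero bytes take part in the comparison,
// so {0, 0, 1} and {0, 0, 2} are different keys.
std::size_t countDistinct(const std::vector<Key>& datasets) {
    std::map<Key, int> seen;
    for (std::size_t i = 0; i < datasets.size(); ++i) {
        seen[datasets[i]] += 1;  // value is the number of occurrences
    }
    return seen.size();
}
```

A shorter vector that is a prefix of a longer one compares as smaller, so keys of different lengths can share one map.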
Answered by Jon
The requirements for being a key in std::map are satisfied by std::vector, so yes, you can do that. Sounds like a good temporary solution (easy to code, minimum of hassle), but you know what they say: "there is nothing more permanent than the temporary".
Answered by Brian Neal
That should work. As Renan Greinert points out, vector<> meets the requirements to be used as a map key.
You also say:
I'm looking at inserting each dataset into a map before storing it, but only if it was not found in the map to start with.
That's usually not what you want to do, as that would involve doing a find() on the map, and if not found, then doing an insert() operation. Those two operations would essentially have to do a find twice. It is better just to try and insert the items into the map. If the key is already there, the operation will fail by definition. So your code would look like this:
#include <vector>
#include <map>
#include <utility>

// typedefs help a lot to shorten the verbose C++ code
typedef std::map<std::vector<unsigned char>, int> MyMapType;

MyMapType myMap; // the map that records every dataset seen so far
std::vector<unsigned char> v = ...; // initialize this somehow

// insert() does the lookup and the insertion in one step
std::pair<MyMapType::iterator, bool> result = myMap.insert(std::make_pair(v, 42));
if (result.second)
{
    // the insertion worked and result.first points to the newly
    // inserted pair
}
else
{
    // the insertion failed and result.first points to the pair that
    // was already in the map
}
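The insert-only pattern above can be wrapped in a small function. This is a sketch; the names Dataset, SeenMap and recordIfNew are hypothetical and not part of the original answer.

```cpp
#include <map>
#include <utility>
#include <vector>

typedef std::vector<unsigned char> Dataset;
typedef std::map<Dataset, int> SeenMap;

// Hypothetical helper: records `data` in `seen` with the given tag.
// Returns true if the dataset was new, false if it was a duplicate.
// A single insert() call performs both the lookup and the insertion,
// and leaves an existing entry untouched.
bool recordIfNew(SeenMap& seen, const Dataset& data, int tag) {
    std::pair<SeenMap::iterator, bool> result =
        seen.insert(std::make_pair(data, tag));
    return result.second;
}
```

Calling recordIfNew twice with the same dataset returns true and then false, and the tag stored by the first call is kept.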
Answered by ezdazuzena
Why do you need a std::map for that? Maybe I'm missing something, but what about using a std::vector together with the find algorithm, as explained here?
This means that you append your unsigned ints to the vector and later search for them, e.g.
#include <algorithm>
#include <vector>

std::vector<unsigned int> collector; // vector that is substituting your std::map
for (unsigned int i = 0; i < myInts.size(); ++i) { // myInts are the long ints you have
    // std::find returns collector.end() when the value is not stored yet
    if (std::find(collector.begin(), collector.end(), myInts.at(i)) == collector.end()) {
        collector.push_back(myInts.at(i));
    }
}
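If each dataset is itself a vector rather than a single int, which is an assumption based on the question, the same idea could look like the sketch below. The names Dataset and appendIfNew are hypothetical.

```cpp
#include <algorithm>
#include <vector>

typedef std::vector<unsigned int> Dataset;

// Hypothetical helper: appends `data` to `collector` only if an equal dataset
// is not already stored. std::find compares elements with operator==, which
// std::vector provides, at the cost of a linear scan on every call.
bool appendIfNew(std::vector<Dataset>& collector, const Dataset& data) {
    if (std::find(collector.begin(), collector.end(), data) != collector.end()) {
        return false;  // duplicate, nothing appended
    }
    collector.push_back(data);
    return true;
}
```

Each call scans the whole collector, so this is likely to become slow as the number of stored datasets grows, whereas a std::map lookup takes logarithmic time in the number of keys.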