C++ 双向地图是否有更有效的实现?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/21760343/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-27 23:48:24  来源:igfitidea点击:

Is there a more efficient implementation for a bidirectional map?

c++c++11data-structuresmapbimap

提问by Vittorio Romeo

I created a simple bidirectional mapclass that works by internally storing two std::mapinstances, with opposite key/value types, and providing an user-friendly interface:

我创建了一个简单的双向映射类,它通过内部存储两个std::map具有相反键/值类型的实例来工作,并提供一个用户友好的界面:

template<class T1, class T2> class Bimap
{
    std::map<T1, T2> map1;
    std::map<T2, T1> map2;
    // ...
};
  • Is there a more efficient method of implementing a bidirectional map that doesn't require twice the memory?

  • How is a bimap usually implemented?

  • 有没有更有效的方法来实现不需要两倍内存的双向映射?

  • bimap 通常是如何实现的?



EDIT:

编辑:

  • Should bimap element be mutable or immutable? (Changing one element in map1should change the key in map2, but keys are const and that's impossible - what's the solution?)

  • Ownership of elements is also another problem: when an user inserts a key-value pair in the bimap, the bimap should make a copy of that key-value pair and store it, then the internal second map (with inverted key/value) should not copy but point to the original pair. How can this be achieved?

  • bimap 元素应该是可变的还是不可变的?(更改一个元素 inmap1应该更改 中的键map2,但键是常量,这是不可能的 - 解决方案是什么?)

  • 元素的所有权也是另一个问题:当用户在 bimap 中插入一个键值对时,bimap 应该制作该键值对的副本并存储它,然后内部的第二个映射(带有反转的键/值)应该不是复制而是指向原始对。如何做到这一点?



EDIT 2:

编辑2:

I've posted a possible implementation I made on Code Review.

我已经发布了我在 Code Review 上所做的一个可能的实现。

采纳答案by example

There is a certain problem with double-storing your data in all simple implementations of a bimap. If you can break it down to a bimap of pointers from outside, then you can readily ignore this and simply keep both maps of the form std::map<A*,B*>like Arkaitz Jimenez already suggested (though contrary to his answer you have to care about the storage from outside to avoid a A->A*lookup). But if you have the pointers anyway, why not simply store a std::pair<A,B>at the point where you would otherwise store Aand Bseparately?

在双映射的所有简单实现中双重存储数据存在一定的问题。如果您可以将其分解为来自外部的指针双映射,那么您可以很容易地忽略这一点,只需保留std::map<A*,B*>像 Arkaitz Jimenez 已经建议的形式的两个映射(尽管与他的回答相反,您必须关心外部的存储以避免一个A->A*查找)。但如果你有指针,无论如何,为什么不直接存储std::pair<A,B>在您原本存放点A,并B分别?

It would be nice to have std::map<A,B*>instead of std::map<A*,B*>as this would allow for example the lookup of an element associated to an string by a newly created string with the same content instead of the pointer to the original string that created the pair. But it is customary to store a full copy of the key with every entry and only rely on the hash to find the right bucket. This way the returned item will be the correct one even in the case of a hash-collision...

最好有std::map<A,B*>而不是std::map<A*,B*>因为这将允许例如通过新创建的具有相同内容的字符串而不是指向创建该对的原始字符串的指针来查找与字符串关联的元素。但是习惯上为每个条目存储密钥的完整副本,并且仅依靠哈希来找到正确的存储桶。这样,即使在散列冲突的情况下,返回的项目也将是正确的项目......

If you want to have it quick and dirty though, there is this

如果你想让它又快又脏,有这个

hackish solution:

Create two maps std::map<size_t, A> mapAand std::map<size_t, B> mapB. Upon insertion hash both elements that are to be inserted to get the keys to the respective maps.

void insert(const A &a, const B &b) {
    size_t hashA = std::hash<A>(a);
    size_t hashB = std::hash<B>(b);

    mapA.insert({hashB, a});
    mapB.insert({hashA, b});
}

Lookup is implemented analogously.

黑客解决方案:

创建两个地图std::map<size_t, A> mapAstd::map<size_t, B> mapB. 插入后散列要插入的两个元素以获取相应映射的键。

void insert(const A &a, const B &b) {
    size_t hashA = std::hash<A>(a);
    size_t hashB = std::hash<B>(b);

    mapA.insert({hashB, a});
    mapB.insert({hashA, b});
}

查找的实现方式类似。

Using a multimapinstead of a mapand verifying every element you get with a lookup in the respectively other map (get candidate bfrom mapA, hash band look in mapBif it matches the wanted key, iterate to the next candidate b otherwise) this is a valid implementation - but still hackish in my opinion...

使用 amultimap而不是 amap并通过在相应的其他映射中查找来验证您获得的每个元素(b从 中获取候选mapA,散列b并查看mapB它是否与所需的键匹配,否则迭代到下一个候选 b)这是一个有效的实现 - 但是在我看来仍然是黑客......

You can get a much nicer solution by using the copies of the elements that are used to compare the entries (see above) as only storage. It is a bit harder to get your head around that though. To elaborate:

通过使用用于比较条目(见上文)的元素副本作为唯一存储,您可以获得更好的解决方案。但是,要解决这个问题有点困难。详细说明:

a nicer solution:

Create two sets of pairs as std::set<pair<A, B*>>and std::set<pair<B, A*>>and overload the operator<and operator==to only take the first element of the pairs into account (or provide an corresponding comparion class). It is necessary to create sets of pairs instead of maps (which internally look similarly) because we need a guarantee that Aand Bwill be at constant positions in memory. Upon insertion of an pair<A, B>we split it into two elements that fit into the above sets.

一个更好的解决方案:

创建两组对作为std::set<pair<A, B*>>andstd::set<pair<B, A*>>并重载operator<andoperator==以仅考虑对的第一个元素(或提供相应的比较类)。有必要创建成对集而不是映射(内部看起来类似),因为我们需要保证AB将在内存中处于恒定位置。插入 an 后,pair<A, B>我们将其拆分为适合上述集合的两个元素。

  std::set<pair<B, A*>> mapA;
  std::set<pair<A, B*>> mapB;

  void insert(const A &a, const B &b) {
      auto aitr = mapA.insert({b, nullptr}).first; // creates first pair
      B *bp = &(aitr->first);  // get pointer of our stored copy of b
      auto bitr = mapB.insert({a, bp}).first; 
      // insert second pair {a, pointer_to_b}
      A *ap = &(bitr->first);  // update pointer in mapA to point to a
      aitr->second = ap;
  }

Lookup can now simply be done by a simple std::setlookup and a pointer dereference.

现在可以通过简单的std::set查找和指针取消引用来简单地完成查找。

This nicer solution is similar to the solution that boost uses - even though they use some annonymized pointers as second elements of the pairs and thus have to use reinterpret_casts.

这个更好的解决方案类似于 boost 使用的解决方案 - 即使它们使用一些匿名指针作为对的第二个元素,因此必须使用reinterpret_casts。

Note that the .secondpart of the pairs need to be mutable (so I'm not sure std::paircan be used), or you have to add another layer of abstraction (std::set<pair<B, A**>> mapA) even for this simple insertion. In both solutions you need temporary elements to return non-const references to elements.

请注意,对的.second部分需要是可变的(所以我不确定是否std::pair可以使用),或者std::set<pair<B, A**>> mapA即使对于这个简单的插入,您也必须添加另一个抽象层 ( )。在这两种解决方案中,您都需要临时元素来返回对元素的非常量引用。

回答by Arkaitz Jimenez

It would be more efficient to store all elements in a vector and have 2 maps of <T1*,T2*>and <T2*,T1*>that way you would not have everything copied twice.

这将是更有效地存储在一个向量的所有元素,并有2张地图<T1*,T2*><T2*,T1*>你就不会一切复制两次的方式。

The way I see it you are trying to store 2 things, elements themselves and the relationship between them, if you are aiming to scalar types you could leave it as is 2 maps, but if you aim to treat complex types it makes more sense to separate the storage from the relationships, and handle relationships outside the storage.

在我看来,您正在尝试存储 2 件事,即元素本身以及它们之间的关系,如果您的目标是标量类型,则可以将其保留为 2 个映射,但是如果您的目标是处理复杂类型,则更有意义将存储与关系分开,处理存储之外的关系。

回答by jrok

Boost Bimapmakes use of Boost Mutant Idiom.

Boost Bimap使用了Boost Mutant Idiom

From the linked wikipedia page:

从链接的维基百科页面:

Boost mutant idiom makes use of reinterpret_cast and depends heavily on assumption that the memory layouts of two different structures with identical data members (types and order) are interchangeable. Although the C++ standard does not guarantee this property, virtually all the compilers satisfy it.

Boost 突变习语使用 reinterpret_cast 并在很大程度上依赖于具有相同数据成员(类型和顺序)的两个不同结构的内存布局是可互换的假设。尽管 C++ 标准不保证这个特性,但几乎所有的编译器都满足它。

template <class Pair>
struct Reverse
{
    typedef typename Pair::first_type  second_type;
    typedef typename Pair::second_type first_type;
    second_type second;
    first_type first;
};

template <class Pair>
Reverse<Pair> & mutate(Pair & p)
{
  return reinterpret_cast<Reverse<Pair> &>(p);
}

int main(void)
{
  std::pair<double, int> p(1.34, 5);

  std::cout << "p.first = " << p.first << ", p.second = "  << p.second << std::endl;
  std::cout << "mutate(p).first = " << mutate(p).first << ", mutate(p).second = "  << mutate(p).second << std::endl;
}

The implementation in boost sources is of course fairly hairier.

在 boost 源中的实现当然是相当麻烦的。

回答by Nikos Athanasiou

If you create a set of pairs to your types std::set<std::pair<X,Y>>you pretty much have your functionallity implemented and rules about mutabillity and constness preset(OK maybe the settings aren't what you want but tweaks can be made). So here is the code :

如果你为你的类型创建一组对,std::set<std::pair<X,Y>>你几乎已经实现了你的功能和关于可变性和常量性的规则预设(好吧,也许设置不是你想要的,但可以进行调整)。所以这里是代码:

#ifndef MYBIMAP_HPP
#define MYBIMAP_HPP

#include <set>
#include <utility>
#include <algorithm>

using std::make_pair;

template<typename X, typename Y, typename Xless = std::less<X>, 
                     typename Yless = std::less<Y>>
class bimap
{
    typedef std::pair<X, Y>                             key_type;
    typedef std::pair<X, Y>                             value_type;
    typedef typename std::set<key_type>::iterator       iterator;
    typedef typename std::set<key_type>::const_iterator const_iterator;

    struct Xcomp
    {
        bool operator()(X const &x1, X const &x2)
        {
            return !Xless()(x1, x2) && !Xless()(x2, x1);
        }
    };
    struct Ycomp
    {
        bool operator()(Y const &y1, Y const &y2)
        {
            return !Yless()(y1, y2) && !Yless()(y2, y1);
        }
    };
    struct Fless 
    { // prevents lexicographical comparison for std::pair, so that 
        // every .first value is unique as if it was in its own map
        bool operator()(key_type const &lhs, key_type const &rhs)
        {
            return Xless()(lhs.first, rhs.first);
        }
    };
    /// key and value type are interchangeable
    std::set<std::pair<X, Y>, Fless> _data;

public:
    std::pair<iterator, bool> insert(X const &x, Y const &y)
    {
        auto it = find_right(y);
        if (it == end()) { // every .second value is unique
            return _data.insert(make_pair(x, y));
        }
        return make_pair(it, false);
    }
    iterator find_left(X const &val)
    {
        return _data.find(make_pair(val,Y()));
    }
    iterator find_right(Y const &val)
    {
        return std::find_if(_data.begin(), _data.end(), 
            [&val](key_type const &kt)
        {
            return Ycomp()(kt.second, val);
        });
    }
    iterator end()   { return _data.end();   }
    iterator begin() { return _data.begin(); }
};

#endif

Example usage

示例用法

template<typename X, typename Y, typename In>
void PrintBimapInsertion(X const &x, Y const &y, In const &in)
{
    if (in.second) {
        std::cout << "Inserted element (" 
              << in.first->first << ", " << in.first->second << ")\n";
    }
    else {
        std::cout << "Could not insert (" << x << ", " << y 
                      << ") because (" <<  in.first->first << ", " 
                      << in.first->second << ") already exists\n";
    }
}


int _tmain(int argc, _TCHAR* argv[])
{
    bimap<std::string, int> mb;
    PrintBimapInsertion("A", 1, mb.insert("A", 1) );
    PrintBimapInsertion("A", 2, mb.insert("A", 2) );
    PrintBimapInsertion("b", 2, mb.insert("b", 2));
    PrintBimapInsertion("z", 2, mb.insert("z", 2));

    auto it1 = mb.find_left("A");
    if (it1 != mb.end()) {
        std::cout << std::endl << it1->first << ", " 
                      << it1->second << std::endl;
    }

    auto it2 = mb.find_right(2);
    if (it2 != mb.end()) {
        std::cout << std::endl << it2->first << ", " 
                      << it2->second << std::endl;
    }

    return 0;
}

NOTE: All this is a rough code sketching of what a full implementation would be and even after polishing and extending the code I'm not implying that this would be an alternative to boost::bimapbut merely a homemadeway of having an associative container searchable by both the value and the key.

注意:所有这些都是完整实现的粗略代码草图,即使在完善和扩展代码之后,我也不是在暗示这将是一种替代方法,boost::bimap而只是一种自制的方式,让关联容器可以被两者搜索值和键。

Live example

活生生的例子

回答by divkakwani

One possible implementation that uses an intermediate data structure and an indirection is:

使用中间数据结构和间接的一种可能的实现是:

int sz;  // total elements in the bimap

std::map<A, int> mapA;
std::map<B, int> mapB;

typedef typename std::map<A, int>::iterator iterA;
typedef typename std::map<B, int>::iterator iterB;

std::vector<pair<iterA, iterB>> register;  
// All the operations on bimap are indirected through it.

Insertion

插入

Suppose you have to insert (X, Y) where X, Y are instances of A and B respectively, then:

假设您必须插入 (X, Y) 其中 X, Y 分别是 A 和 B 的实例,则:

  1. Insert (X, sz) in mapA--- O(lg sz)
  2. Insert (Y, sz) in mapB--- O(lg sz)
  3. Then push_back(IterX, IterY) in register--- O(1). Here IterX and IterY are iterators to the corresponding element in mapAand mapBand are obtained from (1) and (2) respectively.
  1. mapA--- O(lg sz) 中插入 (X, sz)
  2. mapB--- O(lg sz) 中插入 (Y, sz)
  3. 然后push_back(IterX, IterY) 在register--- O(1) 中。这里IterX和IterY是迭代器到相应的元件中mapAmapB和分别从(1)和(2)获得的。

Lookup

抬头

Lookup for the image of an element, X, of type A:

查找类型 A 的元素 X 的图像:

  1. Get the int mapped to X in mapA. --- O(lg n)
  2. Use the int to index into registerand get corresponding IterY. --- O(1)
  3. Once you have IterY, you can get Y through IterY->first. --- O(1)
  1. 将 int 映射到 X 中mapA。--- O(lg n)
  2. 使用 int 索引register并获取相应的 IterY。--- O(1)
  3. 一旦你有IterY,您可以通过领Y元IterY->first。--- O(1)

So both the operations are implemented in O(lg n).

所以这两个操作都是在 O(lg n) 中实现的。

Space: All the copies of the objects of A and B are required to be stored only once. There is, however, a lot of bookkeeping stuff. But when you have large objects, that would also be not much significant.

空间:A 和 B 的对象的所有副本只需要存储一次。然而,有很多簿记的东西。但是当你有大对象时,这也没有多大意义。

Note: This implementation relies on the fact that the iterators of a map are never invalidated. Hence, the contents of registerare always valid.

注意:此实现依赖于映射的迭代器永远不会失效的事实。因此, 的内容register始终有效。

A more elaborate version of this implementation can be found here

可以在此处找到此实现的更详细版本

回答by Arun

How about this?

这个怎么样?

Here, we avoid double storage of one type (T1). The other type (T2) is still double stored.

在这里,我们避免了一种类型(T1)的双重存储。另一种类型(T2)仍然是双重存储的。

// Assume T1 is relatively heavier (e.g. string) than T2 (e.g. int family).
// If not, client should instantiate this the other way.
template <typename T1, typename T2>
class UnorderedBimap {

  typedef std::unordered_map<T1, T2> Map1;
  Map1 map1_;

  std::unordered_map<T2, Map1::iterator> map2_;
};