C++ 为地图赋值的最有效方法

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/14218042/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-27 18:07:16  来源:igfitidea点击:

Most efficient way to assign values to maps

c++dictionarystdstdmapc++-standard-library

提问by inquam

Which way to assign values to a map is most efficient? Or are they all optimized to the same code (on most modern compilers)?

哪种给地图赋值的方式最有效?或者它们都针对相同的代码进行了优化(在大多数现代编译器上)?

   // 1) Assignment using array index notation
   Foo["Bar"] = 12345;

   // 2) Assignment using member function insert() and STL pair
   Foo.insert(std::pair<string,int>("Bar", 12345));

   // 3) Assignment using member function insert() and "value_type()"
   Foo.insert(map<string,int>::value_type("Bar", 12345));

   // 4) Assignment using member function insert() and "make_pair()"
   Foo.insert(std::make_pair("Bar", 12345));

(I know I could benchmark and check compiler output, but this question arose now and the only thing I have close at hand is my mobile phone... hehe)

(我知道我可以对编译器输出进行基准测试和检查,但是这个问题现在出现了,我手头唯一的东西就是我的手机......呵呵)

回答by Matthieu M.

First, there are semantic differences between []and insert:

首先,[]和之间存在语义差异insert

  • []will replacethe old value (if any)
  • insertwill ignorethe new value (if an old value is already present)
  • []替换旧值(如果有)
  • insert忽略新值(如果旧值已经存在)

therefore comparing the two is useless in general.

因此,比较两者通常是没有用的。

Regarding the inserts versions:

关于插入版本:

  • std::map<std::string, int>::value_typeisstd::pair<std::string const, int>so no (important) difference between 3 and 4
  • std::make_pair("Bar", 12345)is cheaper than std::pair<std::string, int>("Bar", 12345)because the std::stringtype is a full fledged class with operations to do on copy and "Bar"is just a string literal (thus just a pointer copy); however since at the end you do need to create the std::string... it will depend on the quality of your compiler
  • std::map<std::string, int>::value_typestd::pair<std::string const, int>3和4之间,以便无(重要)差
  • std::make_pair("Bar", 12345)std::pair<std::string, int>("Bar", 12345)因为std::string类型是一个完整的类,在复制时执行操作并且"Bar"只是一个字符串文字(因此只是一个指针副本)而便宜;但是因为最后你确实需要创建std::string......这将取决于你的编译器的质量

In general, I would recommend:

一般来说,我会推荐:

  • []for updating
  • insert(std::make_pair())for ignoring duplicates
  • []用于更新
  • insert(std::make_pair())忽略重复项

std::make_pairis not only shorter, it also respects the DRY guideline: Don't Repeat Yourself.

std::make_pair不仅更短,而且还遵守 DRY 准则:不要重复自己。



For completeness though, the fastest (and easiest) would be emplace(C++11 enabled):

不过,为了完整性,最快(也是最简单的)是emplace(启用 C++11):

map.emplace("Bar", 12345);

Its behavior is that of insert, but it constructs the new element in-place.

它的行为是insert,但它就地构造新元素。

回答by Anders Johansson

1) may be slightly slower than the other methods because std::map::operator[]first default-creates the object if it doesn't already exist, then returns a reference that you can use operator=on to set your desired value, i.e. two operations.

1) 可能比其他方法稍慢,因为std::map::operator[]如果对象不存在,首先默认创建对象,然后返回一个引用,您可以使用operator=它来设置所需的值,即两个操作。

2-4) should be equivalent since map::value_typeis a typedef to std::pairfor the same types, and therefore make_pairis also equivalent. The compiler should treat these identically.

2-4) 应该是等价的,因为map::value_typestd::pair相同类型的 typedef ,因此make_pair也是等价的。编译器应该一视同仁地对待这些。

Also note that performance can be increased further if you need to both check for existence (e.g. to perform special logic depending on whether it exists or not) and then also insert it, by using map::lower_boundto first get a hint to where the element should be, so map::insertdoes not have to search the whole mapagain:

另请注意,如果您需要检查是否存在(例如根据它是否存在执行特殊逻辑)然后插入它,则可以进一步提高性能,方法map::lower_bound是首先使用获取元素应该在哪里的提示,所以map::insert不必map再次搜索整个:

 // get the iterator to where the key *should* be if it existed:
 std::map::iterator hint = mymap.lower_bound(key);

 if (hint == mymap.end() || mymap.key_comp()(key, hint->first)) {
     // key didn't exist in map
     // special logic A here...

     // insert at the correct location
     mymap.insert(hint, make_pair(key, new_value));
 } else { 
     // key exists in map already
     // special logic B here...

     // just update value
     hint->second = new_value;
 }

回答by Jerry Coffin

Your first possibility: Foo["Bar"] = 12345;has somewhat different semantics than the others -- it'll insert a new object if none exists (like the others) but overwritethe current contents if none exists (where the others using insertwill fail if that key is already present).

您的第一种可能性:Foo["Bar"] = 12345;与其他语义略有不同——如果不存在,它将插入一个新对象(与其他对象一样),但如果不存在则覆盖当前内容(insert如果该键已经存在,其他使用的将失败) .

As far as speed goes, it has the potential to be slower than the others. When you're inserting a new object, it has create a pair with the specified key and a default-constructed value_type, then assign the correct value_type afterwards. The others all construct the pair the correct key and value and insert that object. Being fair, however, my experience is that the difference is rarely significant (with older compilers, it was more significant, but with newer ones pretty minimal).

就速度而言,它有可能比其他人慢。当您插入一个新对象时,它会创建一个具有指定键和默认构造的 value_type 的对,然后分配正确的 value_type。其他人都构造了正确的键和值对并插入该对象。然而,公平地说,我的经验是差异很少显着(对于较旧的编译器,它更显着,但对于较新的编译器则非常小)。

The next two are equivalent to each other. You're just using two different ways to name the same type. By run-time, there's no difference between them at all.

接下来的两个是等价的。您只是使用两种不同的方式来命名相同的类型。在运行时,它们之间根本没有区别。

The fourth uses a template function (make_pair) that theoretically couldinvolve an extra level of function call. I'd be quite surprised to see a real difference from this though, except (possibly) if you were careful to ensure that the compiler did absolutely nooptimization (especially inlining) at all.

第四个使用模板函数(make_pair),理论上可能涉及额外的函数调用级别。但是,如果您小心地确保编译器完全没有进行优化(尤其是内联),我会很惊讶地看到与此真正的区别。

Bottom line: The first will often be a little slower than the rest (but not always and not by much). The other three will almost always be equal (as in: normally expect any reasonable compiler to produce identical code for all three) even though there's theoretical justification for the fourth being slower.

底线:第一个通常会比其他的慢一点(但并非总是如此,也不会太多)。其他三个几乎总是相等的(例如:通常期望任何合理的编译器为所有三个生成相同的代码),尽管理论上有理由证明第四个更慢。

回答by Yakk - Adam Nevraumont

If there is no object at that key location, then:

如果该关键位置没有对象,则:

std::map::emplaceis most efficient. insertis second (but will be extremely close). []is least efficient.

std::map::emplace效率最高。 insert是第二个(但会非常接近)。 []效率最低。

[], if there is no object there, trivial constructs one. It then calls operator=.

[], 如果那里没有对象,trivial 构造一个。然后调用operator=.

insertdoes a copy constructor call on the std::pairargument.

insertstd::pair参数进行复制构造函数调用。

However, in the case of maps, map.insert( make_pair( std::move(key), std::move(value) ) )is going to be close to map.emplace( std::move(key), std::move(value) ).

但是,在地图的情况下,将map.insert( make_pair( std::move(key), std::move(value) ) )接近map.emplace( std::move(key), std::move(value) ).

If there is an object at the key location, then []will call operator=, while insert/emplacewill destroy the old object and create a new one. []could easily be cheaper in this case.

如果在关键位置有一个对象,[]则将调用operator=,而insert/emplace将销毁旧对象并创建一个新对象。 []在这种情况下很容易便宜。

In the end, it depends on what your operator=vs copy-construct vs trivial-construct vs destructor costs are for your key and value.

最后,这取决于operator=您的键和值的成本与复制构造、琐碎构造与析构函数的成本。

The actual work looking things up within the tree structure of the std::mapwill be so close to identical it isn't funny.

在 的树结构中查找内容的实际工作std::map非常接近相同,这并不有趣。

回答by Edward A

Even though there has been a couple of good answers already I thought I might as well do a quick benchmark. Ran each one 5 million times and used c++11's chrono to measure the time it took.

尽管已经有几个很好的答案,但我认为我不妨做一个快速基准测试。每个运行 500 万次,并使用 c++11 的 chrono 来测量它所花费的时间。

Heres the code:

代码如下:

#include <string>
#include <map>
#include <chrono>
#include <cstdio>

// 5 million
#define times 5000000

int main()
{
    std::map<std::string, int> foo1, foo2, foo3, foo4, foo5;
    std::chrono::steady_clock::time_point timeStart, timeEnd;
    int x = 0;

    // 1) Assignment using array index notation
    timeStart = std::chrono::steady_clock::now();
    for (x = 0; x <= times; x++)
    {
        foo1[std::to_string(x)] = 12345;
    }
    timeEnd = std::chrono::steady_clock::now();
    printf("1) took %i milliseconds\n", (unsigned long long)std::chrono::duration_cast<std::chrono::milliseconds>(timeEnd-timeStart).count());

    // 2) Assignment using member function insert() and STL pair
    timeStart = std::chrono::steady_clock::now();
    for (x = 0; x <= times; x++)
    {
        foo2.insert(std::pair<std::string, int>(std::to_string(x), 12345));
    }
    timeEnd = std::chrono::steady_clock::now();
    printf("2) took %i milliseconds\n", (unsigned long long)std::chrono::duration_cast<std::chrono::milliseconds>(timeEnd-timeStart).count());

    // 3) Assignment using member function insert() and "value_type()"
    timeStart = std::chrono::steady_clock::now();
    for (x = 0; x <= times; x++)
    {
        foo3.insert(std::map<std::string, int>::value_type(std::to_string(x), 12345));
    }
    timeEnd = std::chrono::steady_clock::now();
    printf("3) took %i milliseconds\n", (unsigned long long)std::chrono::duration_cast<std::chrono::milliseconds>(timeEnd-timeStart).count());

    // 4) Assignment using member function insert() and "make_pair()"
    timeStart = std::chrono::steady_clock::now();
    for (x = 0; x <= times; x++)
    {
        foo4.insert(std::make_pair(std::to_string(x), 12345));
    }
    timeEnd = std::chrono::steady_clock::now();
    printf("4) took %i milliseconds\n", (unsigned long long)std::chrono::duration_cast<std::chrono::milliseconds>(timeEnd-timeStart).count());

    // 5) Matthieu M.'s suggestion of C++11's emplace
    timeStart = std::chrono::steady_clock::now();
    for (x = 0; x <= times; x++)
    {
        foo5.emplace(std::to_string(x), 12345);
    }
    timeEnd = std::chrono::steady_clock::now();
    printf("5) took %i milliseconds\n", (unsigned long long)std::chrono::duration_cast<std::chrono::milliseconds>(timeEnd-timeStart).count());

    return 0;
}

The output for 5 million iterations is:

500 万次迭代的输出是:

1) took 23448 milliseconds
2) took 22854 milliseconds
3) took 22372 milliseconds
4) took 22988 milliseconds
5) took 21356 milliseconds

GCC version:

海湾合作委员会版本:

g++ (Built by MinGW-builds project) 4.8.0 20121225 (experimental)

My machine:

我的机器:

Intel i5-3570k overclocked at 4.6 GHz

EDIT1: Fixed the code and redid the benchmark.

EDIT1:修复了代码并重新进行了基准测试。

EDIT2: Added Matthieu M.'s suggestion for C++11's emplace and he is right, emplace is fastest

EDIT2:添加了 Matthieu M. 对 C++11 的 emplace 的建议,他是对的,emplace 是最快的

回答by PaperBirdMaster

The third one is the best choice (IMHO), but 2, 3 and 4 are equal.

第三个是最佳选择(恕我直言),但 2、3 和 4 是相等的。

// 3) Assignment using member function insert() and "value_type()"
Foo.insert(map<string,int>::value_type("Bar", 12345));

Why I think the third one is the best choice: You're performing only oneoperation to insert the value: just inserting (well, there's a search too) and you can know if the value was inserted, checking the secondmember of the return value and the implementation grants to not overwrite the value.

为什么我认为第三个是最好的选择:你执行一个操作来插入值:只是插入(好吧,也有搜索)你可以知道值是否被插入,检查second返回值的成员并且实施授予不覆盖该值。

The use of the value_typehas advantages too: you don't need to know the mapped type or key type so is useful with template programming.

的使用value_type也有优点:您不需要知道映射类型或键类型,因此对模板编程很有用。

The worst (IMHO) is the first one:

最糟糕的(恕我直言)是第一个:

// 1) Assignment using array index notation
Foo["Bar"] = 12345;

You're calling the std::map::operator[]wich creates an object and returns a reference to it, then the mapped object operator =is called. You're doing two operations for the insertion: first the insertion, second the asignation.

您正在调用std::map::operator[]wich 创建一个对象并返回对它的引用,然后operator =调用映射的对象。您正在为插入执行两个操作:首先是插入,其次是指定。

And it have another problem: you don't know if the value has been inserted or overwrited.

它还有另一个问题:您不知道该值是否已被插入或覆盖。