Efficient way to return a std::vector in C++

Question

Asked by Morten

How much data is copied when a function returns a std::vector, and how big an optimization is it to place the std::vector in the free store (on the heap) and return a pointer instead? That is, is:

std::vector<X> *f()
{
  std::vector<X> *result = new std::vector<X>();
  /*
    Insert elements into result
  */
  return result;
}

more efficient than:

std::vector<X> f()
{
  std::vector<X> result;
  /*
    Insert elements into result
  */
  return result;
}

?

Answer 1

Answered by Nawaz

In C++11, this is the preferred way:

std::vector<X> f();

That is, return by value.

With C++11, std::vector has move semantics, which means the local vector declared in your function will be moved on return, and in some cases the compiler can elide even the move.
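
To see that returning by value does not copy the elements, here is a minimal sketch. The Tracked type and the function names are made up for illustration; it assumes a C++11 compiler and the default std::allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type that counts how often it is copied.
struct Tracked {
    static int copies;
    Tracked() {}
    Tracked(const Tracked&) { ++copies; }
    Tracked& operator=(const Tracked&) { ++copies; return *this; }
};
int Tracked::copies = 0;

// Returns a local vector by value, as recommended above.
std::vector<Tracked> make_tracked(std::size_t n) {
    std::vector<Tracked> result(n);  // n default-constructed elements (C++11)
    return result;                   // elided or moved, not copied element by element
}

// Returns the number of element copies caused by returning and assigning.
int copies_caused_by_return() {
    Tracked::copies = 0;
    std::vector<Tracked> a = make_tracked(1000);  // initialization: NRVO or move
    std::vector<Tracked> b;
    b = make_tracked(1000);                       // assignment: move assignment
    assert(a.size() == 1000 && b.size() == 1000);
    return Tracked::copies;
}
```

Calling copies_caused_by_return() should report zero copies, because moving a vector hands over its internal buffer instead of duplicating the elements.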

Answer 2

Answered by Steve Jessop

You should return by value.

The standard has a specific feature to improve the efficiency of returning by value. It's called "copy elision", and more specifically in this case the "named return value optimization (NRVO)".

Compilers don't have to implement it, but then again compilers don't have to implement function inlining (or perform any optimization at all). The standard library can perform quite poorly if compilers don't optimize, though, and all serious compilers implement inlining and NRVO (and other optimizations).

When NRVO is applied, there will be no copying in the following code:

std::vector<int> f() {
    std::vector<int> result;
    // ... populate the vector ...
    return result;
}

std::vector<int> myvec = f();

But the user might want to do this:

std::vector<int> myvec;
// ... some time later ...
myvec = f();

Copy elision does not prevent a copy here because it's an assignment rather than an initialization. However, you should still return by value. In C++11, the assignment is optimized by something different, called "move semantics". In C++03, the above code does cause a copy, and although in theory an optimizer might be able to avoid it, in practice it's too difficult. So instead of myvec = f(), in C++03 you should write this:

std::vector<int> myvec;
// ... some time later ...
f().swap(myvec);
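
As a rough, self-contained sketch of the swap idiom: make_numbers() and refill() are hypothetical names standing in for f() and the calling code. This compiles as both C++03 and C++11.

```cpp
#include <cassert>
#include <vector>

// Hypothetical producer standing in for f() above.
std::vector<int> make_numbers() {
    std::vector<int> result;
    for (int i = 0; i < 5; ++i) result.push_back(i * i);
    return result;
}

// Replaces the contents of an existing vector without copying elements.
void refill(std::vector<int>& myvec) {
    // C++03 idiom: swap the temporary's buffer into myvec. The temporary
    // then owns the old contents and destroys them at the end of the statement.
    make_numbers().swap(myvec);
}
```

Note that the call has to be written as make_numbers().swap(myvec) and not the other way around, because a temporary cannot bind to the non-const reference parameter of swap().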

There is another option, which is to offer a more flexible interface to the user:

template <typename OutputIterator> void f(OutputIterator it) {
    // ... write elements to the iterator like this ...
    *it++ = 0;
    *it++ = 1;
}

You can then also support the existing vector-based interface on top of that:

std::vector<int> f() {
    std::vector<int> result;
    f(std::back_inserter(result));
    return result;
}

This might be less efficient than your existing code if your existing code uses reserve() in a way more complex than just a fixed amount up front. But if your existing code basically calls push_back on the vector repeatedly, then this template-based code ought to be as good.
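
Putting the two pieces together, a minimal sketch could look like the following. The names generate_values() and generate_list() are invented for this example; the point is that one generic producer can feed a vector, a list, or any other output iterator.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Generic producer: writes elements to any output iterator.
template <typename OutputIterator>
void generate_values(OutputIterator it) {
    for (int i = 0; i < 4; ++i) {
        *it++ = i * 10;
    }
}

// Vector-based convenience wrapper built on the generic version.
std::vector<int> generate_values() {
    std::vector<int> result;
    result.reserve(4);  // a fixed up-front reservation is still possible here
    generate_values(std::back_inserter(result));
    return result;
}

// The same producer can fill a different container.
std::list<int> generate_list() {
    std::list<int> result;
    generate_values(std::back_inserter(result));
    return result;
}
```

The wrapper can only reserve an amount it knows in advance, which is the limitation mentioned above: a producer that adjusts reserve() while it runs cannot do so through a plain output iterator.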

Answer 3

Answered by Steve Jessop

Time for me to post an answer about RVO too...

If you return an object by value, the compiler often optimizes this so the object doesn't get constructed twice, since it's superfluous to construct it in the function as a temporary and then copy it. This is called return value optimization: the object is constructed directly in the storage of the caller's variable, so it is neither copied nor moved.

Answer 4

Answered by taocp

If the compiler supports Named Return Value Optimization (http://msdn.microsoft.com/en-us/library/ms364057(v=vs.80).aspx), you can return the vector directly, provided that none of the following applies:

Different paths returning different named objects
Multiple return paths (even if the same named object is returned on all paths) with EH states introduced.
The named object returned is referenced in an inline asm block.

NRVO optimizes out the redundant copy constructor and destructor calls and thus improves overall performance.

There should be no real difference in your example.
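
To make the first restriction concrete, here is a small sketch with made-up function names. Whether NRVO is actually applied is up to the compiler, so the comments describe what is typical, not what is guaranteed.

```cpp
#include <cassert>
#include <vector>

// One named object on every return path: a typical candidate for NRVO.
std::vector<int> single_object(bool big) {
    std::vector<int> result;
    if (big) result.assign(100, 1);
    else     result.assign(10, 1);
    return result;
}

// Different paths return different named objects: NRVO usually cannot be
// applied to both. A C++11 compiler falls back to moving the vector, while
// a C++03 compiler would copy it.
std::vector<int> two_objects(bool big) {
    std::vector<int> small(10, 1);
    std::vector<int> large(100, 1);
    if (big) return large;
    return small;
}
```

Both functions return the same values, so rewriting code in the style of single_object() is only worth it where a profile shows the return actually matters.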

Answer 5

Answered by Drew Dormann

A common pre-C++11 idiom is to pass a reference to the object being filled.

Then there is no copying of the vector.

void f( std::vector<X> & result )
{
  /*
    Insert elements into result
  */
}
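
A minimal sketch of how this idiom might be used, with hypothetical names fill_squares() and sum_of_batches(). One side benefit, assuming the usual implementation behavior that clear() keeps the allocated capacity, is that a buffer can be reused across calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fills a caller-provided vector; hypothetical stand-in for f() above.
void fill_squares(std::vector<int>& result, int n) {
    result.clear();  // drop old elements; the capacity is typically kept
    for (int i = 0; i < n; ++i) result.push_back(i * i);
}

// Calls fill_squares repeatedly with one buffer and sums all elements.
long sum_of_batches(int batches, int n) {
    std::vector<int> buffer;
    long total = 0;
    for (int b = 0; b < batches; ++b) {
        fill_squares(buffer, n);
        for (std::size_t i = 0; i < buffer.size(); ++i) total += buffer[i];
    }
    return total;
}
```

The trade-off is that the caller has to know whether the function overwrites or appends to the vector it is given, which a return value makes obvious.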

Answer 6

Answered by Akash Kandpal

vector<string> getseq(char * db_file)

And if you want to print it in main(), you should do so in a loop.

int main(int argc, char *argv[]) {
     if (argc < 2) return 1;  // a file name argument is required
     vector<string> str_vec = getseq(argv[1]);
     for(vector<string>::iterator it = str_vec.begin(); it != str_vec.end(); ++it) {
         cout << *it << endl;
     }
     return 0;
}

Answer 7

Answered by Amruth A

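   // The trailing const is only valid on a member function,
   // so func1() is assumed to belong to a class.
   vector<string> func1() const
   {
      vector<string> parts;
      // Returning "parts" directly would avoid this extra element-by-element copy.
      return vector<string>(parts.begin(),parts.end()) ;
   }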

Answer 8

Answered by unclesmrgol dragon

As nice as "return by value" might be, it's the kind of code that can lead one into error. Consider the following program:

    #include <string>
    #include <vector>
    #include <iostream>
    using namespace std;
    static std::vector<std::string> strings;
    std::vector<std::string> vecFunc(void) { return strings; };
    int main(int argc, char * argv[]){
      // set up the vector of strings to hold however
      // many strings the user provides on the command line
      for(int idx=1; (idx<argc); ++idx){
         strings.push_back(argv[idx]);
      }

      // now, iterate the strings and print them using the vector function
      // as accessor
      for(std::vector<std::string>::iterator idx=vecFunc().begin(); (idx!=vecFunc().end()); ++idx){
         cout << "Addr: " << idx->c_str() << std::endl;
         cout << "Val:  " << *idx << std::endl;
      }
    return 0;
    };

Q: What will happen when the above is executed? A: Undefined behavior, typically a core dump.
Q: Why didn't the compiler catch the mistake? A: Because the program is syntactically, although not semantically, correct.
Q: What happens if you modify vecFunc() to return a reference? A: The program runs to completion and produces the expected result.
Q: What is the difference? A: Each call to vecFunc() that returns by value creates a separate temporary vector, so the begin() and end() iterators belong to two different objects, and the first temporary is destroyed before the loop body runs. When returning a reference, the compiler does not have to create and manage anonymous objects: the programmer has instructed it to use exactly one object for the iterator and for endpoint determination.

The erroneous program above compiles without any diagnostics even if one uses the GNU g++ reporting options -Wall -Wextra -Weffc++.

If you must produce a value, then the following would work in place of calling vecFunc() twice:

   std::vector<std::string> lclvec(vecFunc());
   for(std::vector<std::string>::iterator idx=lclvec.begin(); (idx!=lclvec.end()); ++idx)...

The above also produces no anonymous objects during iteration of the loop, but it requires a possible copy operation (which, as some note, might be optimized away under some circumstances). The reference method, by contrast, guarantees that no copy will be produced. Believing the compiler will perform RVO is no substitute for trying to build the most efficient code you can. If you can remove the need for the compiler to do RVO, you are ahead of the game.
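
For comparison, here is a small sketch of the two safe variants side by side. The function names are invented for this example, and the loops add up string lengths instead of printing so that the result is easy to check.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

static std::vector<std::string> strings;

// Returns a copy: every call yields a new temporary vector.
std::vector<std::string> by_value() { return strings; }

// Returns a reference to the one long-lived vector.
const std::vector<std::string>& by_reference() { return strings; }

// Safe with return by value: keep one copy alive for the whole loop.
std::size_t total_length_by_value() {
    std::vector<std::string> local(by_value());
    std::size_t total = 0;
    for (std::vector<std::string>::const_iterator it = local.begin();
         it != local.end(); ++it)
        total += it->size();
    return total;
}

// Safe with return by reference: both calls refer to the same object.
std::size_t total_length_by_reference() {
    std::size_t total = 0;
    for (std::vector<std::string>::const_iterator it = by_reference().begin();
         it != by_reference().end(); ++it)
        total += it->size();
    return total;
}
```

Either way, the rule is the same: begin() and end() must come from one object that outlives the loop.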
