C++ Tuple vs Struct

Question

Asked by Alex Koay

Is there any difference between using a std::tuple and a data-only struct?

typedef std::tuple<int, double, bool> foo_t;

struct bar_t {
    int id;
    double value;
    bool dirty;
};

From what I have found online, there are two major differences: the struct is more readable, while the tuple has many generic functions that can be used. Should there be any significant performance difference? Also, is the data layout compatible with each other (interchangeably casted)?
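
To make the readability trade-off concrete, here is a minimal sketch using the two types from the question. The overloaded helper value_of is a hypothetical name introduced only for illustration.

```cpp
#include <tuple>

typedef std::tuple<int, double, bool> foo_t;

struct bar_t {
    int id;
    double value;
    bool dirty;
};

// Named access: the member name documents what is being read.
inline double value_of(const bar_t &b) { return b.value; }

// Positional access: the reader has to know that index 1 is the value.
inline double value_of(const foo_t &f) { return std::get<1>(f); }

// Generic facilities such as tuple_size and the comparison operators
// come with the tuple without any extra code.
static_assert(std::tuple_size<foo_t>::value == 3, "foo_t has three elements");
```

Calling value_of on either type returns the same number; the difference is only in how much the call site tells the reader.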

Answer 1

Answered by hungptit

We had a similar discussion about tuple and struct, and I wrote some simple benchmarks with the help of one of my colleagues to identify the performance differences between the two. We first start with a default struct and a tuple.

struct StructData {
    int X;
    int Y;
    double Cost;
    std::string Label;

    bool operator==(const StructData &rhs) {
        return std::tie(X,Y,Cost, Label) == std::tie(rhs.X, rhs.Y, rhs.Cost, rhs.Label);
    }

    bool operator<(const StructData &rhs) {
        return X < rhs.X || (X == rhs.X && (Y < rhs.Y || (Y == rhs.Y && (Cost < rhs.Cost || (Cost == rhs.Cost && Label < rhs.Label)))));
    }
};

using TupleData = std::tuple<int, int, double, std::string>;

We then use Celero to compare the performance of our simple struct and tuple. Below are the benchmark code and the performance results collected using gcc-4.9.2 and clang-4.0.0:

std::vector<StructData> test_struct_data(const size_t N) {
    std::vector<StructData> data(N);
    std::transform(data.begin(), data.end(), data.begin(), [N](auto item) {
        std::random_device rd;
        std::mt19937 gen(rd());
        std::uniform_int_distribution<> dis(0, N);
        item.X = dis(gen);
        item.Y = dis(gen);
        item.Cost = item.X * item.Y;
        item.Label = std::to_string(item.Cost);
        return item;
    });
    return data;
}

std::vector<TupleData> test_tuple_data(const std::vector<StructData> &input) {
    std::vector<TupleData> data(input.size());
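    // make_tuple copies the fields; std::tie on a by-value parameter would return references to it.
    std::transform(input.cbegin(), input.cend(), data.begin(),
                   [](const auto &item) { return std::make_tuple(item.X, item.Y, item.Cost, item.Label); });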
    return data;
}

constexpr int NumberOfSamples = 10;
constexpr int NumberOfIterations = 5;
constexpr size_t N = 1000000;
auto const sdata = test_struct_data(N);
auto const tdata = test_tuple_data(sdata);

CELERO_MAIN

BASELINE(Sort, struct, NumberOfSamples, NumberOfIterations) {
    std::vector<StructData> data(sdata.begin(), sdata.end());
    std::sort(data.begin(), data.end());
    // print(data);

}

BENCHMARK(Sort, tuple, NumberOfSamples, NumberOfIterations) {
    std::vector<TupleData> data(tdata.begin(), tdata.end());
    std::sort(data.begin(), data.end());
    // print(data);
}

Performance results collected with clang-4.0.0

Celero
Timer resolution: 0.001000 us
-----------------------------------------------------------------------------------------------------------------------------------------------
     Group      |   Experiment    |   Prob. Space   |     Samples     |   Iterations    |    Baseline     |  us/Iteration   | Iterations/sec  | 
-----------------------------------------------------------------------------------------------------------------------------------------------
Sort            | struct          | Null            |              10 |               5 |         1.00000 |    196663.40000 |            5.08 | 
Sort            | tuple           | Null            |              10 |               5 |         0.92471 |    181857.20000 |            5.50 | 
Complete.

And performance results collected using gcc-4.9.2

Celero
Timer resolution: 0.001000 us
-----------------------------------------------------------------------------------------------------------------------------------------------
     Group      |   Experiment    |   Prob. Space   |     Samples     |   Iterations    |    Baseline     |  us/Iteration   | Iterations/sec  | 
-----------------------------------------------------------------------------------------------------------------------------------------------
Sort            | struct          | Null            |              10 |               5 |         1.00000 |    219096.00000 |            4.56 | 
Sort            | tuple           | Null            |              10 |               5 |         0.91463 |    200391.80000 |            4.99 | 
Complete.

From the above results we can see that:

- The tuple is faster than the default struct (by roughly 8% with both compilers).
- The binary produced by clang performs better than the one produced by gcc. clang vs. gcc is not the purpose of this discussion, so I won't dive into the details.

We all know that writing an ==, < or > operator for every single struct definition is a painful and error-prone task. Let's replace our custom comparator with one that uses std::tie and rerun our benchmark.

bool operator<(const StructData &rhs) {
    return std::tie(X,Y,Cost, Label) < std::tie(rhs.X, rhs.Y, rhs.Cost, rhs.Label);
}

Celero
Timer resolution: 0.001000 us
-----------------------------------------------------------------------------------------------------------------------------------------------
     Group      |   Experiment    |   Prob. Space   |     Samples     |   Iterations    |    Baseline     |  us/Iteration   | Iterations/sec  | 
-----------------------------------------------------------------------------------------------------------------------------------------------
Sort            | struct          | Null            |              10 |               5 |         1.00000 |    200508.20000 |            4.99 | 
Sort            | tuple           | Null            |              10 |               5 |         0.90033 |    180523.80000 |            5.54 | 
Complete.

Now we can see that using std::tie makes our code more elegant and makes it harder to make mistakes; however, we lose about 1% performance. I will stay with the std::tie solution for now, since I also received a warning about comparing floating-point numbers with the customized comparator.
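
As a rough illustration of why the two comparators can stand in for each other, the sketch below puts the hand-written chain and the std::tie version side by side as free functions. The names Point, less_by_hand and less_by_tie are hypothetical, and the two only agree as long as no Cost is NaN.

```cpp
#include <string>
#include <tuple>

// Hypothetical stand-in for StructData, without member operators.
struct Point {
    int X;
    int Y;
    double Cost;
    std::string Label;
};

// Hand-written lexicographic comparison, same shape as the original operator<.
inline bool less_by_hand(const Point &a, const Point &b) {
    return a.X < b.X ||
           (a.X == b.X && (a.Y < b.Y ||
           (a.Y == b.Y && (a.Cost < b.Cost ||
           (a.Cost == b.Cost && a.Label < b.Label)))));
}

// std::tie builds tuples of references, so no field is copied;
// the tuple's operator< then does the lexicographic comparison.
inline bool less_by_tie(const Point &a, const Point &b) {
    return std::tie(a.X, a.Y, a.Cost, a.Label) <
           std::tie(b.X, b.Y, b.Cost, b.Label);
}
```

The std::tie form has one expression per side, so adding a field means touching two argument lists instead of rewriting a nested condition.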

So far we have no solution to make our struct code run faster. Let's take a look at the swap function and rewrite it to see if we can gain any performance:

struct StructData {
    int X;
    int Y;
    double Cost;
    std::string Label;

    bool operator==(const StructData &rhs) {
        return std::tie(X,Y,Cost, Label) == std::tie(rhs.X, rhs.Y, rhs.Cost, rhs.Label);
    }

    void swap(StructData & other)
    {
        std::swap(X, other.X);
        std::swap(Y, other.Y);
        std::swap(Cost, other.Cost);
        std::swap(Label, other.Label);
    }  

    bool operator<(const StructData &rhs) {
        return std::tie(X,Y,Cost, Label) < std::tie(rhs.X, rhs.Y, rhs.Cost, rhs.Label);
    }
};

Performance results collected using clang-4.0.0

Celero
Timer resolution: 0.001000 us
-----------------------------------------------------------------------------------------------------------------------------------------------
     Group      |   Experiment    |   Prob. Space   |     Samples     |   Iterations    |    Baseline     |  us/Iteration   | Iterations/sec  | 
-----------------------------------------------------------------------------------------------------------------------------------------------
Sort            | struct          | Null            |              10 |               5 |         1.00000 |    176308.80000 |            5.67 | 
Sort            | tuple           | Null            |              10 |               5 |         1.02699 |    181067.60000 |            5.52 | 
Complete.

And the performance results collected using gcc-4.9.2

Celero
Timer resolution: 0.001000 us
-----------------------------------------------------------------------------------------------------------------------------------------------
     Group      |   Experiment    |   Prob. Space   |     Samples     |   Iterations    |    Baseline     |  us/Iteration   | Iterations/sec  | 
-----------------------------------------------------------------------------------------------------------------------------------------------
Sort            | struct          | Null            |              10 |               5 |         1.00000 |    198844.80000 |            5.03 | 
Sort            | tuple           | Null            |              10 |               5 |         1.00601 |    200039.80000 |            5.00 | 
Complete.

Now our struct is slightly faster than the tuple (around 3% with clang and less than 1% with gcc); however, we need to write a customized swap function for every one of our structs.
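
One thing worth checking, as far as I can tell: generic code that swaps elements calls swap unqualified, so it can find a non-member swap through argument-dependent lookup but not a member function on its own. A minimal sketch of exposing a member swap that way follows. The namespace demo and the type Record are hypothetical stand-ins for StructData, and whether a given std::sort implementation ends up calling it is an assumption to verify with a profiler.

```cpp
#include <string>
#include <utility>

namespace demo {  // hypothetical namespace

struct Record {
    int X;
    int Y;
    double Cost;
    std::string Label;

    // Member-wise swap, same idea as in the struct above.
    void swap(Record &other) noexcept {
        using std::swap;
        swap(X, other.X);
        swap(Y, other.Y);
        swap(Cost, other.Cost);
        swap(Label, other.Label);
    }
};

// Non-member overload in the same namespace as Record, so an
// unqualified swap(a, b) finds it via argument-dependent lookup.
inline void swap(Record &a, Record &b) noexcept { a.swap(b); }

}  // namespace demo
```

Callers then write `using std::swap; swap(a, b);` and get the custom version for Record and std::swap for everything else.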

Answer 2

Answered by wheaties

If you're using several different tuples in your code, you can get away with condensing the number of functors you are using. I say this because I've often used the following form of functor:

template<int N>
struct tuple_less{
    template<typename Tuple>
    bool operator()(const Tuple& aLeft, const Tuple& aRight) const{
        typedef typename boost::tuples::element<N, Tuple>::type value_type;
        BOOST_CONCEPT_REQUIRES((boost::LessThanComparable<value_type>));

        return boost::tuples::get<N>(aLeft) < boost::tuples::get<N>(aRight);
    }
};

This might seem like overkill, but with a struct I'd have to write a whole new functor for each member I want to compare, whereas for a tuple I just change N. Better than that, the same functor works for every single tuple, as opposed to creating a whole new functor for each struct and for each member variable. If I have N structs with M member variables each, that's NxM functors I would need to create (worst-case scenario), and they can be condensed down to one little bit of code.

Naturally, if you're going to go the tuple way, you're also going to need to create enums for working with them:

typedef boost::tuples::tuple<double,double,double> HymanPot;
enum HymanPotIndex{
    MAX_POT,
    CURRENT_POT,
    MIN_POT
};

and boom, your code is completely readable:

double guessWhatThisIs = boost::tuples::get<CURRENT_POT>(someHymanPotTuple);

because it describes itself when you want to get the items contained within it.
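
The same idea carries over to std::tuple without Boost. The sketch below is an adaptation, not the code above: it swaps boost::tuples::get for std::get and drops the concept check, and sort_by_current is a hypothetical helper added to show the functor in use.

```cpp
#include <algorithm>
#include <cstddef>
#include <tuple>
#include <vector>

// Compares two tuples of the same type by their N-th element only.
template <std::size_t N>
struct tuple_less {
    template <typename Tuple>
    bool operator()(const Tuple &lhs, const Tuple &rhs) const {
        return std::get<N>(lhs) < std::get<N>(rhs);
    }
};

typedef std::tuple<double, double, double> HymanPot;
enum HymanPotIndex { MAX_POT, CURRENT_POT, MIN_POT };

// Hypothetical helper: orders the pots by their CURRENT_POT field.
inline void sort_by_current(std::vector<HymanPot> &pots) {
    std::sort(pots.begin(), pots.end(), tuple_less<CURRENT_POT>());
}
```

Sorting by a different field only means passing a different enumerator, for example tuple_less<MIN_POT>().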

Answer 3

Answered by NoSenseEtAl

Tuple has built-in default comparators (== and != compare every element; <, <=, ... compare the first elements and, if they are the same, compare the second, and so on): http://en.cppreference.com/w/cpp/utility/tuple/operator_cmp
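
A small sketch of those rules in action. The alias Key and the three helper functions are hypothetical names; each function returns true.

```cpp
#include <string>
#include <tuple>

typedef std::tuple<int, int, std::string> Key;  // hypothetical alias

// The first element decides as soon as it differs: 1 < 2.
inline bool first_element_decides() {
    return Key(1, 9, "z") < Key(2, 0, "a");
}

// Equal first elements fall through to the second: 2 < 3.
inline bool tie_broken_by_second() {
    return Key(1, 2, "z") < Key(1, 3, "a");
}

// Equality looks at every element, so one differing field is enough for !=.
inline bool equality_checks_all() {
    return Key(1, 2, "a") == Key(1, 2, "a") && Key(1, 2, "a") != Key(1, 2, "b");
}
```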

Answer 4

Answered by Khatharr

Well, here's a benchmark that doesn't construct a bunch of tuples inside the struct operator==(). Turns out there's a pretty significant performance impact from using tuple, as one would expect given that there's no performance impact at all from using PODs. (The address resolver finds the value in the instruction pipeline before the logic unit ever even sees it.)

Common results from running this on my machine with VS2015CE using the default 'Release' settings:

Structs took 0.0814905 seconds.
Tuples took 0.282463 seconds.

Please monkey with it until you're satisfied.

#include <iostream>
#include <string>
#include <tuple>
#include <vector>
#include <random>
#include <chrono>
#include <algorithm>

class Timer {
public:
  Timer() { reset(); }
  void reset() { start = now(); }

  double getElapsedSeconds() {
    std::chrono::duration<double> seconds = now() - start;
    return seconds.count();
  }

private:
  static std::chrono::time_point<std::chrono::high_resolution_clock> now() {
    return std::chrono::high_resolution_clock::now();
  }

  std::chrono::time_point<std::chrono::high_resolution_clock> start;

};

struct ST {
  int X;
  int Y;
  double Cost;
  std::string Label;

  bool operator==(const ST &rhs) {
    return
      (X == rhs.X) &&
      (Y == rhs.Y) &&
      (Cost == rhs.Cost) &&
      (Label == rhs.Label);
  }

  bool operator<(const ST &rhs) {
    if(X > rhs.X) { return false; }
    if(Y > rhs.Y) { return false; }
    if(Cost > rhs.Cost) { return false; }
    if(Label >= rhs.Label) { return false; }
    return true;
  }
};

using TP = std::tuple<int, int, double, std::string>;

std::pair<std::vector<ST>, std::vector<TP>> generate() {
  std::mt19937 mt(std::random_device{}());
  std::uniform_int_distribution<int> dist;

  constexpr size_t SZ = 1000000;

  std::pair<std::vector<ST>, std::vector<TP>> p;
  auto& s = p.first;
  auto& d = p.second;
  s.reserve(SZ);
  d.reserve(SZ);

  for(size_t i = 0; i < SZ; i++) {
    s.emplace_back();
    auto& sb = s.back();
    sb.X = dist(mt);
    sb.Y = dist(mt);
    sb.Cost = sb.X * sb.Y;
    sb.Label = std::to_string(sb.Cost);

    d.emplace_back(std::tie(sb.X, sb.Y, sb.Cost, sb.Label));
  }

  return p;
}

int main() {
  Timer timer;

  auto p = generate();
  auto& structs = p.first;
  auto& tuples = p.second;

  timer.reset();
  std::sort(structs.begin(), structs.end());
  double stSecs = timer.getElapsedSeconds();

  timer.reset();
  std::sort(tuples.begin(), tuples.end());
  double tpSecs = timer.getElapsedSeconds();

  std::cout << "Structs took " << stSecs << " seconds.\nTuples took " << tpSecs << " seconds.\n";

  std::cin.get();
}
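
Note that the operator< in ST returns false as soon as any one of X, Y or Cost is greater than its counterpart, which is not the lexicographic order that the tuple's operator< implements, so the two sorts are not doing quite the same work. A comparator that does match the tuple ordering might look like the sketch below. lexicographic_less is a hypothetical name, and ST is repeated without its operators only to keep the block self-contained.

```cpp
#include <string>
#include <tuple>

struct ST {
    int X;
    int Y;
    double Cost;
    std::string Label;
};

// Compares field by field and only moves on to the next field
// when the current one is equal on both sides.
inline bool lexicographic_less(const ST &a, const ST &b) {
    if (a.X != b.X) { return a.X < b.X; }
    if (a.Y != b.Y) { return a.Y < b.Y; }
    if (a.Cost != b.Cost) { return a.Cost < b.Cost; }
    return a.Label < b.Label;
}
```

It can be passed to std::sort as a third argument, which makes it easy to rerun the timing with both comparators and see how much of the gap remains.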

Answer 5

Answered by Matthieu M.

As far as the "generic functions" go, Boost.Fusion deserves some love... and especially BOOST_FUSION_ADAPT_STRUCT.

Ripping from the page: ABRACADBRA

namespace demo
{
    struct employee
    {
        std::string name;
        int age;
    };
}

// demo::employee is now a Fusion sequence
BOOST_FUSION_ADAPT_STRUCT(
    demo::employee,
    (std::string, name)
    (int, age))

This means that all Fusion algorithms are now applicable to the struct demo::employee.

EDIT: Regarding the performance difference and layout compatibility: tuple's layout is implementation-defined and so not compatible with the struct's (and thus you should not cast between the two representations). In general I would expect no difference performance-wise (at least in Release) thanks to the inlining of get<N>.

Answer 6

Answered by orlp

Well, a POD struct can often be (ab)used in low-level contiguous chunk reading and serializing. A tuple might be more optimized in certain situations and support more functions, as you said.

Use whatever is more appropriate for the situation, there ain't no general preference. I think (but I haven't benchmarked it) that performance differences won't be significant. The data layout is most likely not compatible and implementation specific.
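
A minimal sketch of the kind of byte-level copy meant by "(ab)used" here, assuming the data-only bar_t from the question. to_bytes and from_bytes are hypothetical helpers; the byte image includes padding and is only meaningful on the same platform and compiler.

```cpp
#include <cstring>
#include <type_traits>

struct bar_t {
    int id;
    double value;
    bool dirty;
};

// memcpy on an object is only well-defined for trivially copyable types.
static_assert(std::is_trivially_copyable<bar_t>::value,
              "bar_t must be trivially copyable to be copied as raw bytes");

// Copies the object representation of src into a byte buffer.
inline void to_bytes(const bar_t &src, unsigned char (&buf)[sizeof(bar_t)]) {
    std::memcpy(buf, &src, sizeof(bar_t));
}

// Rebuilds a bar_t from a buffer previously filled by to_bytes.
inline bar_t from_bytes(const unsigned char (&buf)[sizeof(bar_t)]) {
    bar_t out;
    std::memcpy(&out, buf, sizeof(bar_t));
    return out;
}
```

The static_assert is the important part: a struct that holds a std::string, like the ones benchmarked above, would fail it.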

Answer 7

Answered by Useless

Also, is the data layout compatible with each other (interchangeably casted)?

Oddly I can't see a direct response to this part of the question.

The answer is: no. Or at least not reliably, as the layout of the tuple is unspecified.

Firstly, your struct is a Standard Layout Type. The ordering, padding and alignment of the members are well-defined by a combination of the standard and your platform ABI.

If a tuple was a standard layout type, and we knew the fields were laid out in the order the types are specified, we might have some confidence it would match the struct.

The tuple is normally implemented using inheritance, in one of two ways: the old Loki/Modern C++ Design recursive style, or the newer variadic style. Neither is a Standard Layout type, because both violate the following conditions:

(prior to C++14)
- has no base classes with non-static data members, or
- has no non-static data members in the most derived class and at most one base class with non-static data members
(for C++14 and later)
- Has all non-static data members and bit-fields declared in the same class (either all in the derived or all in some base)

since each leaf base class contains a single tuple element (NB. a single-element tuple probably is a standard layout type, albeit not a very useful one). So, we know the standard does not guarantee the tuple has the same padding or alignment as the struct.

Additionally, it's worth noting that the older recursive-style tuple will generally lay out the data members in reverse order.

Anecdotally, it has sometimes worked in practice for some compilers and combinations of field types in the past (in one case, using recursive tuples, after reversing the field order). It definitely doesn't work reliably (across compilers, versions etc.) now, and was never guaranteed in the first place.
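
These properties can be checked on a given toolchain rather than guessed at. A small sketch, using the two types from the question; the helper names are hypothetical, and what they return for the tuple depends on the standard library in use, so nothing is asserted about it.

```cpp
#include <tuple>
#include <type_traits>

struct bar_t {
    int id;
    double value;
    bool dirty;
};

typedef std::tuple<int, double, bool> foo_t;

// The struct is guaranteed to be a standard layout type.
static_assert(std::is_standard_layout<bar_t>::value,
              "bar_t is a standard layout type");

// Implementation-dependent: reports what this standard library does.
inline bool tuple_is_standard_layout() {
    return std::is_standard_layout<foo_t>::value;
}

// Equal sizes are necessary for matching layouts but do not prove them,
// since the member order inside the tuple may differ.
inline bool sizes_match() {
    return sizeof(foo_t) == sizeof(bar_t);
}
```

Even when both helpers return true, casting between the two types remains outside what the standard guarantees.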

Answer 8

Answered by gnasher729

Don't worry about speed or layout, that's nano-optimisation, and depends on the compiler, and there's never enough difference to influence your decision.

You use a struct for things that meaningfully belong together to form a whole.

You use a tuple for things that are together coincidentally. You can use a tuple spontaneously in your code.

Answer 9

Answered by Jerry Coffin

There shouldn't be a performance difference (even an insignificant one). At least in the normal case, they will result in the same memory layout. Nonetheless, casting between them probably isn't required to work (though I'd guess there's a pretty fair chance it normally will).

Answer 10

Answered by Tom K

I know this is an old topic; however, I am now about to make a decision about part of my project: should I go the tuple way or the struct way? After reading this thread I have some ideas.

About wheaties' answer and the performance tests: please note that you can usually use memcpy, memset and similar tricks for structs. This would make the performance MUCH better than for tuples.
I see some advantages in tuples:
- You can use a tuple to return a collection of variables from a function or method, which reduces the number of types you have to define.
- Because tuple has predefined <, == and > operators, you can also use a tuple as a key in map or hash_map, which is much more cost-effective than a struct, for which you need to implement these operators yourself.
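
A minimal sketch of both advantages, assuming only the standard library. GridKey, make_labels and div_mod are hypothetical names chosen for illustration.

```cpp
#include <map>
#include <string>
#include <tuple>

typedef std::tuple<int, int> GridKey;  // hypothetical key type

// std::map only needs operator<, which the tuple already provides.
inline std::map<GridKey, std::string> make_labels() {
    std::map<GridKey, std::string> labels;
    labels[GridKey(0, 1)] = "north";
    labels[GridKey(1, 0)] = "east";
    return labels;
}

// Returns quotient and remainder together without defining a result struct.
inline std::tuple<int, int> div_mod(int a, int b) {
    return std::make_tuple(a / b, a % b);
}
```

The caller can unpack the result with `std::tie(q, r) = div_mod(17, 5);`. A hash-based container such as std::unordered_map would additionally need a user-supplied hash, since the standard library provides no std::hash specialization for std::tuple.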

I have searched the web and eventually reached this page: https://arne-mertz.de/2017/03/smelly-pair-tuple/

Generally I agree with the final conclusion of the page above.
