C++ 中的惰性求值
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/414243/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Lazy evaluation in C++
提问by user51568
C++ does not have native support for lazy evaluation (as Haskell does).
C++ 没有对惰性求值的原生支持(就像 Haskell 那样)。
I'm wondering if it is possible to implement lazy evaluation in C++ in a reasonable manner. If yes, how would you do it?
我想知道是否有可能以合理的方式在 C++ 中实现惰性求值。如果是,你会怎么做?
EDIT: I like Konrad Rudolph's answer.
编辑:我喜欢康拉德鲁道夫的回答。
I'm wondering if it's possible to implement it in a more generic fashion, for example by using a parametrized class lazy that essentially works for T the way matrix_add works for matrix.
我想知道是否有可能以更通用的方式实现它,例如通过使用参数化类lazy 基本上适用于T,就像matrix_add 适用于矩阵一样。
Any operation on T would return lazy instead. The only problem is to store the arguments and operation code inside lazy itself. Can anyone see how to improve this?
对 T 的任何操作都会返回惰性。唯一的问题是将参数和操作代码存储在惰性本身中。谁能看到如何改进这一点?
回答by Konrad Rudolph
I'm wondering if it is possible to implement lazy evaluation in C++ in a reasonable manner. If yes, how would you do it?
我想知道是否有可能以合理的方式在 C++ 中实现惰性求值。如果是,你会怎么做?
Yes, this is possible and quite often done, e.g. for matrix calculations. The main mechanism to facilitate this is operator overloading. Consider the case of matrix addition. The signature of the function would usually look something like this:
是的,这是可能的,而且经常这样做,例如用于矩阵计算。促进这一点的主要机制是运算符重载。考虑矩阵加法的情况。函数的签名通常如下所示:
matrix operator +(matrix const& a, matrix const& b);
Now, to make this function lazy, it's enough to return a proxy instead of the actual result:
现在,为了使这个函数变得懒惰,返回一个代理而不是实际结果就足够了:
struct matrix_add;
matrix_add operator +(matrix const& a, matrix const& b) {
return matrix_add(a, b);
}
Now all that needs to be done is to write this proxy:
现在需要做的就是编写这个代理:
struct matrix_add {
matrix_add(matrix const& a, matrix const& b) : a(a), b(b) { }
operator matrix() const {
matrix result;
// Do the addition.
return result;
}
private:
matrix const& a, b;
};
The magic lies in the method operator matrix()
which is an implicit conversion operator from matrix_add
to plain matrix
. This way, you can chain multiple operations (by providing appropriate overloads of course). The evaluation takes place only when the final result is assigned to a matrix
instance.
神奇之处在于方法operator matrix()
是一个隐式转换运算符 from matrix_add
to plain matrix
。这样,您可以链接多个操作(当然,通过提供适当的重载)。仅当最终结果分配给matrix
实例时才会进行评估。
EDITI should have been more explicit. As it is, the code makes no sense because although evaluation happens lazily, it still happens in the same expression. In particular, another addition will evaluate this code unless the matrix_add
structure is changed to allow chained addition. C++0x greatly facilitates this by allowing variadic templates (i.e. template lists of variable length).
编辑我应该更明确。实际上,代码没有意义,因为尽管评估是惰性发生的,但它仍然发生在同一个表达式中。特别是,除非将matrix_add
结构更改为允许链式添加,否则另一个添加将评估此代码。C++0x 通过允许可变参数模板(即可变长度的模板列表)极大地促进了这一点。
However, one very simple case where this code would actually have a real, direct benefit is the following:
但是,此代码实际上具有真正的直接好处的一个非常简单的情况如下:
int value = (A + B)(2, 3);
Here, it is assumed that A
and B
are two-dimensional matrices and that dereferencing is done in Fortran notation, i.e. the above calculates oneelement out of a matrix sum. It's of course wasteful to add the whole matrices. matrix_add
to the rescue:
这里,假设A
和B
是二维矩阵,并且解引用是用 Fortran 表示法完成的,即上面计算矩阵和中的一个元素。将整个矩阵相加当然很浪费。matrix_add
救援:
struct matrix_add {
// …?yadda, yadda, yadda …
int operator ()(unsigned int x, unsigned int y) {
// Calculate *just one* element:
return a(x, y) + b(x, y);
}
};
Other examples abound. I've just remembered that I have implemented something related not long ago. Basically, I had to implement a string class that should adhere to a fixed, pre-defined interface. However, my particular string class dealt with huge strings that weren't actually stored in memory. Usually, the user would just access small substrings from the original string using a function infix
. I overloaded this function for my string type to return a proxy that held a reference to my string, along with the desired start and end position. Only when this substring was actually used did it query a C API to retrieve this portion of the string.
其他例子比比皆是。我才想起不久前我已经实现了一些相关的东西。基本上,我必须实现一个字符串类,该类应该遵循固定的、预定义的接口。但是,我的特定字符串类处理实际上并未存储在内存中的巨大字符串。通常,用户只会使用函数从原始字符串中访问小的子字符串infix
。我为我的字符串类型重载了这个函数,以返回一个代理,其中包含对我的字符串的引用,以及所需的开始和结束位置。只有在实际使用此子字符串时,它才会查询 C API 以检索字符串的这一部分。
回答by j_random_hacker
Boost.Lambda is very nice, but Boost.Protois exactlywhat you are looking for. It already has overloads of allC++ operators, which by default perform their usual function when proto::eval()
is called, but can be changed.
Boost.Lambda是非常好的,但Boost.Proto是正是你所期待的。它已经具有所有C++ 运算符的重载,默认情况下,它们在proto::eval()
被调用时执行其通常的功能,但可以更改。
回答by Johannes Schaub - litb
What Konrad already explained can be put further to support nested invocations of operators, all executed lazily. In Konrad's example, he has an expression object that can store exactly two arguments, for exactly two operands of one operation. The problem is that it will only execute onesubexpression lazily, which nicely explains the concept in lazy evaluation put in simple terms, but doesn't improve performance substantially. The other example shows also well how one can apply operator()
to add only some elements using that expression object. But to evaluate arbitrary complex expressions, we need some mechanism that can storethe structure of that too. We can't get around templates to do that. And the name for that is expression templates
. The idea is that one templated expression object can store the structure of some arbitrary sub-expression recursively, like a tree, where the operations are the nodes, and the operands are the child-nodes. For a verygood explanation i just found today (some days after i wrote the below code) see here.
Konrad 已经解释过的内容可以进一步支持操作符的嵌套调用,所有操作都是惰性执行的。在 Konrad 的示例中,他有一个表达式对象,它可以正好存储两个参数,用于一个操作的两个操作数。问题是它只会懒惰地执行一个子表达式,这很好地解释了懒惰评估中的概念,简单来说,但并没有显着提高性能。另一个示例也很好地展示了如何operator()
使用该表达式对象仅添加一些元素。但是为了评估任意复杂的表达式,我们需要一些机制来存储它的结构。我们不能绕过模板来做到这一点。它的名字是expression templates
. 这个想法是一个模板化的表达式对象可以递归地存储一些任意子表达式的结构,就像一棵树,其中操作是节点,操作数是子节点。对于一个非常好的解释,我今天刚发现的(几天之后,我写了下面的代码)看这里。
template<typename Lhs, typename Rhs>
struct AddOp {
Lhs const& lhs;
Rhs const& rhs;
AddOp(Lhs const& lhs, Rhs const& rhs):lhs(lhs), rhs(rhs) {
// empty body
}
Lhs const& get_lhs() const { return lhs; }
Rhs const& get_rhs() const { return rhs; }
};
That will store any addition operation, even nested one, as can be seen by the following definition of an operator+ for a simple point type:
这将存储任何加法运算,甚至是嵌套的加法运算,如以下简单点类型的 operator+ 定义所示:
struct Point { int x, y; };
// add expression template with point at the right
template<typename Lhs, typename Rhs> AddOp<AddOp<Lhs, Rhs>, Point>
operator+(AddOp<Lhs, Rhs> const& lhs, Point const& p) {
return AddOp<AddOp<Lhs, Rhs>, Point>(lhs, p);
}
// add expression template with point at the left
template<typename Lhs, typename Rhs> AddOp< Point, AddOp<Lhs, Rhs> >
operator+(Point const& p, AddOp<Lhs, Rhs> const& rhs) {
return AddOp< Point, AddOp<Lhs, Rhs> >(p, rhs);
}
// add two points, yield a expression template
AddOp< Point, Point >
operator+(Point const& lhs, Point const& rhs) {
return AddOp<Point, Point>(lhs, rhs);
}
Now, if you have
现在,如果你有
Point p1 = { 1, 2 }, p2 = { 3, 4 }, p3 = { 5, 6 };
p1 + (p2 + p3); // returns AddOp< Point, AddOp<Point, Point> >
You now just need to overload operator= and add a suitable constructor for the Point type and accept AddOp. Change its definition to:
您现在只需要重载 operator= 并为 Point 类型添加合适的构造函数并接受 AddOp。将其定义改为:
struct Point {
int x, y;
Point(int x = 0, int y = 0):x(x), y(y) { }
template<typename Lhs, typename Rhs>
Point(AddOp<Lhs, Rhs> const& op) {
x = op.get_x();
y = op.get_y();
}
template<typename Lhs, typename Rhs>
Point& operator=(AddOp<Lhs, Rhs> const& op) {
x = op.get_x();
y = op.get_y();
return *this;
}
int get_x() const { return x; }
int get_y() const { return y; }
};
And add the appropriate get_x and get_y into AddOp as member functions:
并将适当的 get_x 和 get_y 作为成员函数添加到 AddOp 中:
int get_x() const {
return lhs.get_x() + rhs.get_x();
}
int get_y() const {
return lhs.get_y() + rhs.get_y();
}
Note how we haven't created any temporaries of type Point. It could have been a big matrix with many fields. But at the time the result is needed, we calculate it lazily.
请注意我们没有创建任何 Point 类型的临时对象。它可能是一个包含许多字段的大矩阵。但是在需要结果的时候,我们懒得计算。
回答by Johannes Schaub - litb
回答by spiritsaway
Johannes' answer works.But when it comes to more parentheses ,it doesn't work as wish. Here is an example.
约翰内斯的回答有效。但是当涉及到更多括号时,它并不能如愿。这是一个例子。
Point p1 = { 1, 2 }, p2 = { 3, 4 }, p3 = { 5, 6 }, p4 = { 7, 8 };
(p1 + p2) + (p3+p4)// it works ,but not lazy enough
Because the three overloaded + operator didn't cover the case
因为三个重载的 + 运算符没有覆盖案例
AddOp<Llhs,Lrhs>+AddOp<Rlhs,Rrhs>
So the compiler has to convert either (p1+p2) or(p3+p4) to Point ,that's not lazy enough.And when compiler decides which to convert ,it complains. Because none is better than the other . Here comes my extension: add yet another overloaded operator +
所以编译器必须将 (p1+p2) 或 (p3+p4) 转换为 Point ,这还不够懒惰。当编译器决定转换哪个时,它会抱怨。因为没有一个比另一个更好。这是我的扩展:添加另一个重载运算符 +
template <typename LLhs, typename LRhs, typename RLhs, typename RRhs>
AddOp<AddOp<LLhs, LRhs>, AddOp<RLhs, RRhs>> operator+(const AddOp<LLhs, LRhs> & leftOperandconst, const AddOp<RLhs, RRhs> & rightOperand)
{
return AddOp<AddOp<LLhs, LRhs>, AddOp<RLhs, RRhs>>(leftOperandconst, rightOperand);
}
Now ,the compiler can handle the case above correctly ,and no implicit conversion ,volia!
现在,编译器可以正确处理上述情况,并且没有隐式转换,瞧!
回答by hiapay
I'm thinking about implementing a template class, that uses std::function
. The class should, more or less, look like this:
我正在考虑实现一个模板类,它使用std::function
. 该类应该或多或少如下所示:
template <typename Value>
class Lazy
{
public:
Lazy(std::function<Value()> function) : _function(function), _evaluated(false) {}
Value &operator*() { Evaluate(); return _value; }
Value *operator->() { Evaluate(); return &_value; }
private:
void Evaluate()
{
if (!_evaluated)
{
_value = _function();
_evaluated = true;
}
}
std::function<Value()> _function;
Value _value;
bool _evaluated;
};
For example usage:
例如用法:
class Noisy
{
public:
Noisy(int i = 0) : _i(i)
{
std::cout << "Noisy(" << _i << ")" << std::endl;
}
Noisy(const Noisy &that) : _i(that._i)
{
std::cout << "Noisy(const Noisy &)" << std::endl;
}
~Noisy()
{
std::cout << "~Noisy(" << _i << ")" << std::endl;
}
void MakeNoise()
{
std::cout << "MakeNoise(" << _i << ")" << std::endl;
}
private:
int _i;
};
int main()
{
Lazy<Noisy> n = [] () { return Noisy(10); };
std::cout << "about to make noise" << std::endl;
n->MakeNoise();
(*n).MakeNoise();
auto &nn = *n;
nn.MakeNoise();
}
Above code should produce the following message on the console:
上面的代码应该在控制台上产生以下消息:
Noisy(0)
about to make noise
Noisy(10)
~Noisy(10)
MakeNoise(10)
MakeNoise(10)
MakeNoise(10)
~Noisy(10)
Note that the constructor printing Noisy(10)
will not be called until the variable is accessed.
请注意,在Noisy(10)
访问变量之前不会调用构造函数打印。
This class is far from perfect, though. The first thing would be the default constructor of Value
will have to be called on member initialization (printing Noisy(0)
in this case). We can use pointer for _value
instead, but I'm not sure whether it would affect the performance.
然而,这门课远非完美。第一件事是Value
必须在成员初始化时调用 will的默认构造函数(Noisy(0)
在这种情况下是打印)。我们可以使用指针_value
代替,但我不确定它是否会影响性能。
回答by Hippiehunter
C++0x is nice and all.... but for those of us living in the present you have Boost lambda library and Boost Phoenix. Both with the intent of bringing large amounts of functional programming to C++.
C++0x 很好……但是对于我们这些生活在当下的人来说,你有 Boost lambda 库和 Boost Phoenix。两者都旨在将大量函数式编程引入 C++。
回答by Martin York
Anything is possible.
一切皆有可能。
It depends on exactly what you mean:
这完全取决于您的意思:
class X
{
public: static X& getObjectA()
{
static X instanceA;
return instanceA;
}
};
Here we have the affect of a global variable that is lazily evaluated at the point of first use.
这里我们有一个全局变量的影响,该变量在第一次使用时被延迟评估。
As newly requested in the question.
And stealing Konrad Rudolph design and extending it.
正如问题中新要求的那样。
并窃取 Konrad Rudolph 的设计并对其进行扩展。
The Lazy object:
懒惰的对象:
template<typename O,typename T1,typename T2>
struct Lazy
{
Lazy(T1 const& l,T2 const& r)
:lhs(l),rhs(r) {}
typedef typename O::Result Result;
operator Result() const
{
O op;
return op(lhs,rhs);
}
private:
T1 const& lhs;
T2 const& rhs;
};
How to use it:
如何使用它:
namespace M
{
class Matrix
{
};
struct MatrixAdd
{
typedef Matrix Result;
Result operator()(Matrix const& lhs,Matrix const& rhs) const
{
Result r;
return r;
}
};
struct MatrixSub
{
typedef Matrix Result;
Result operator()(Matrix const& lhs,Matrix const& rhs) const
{
Result r;
return r;
}
};
template<typename T1,typename T2>
Lazy<MatrixAdd,T1,T2> operator+(T1 const& lhs,T2 const& rhs)
{
return Lazy<MatrixAdd,T1,T2>(lhs,rhs);
}
template<typename T1,typename T2>
Lazy<MatrixSub,T1,T2> operator-(T1 const& lhs,T2 const& rhs)
{
return Lazy<MatrixSub,T1,T2>(lhs,rhs);
}
}
回答by user2259659
In C++11 lazy evaluation similar to hiapay's answer can be achieved using std::shared_future. You still have to encapsulate calculations in lambdas but memoization is taken care of:
在 C++11 中,可以使用 std::shared_future 实现类似于 hiapay 答案的惰性求值。您仍然需要将计算封装在 lambdas 中,但需要注意记忆化:
std::shared_future<int> a = std::async(std::launch::deferred, [](){ return 1+1; });
Here's a full example:
这是一个完整的例子:
#include <iostream>
#include <future>
#define LAZY(EXPR, ...) std::async(std::launch::deferred, [__VA_ARGS__](){ std::cout << "evaluating "#EXPR << std::endl; return EXPR; })
int main() {
std::shared_future<int> f1 = LAZY(8);
std::shared_future<int> f2 = LAZY(2);
std::shared_future<int> f3 = LAZY(f1.get() * f2.get(), f1, f2);
std::cout << "f3 = " << f3.get() << std::endl;
std::cout << "f2 = " << f2.get() << std::endl;
std::cout << "f1 = " << f1.get() << std::endl;
return 0;
}
回答by BitTickler
Lets take Haskell as our inspiration - it being lazy to the core. Also, let's keep in mind how Linq in C# uses Enumerators in a monadic (urgh - here is the word - sorry) way. Last not least, lets keep in mind, what coroutines are supposed to provide to programmers. Namely the decoupling of computational steps (e.g. producer consumer) from each other. And lets try to think about how coroutines relate to lazy evaluation.
让我们以 Haskell 作为我们的灵感——它是懒惰的核心。另外,让我们记住 C# 中的 Linq 如何以 monadic(呃 - 这是词 - 抱歉)的方式使用枚举器。最后一点,让我们记住,协程应该为程序员提供什么。即计算步骤(例如生产者消费者)彼此解耦。让我们试着思考协程如何与惰性求值相关联。
All of the above appears to be somehow related.
以上所有似乎都有某种关联。
Next, lets try to extract our personal definition of what "lazy" comes down to.
接下来,让我们尝试提取我们对“懒惰”的个人定义。
One interpretation is: We want to state our computation in a composable way, beforeexecuting it. Some of those parts we use to compose our complete solution might very well draw upon huge (sometimes infinite) data sources, with our full computation also either producing a finite or infinite result.
一种解释是:我们希望在执行之前以可组合的方式声明我们的计算。我们用来组成完整解决方案的某些部分可能会很好地利用巨大的(有时是无限的)数据源,我们的完整计算也会产生有限或无限的结果。
Lets get concrete and into some code. We need an example for that! Here, I choose the fizzbuzz "problem" as an example, just for the reason that there is some nice, lazy solution to it.
让我们具体了解一些代码。我们需要一个例子!在这里,我选择 fizzbuzz “问题”作为例子,只是因为有一些不错的、懒惰的解决方案。
In Haskell, it looks like this:
在 Haskell 中,它看起来像这样:
module FizzBuzz
( fb
)
where
fb n =
fmap merge fizzBuzzAndNumbers
where
fizz = cycle ["","","fizz"]
buzz = cycle ["","","","","buzz"]
fizzBuzz = zipWith (++) fizz buzz
fizzBuzzAndNumbers = zip [1..n] fizzBuzz
merge (x,s) = if length s == 0 then show x else s
The Haskell function cycle
creates an infinite list (lazy, of course!) from a finite list by simply repeating the values in the finite list forever. In an eager programming style, writing something like that would ring alarm bells (memory overflow, endless loops!). But not so in a lazy language. The trick is, that lazy lists are not computed right away. Maybe never. Normally only as much as subsequent code requires it.
Haskell 函数cycle
通过简单地永远重复有限列表中的值,从有限列表创建无限列表(当然是懒惰的!)。在急切的编程风格中,编写类似的东西会敲响警钟(内存溢出,无限循环!)。但在懒惰的语言中并非如此。诀窍是,不会立即计算惰性列表。也许永远不会。通常只有后续代码需要它。
The third line in the where
block above creates another lazy!! list, by means of combining the infinite lists fizz
and buzz
by means of the single two elements recipe "concatenate a string element from either input list into a single string". Again, if this were to be immediately evaluated, we would have to wait for our computer to run out of resources.
where
上面块中的第三行创建了另一个懒惰!!列表,通过组合无限列表fizz
并buzz
通过单个两个元素配方“将任一输入列表中的字符串元素连接成单个字符串”。同样,如果要立即对此进行评估,我们将不得不等待我们的计算机耗尽资源。
In the 4th line, we create tuples of the members of a finite lazy list [1..n]
with our infinite lazy list fizzbuzz
. The result is still lazy.
在第 4 行,我们[1..n]
使用无限惰性列表创建有限惰性列表成员的元组fizzbuzz
。结果还是懒。
Even in the main body of our fb
function, there is no need to get eager. The whole function returns a list with the solution, which itself is -again- lazy. You could as well think of the result of fb 50
as a computation which you can (partially) evaluate later. Or combine with other stuff, leading to an even larger (lazy) evaluation.
即使在我们fb
函数的主体中,也没有必要急于求成。整个函数返回一个包含解决方案的列表,它本身又是懒惰的。您也可以将结果fb 50
视为稍后可以(部分)评估的计算。或者与其他东西结合,导致更大的(懒惰的)评估。
So, in order to get started with our C++ version of "fizzbuzz", we need to think of ways how to combine partial steps of our computation into larger bits of computations, each drawing data from previous steps as required.
因此,为了开始使用我们的 C++ 版本的“fizzbuzz”,我们需要考虑如何将计算的部分步骤组合成更大的计算位,每个计算都根据需要从前面的步骤中提取数据。
You can see the full story in a gist of mine.
你可以在我的要点中看到完整的故事。
Here the basic ideas behind the code:
这里是代码背后的基本思想:
Borrowing from C# and Linq, we "invent" a stateful, generic type Enumerator
, which holds
- The current value of the partial computation
- The state of a partial computation (so we can produce subsequent values)
- The worker function, which produces the next state, the next value and a bool which states if there is more data or if the enumeration has come to an end.
借用 C# 和 Linq,我们“发明”了一个有状态的泛型类型Enumerator
,它包含
- 部分计算的当前值
-部分计算的状态(因此我们可以产生后续值)
- 工作函数,它产生下一个状态,下一个值和一个布尔值,表明是否有更多数据或枚举是否已结束。
In order to be able to compose Enumerator<T,S>
instance by means of the power of the .
(dot), this class also contains functions, borrowed from Haskell type classes such as Functor
and Applicative
.
为了能够Enumerator<T,S>
通过.
(点)的力量组合实例,该类还包含从 Haskell 类型类中借用的函数,例如Functor
和Applicative
。
The worker function for enumerator is always of the form: S -> std::tuple<bool,S,T
where S
is the generic type variable representing the state and T
is the generic type variable representing a value - the result of a computation step.
enumerator 的辅助函数始终采用以下形式:S -> std::tuple<bool,S,T
其中S
是表示状态T
的泛型类型变量,是表示值的泛型类型变量 - 计算步骤的结果。
All this is already visible in the first lines of the Enumerator
class definition.
所有这些已经在Enumerator
类定义的第一行中可见。
template <class T, class S>
class Enumerator
{
public:
typedef typename S State_t;
typedef typename T Value_t;
typedef std::function<
std::tuple<bool, State_t, Value_t>
(const State_t&
)
> Worker_t;
Enumerator(Worker_t worker, State_t s0)
: m_worker(worker)
, m_state(s0)
, m_value{}
{
}
// ...
};
So, all we need to create a specific enumerator instance, we need to create a worker function, have the initial state and create an instance of Enumerator
with those two arguments.
所以,我们需要创建一个特定的枚举器实例,我们需要创建一个工作函数,拥有初始状态并Enumerator
使用这两个参数创建一个实例。
Here an example - function range(first,last)
creates a finite range of values. This corresponds to a lazy list in the Haskell world.
这是一个示例 - 函数range(first,last)
创建一个有限范围的值。这对应于 Haskell 世界中的惰性列表。
template <class T>
Enumerator<T, T> range(const T& first, const T& last)
{
auto finiteRange =
[first, last](const T& state)
{
T v = state;
T s1 = (state < last) ? (state + 1) : state;
bool active = state != s1;
return std::make_tuple(active, s1, v);
};
return Enumerator<T,T>(finiteRange, first);
}
And we can make use of this function, for example like this: auto r1 = range(size_t{1},10);
- We have created ourselves a lazy list with 10 elements!
我们可以使用这个函数,例如像这样:auto r1 = range(size_t{1},10);
- 我们为自己创建了一个包含 10 个元素的惰性列表!
Now, all is missing for our "wow" experience, is to see how we can compose enumerators.
Coming back to Haskells cycle
function, which is kind of cool. How would it look in our C++ world? Here it is:
现在,我们的“哇”体验所缺少的只是看看我们如何组合枚举器。回到 Haskellcycle
函数,这有点酷。在我们的 C++ 世界中它会是什么样子?这里是:
template <class T, class S>
auto
cycle
( Enumerator<T, S> values
) -> Enumerator<T, S>
{
auto eternally =
[values](const S& state) -> std::tuple<bool, S, T>
{
auto[active, s1, v] = values.step(state);
if (active)
{
return std::make_tuple(active, s1, v);
}
else
{
return std::make_tuple(true, values.state(), v);
}
};
return Enumerator<T, S>(eternally, values.state());
}
It takes an enumerator as input and returns an enumerator. Local (lambda) function eternally
simply resets the input enumeration to its start value whenever it runs out of values and voilà - we have an infinite, ever repeating version of the list we gave as an argument:: auto foo = cycle(range(size_t{1},3));
And we can already shamelessly compose our lazy "computations".
它接受一个枚举数作为输入并返回一个枚举数。本地(lambda)函数eternally
只是在输入枚举用完值时简单地将其重置为其起始值,瞧——我们有一个无限的、不断重复的列表版本作为参数:auto foo = cycle(range(size_t{1},3));
我们已经可以无耻地编写我们的懒惰“计算”。
zip
is a good example, showing that we can also create a new enumerator from two input enumerators. The resulting enumerator yields as many values as the smaller of either of the input enumerators (tuples with 2 element, one for each input enumerator). I have implemented zip
inside class Enumerator
itself. Here is how it looks like:
zip
是一个很好的例子,表明我们也可以从两个输入枚举器创建一个新的枚举器。结果枚举器产生的值与任一输入枚举器中的较小者一样多(具有 2 个元素的元组,每个输入枚举器一个)。我已经zip
在内部实现了class Enumerator
。这是它的样子:
// member function of class Enumerator<S,T>
template <class T1, class S1>
auto
zip
( Enumerator<T1, S1> other
) -> Enumerator<std::tuple<T, T1>, std::tuple<S, S1> >
{
auto worker0 = this->m_worker;
auto worker1 = other.worker();
auto combine =
[worker0,worker1](std::tuple<S, S1> state) ->
std::tuple<bool, std::tuple<S, S1>, std::tuple<T, T1> >
{
auto[s0, s1] = state;
auto[active0, newS0, v0] = worker0(s0);
auto[active1, newS1, v1] = worker1(s1);
return std::make_tuple
( active0 && active1
, std::make_tuple(newS0, newS1)
, std::make_tuple(v0, v1)
);
};
return Enumerator<std::tuple<T, T1>, std::tuple<S, S1> >
( combine
, std::make_tuple(m_state, other.state())
);
}
Please note, how the "combining" also ends up in combining the state of both sources and the values of both sources.
请注意,“组合”如何最终结合两个源的状态和两个源的值。
As this post is already TL;DR; for many, here the...
因为这篇文章已经是 TL;DR; 对许多人来说,这里...
Summary
概括
Yes, lazy evaluation can be implemented in C++. Here, I did it by borrowing the function names from haskell and the paradigm from C# enumerators and Linq. There might be similarities to pythons itertools, btw. I think they followed a similar approach.
是的,惰性求值可以在 C++ 中实现。在这里,我借用了 haskell 的函数名以及 C# 枚举器和 Linq 的范式。可能与pythons itertools有相似之处,顺便说一句。我认为他们遵循了类似的方法。
My implementation (see the gist link above) is just a prototype - not production code, btw. So no warranties whatsoever from my side. It serves well as demo code to get the general idea across, though.
我的实现(参见上面的要点链接)只是一个原型 - 不是生产代码,顺便说一句。所以我这边没有任何保证。不过,它可以很好地作为演示代码来了解总体思路。
And what would this answer be without the final C++ version of fizzbuz, eh? Here it is:
如果没有 fizzbuz 的最终 C++ 版本,这个答案会是什么,嗯?这里是:
std::string fizzbuzz(size_t n)
{
typedef std::vector<std::string> SVec;
// merge (x,s) = if length s == 0 then show x else s
auto merge =
[](const std::tuple<size_t, std::string> & value)
-> std::string
{
auto[x, s] = value;
if (s.length() > 0) return s;
else return std::to_string(x);
};
SVec fizzes{ "","","fizz" };
SVec buzzes{ "","","","","buzz" };
return
range(size_t{ 1 }, n)
.zip
( cycle(iterRange(fizzes.cbegin(), fizzes.cend()))
.zipWith
( std::function(concatStrings)
, cycle(iterRange(buzzes.cbegin(), buzzes.cend()))
)
)
.map<std::string>(merge)
.statefulFold<std::ostringstream&>
(
[](std::ostringstream& oss, const std::string& s)
{
if (0 == oss.tellp())
{
oss << s;
}
else
{
oss << "," << s;
}
}
, std::ostringstream()
)
.str();
}
And... to drive the point home even further - here a variation of fizzbuzz which returns an "infinite list" to the caller:
而且……为了进一步说明这一点 - 这里是 fizzbuzz 的一种变体,它向调用者返回一个“无限列表”:
typedef std::vector<std::string> SVec;
static const SVec fizzes{ "","","fizz" };
static const SVec buzzes{ "","","","","buzz" };
auto fizzbuzzInfinite() -> decltype(auto)
{
// merge (x,s) = if length s == 0 then show x else s
auto merge =
[](const std::tuple<size_t, std::string> & value)
-> std::string
{
auto[x, s] = value;
if (s.length() > 0) return s;
else return std::to_string(x);
};
auto result =
range(size_t{ 1 })
.zip
(cycle(iterRange(fizzes.cbegin(), fizzes.cend()))
.zipWith
(std::function(concatStrings)
, cycle(iterRange(buzzes.cbegin(), buzzes.cend()))
)
)
.map<std::string>(merge)
;
return result;
}
It is worth showing, since you can learn from it how to dodge the question what the exact return type of that function is (as it depends on the implementation of the function alone, namely how the code combines the enumerators).
值得展示,因为您可以从中学习如何回避该函数的确切返回类型是什么的问题(因为它仅取决于函数的实现,即代码如何组合枚举器)。
Also it demonstrates that we had to move the vectors fizzes
and buzzes
outside the scope of the function so they are still around when eventually on the outside, the lazy mechanism produces values. If we had not done that, the iterRange(..)
code would have stored iterators to the vectors which are long gone.
此外,它表明,我们只好搬到载体fizzes
和buzzes
外面的功能,因此他们仍然时左右终于在外面,懒惰机制产生值的范围。如果我们没有这样做,iterRange(..)
代码就会将迭代器存储到早已不复存在的向量中。