什么是 catamorphism,它可以在 C# 3.0 中实现吗?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/196294/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
What is a catamorphism and can it be implemented in C# 3.0?
提问by Mark Cidade
I'm trying to learn about catamorphisms and I've read the Wikipedia articleand the first couple posts in the series of the topic for F#on the Inside F#blog.
我正在尝试了解 catamorphisms,并且我已经阅读了 Wikipedia 文章和Inside F#博客上F# 主题系列中的前几篇文章。
I understand that it's a generalization of folds (i.e., mapping a structure of many values to one value, including a list of values to another list). And I gather that the fold-list and fold-tree is a canonical example.
我知道这是折叠的概括(即将许多值的结构映射到一个值,包括将值列表映射到另一个列表)。我认为折叠列表和折叠树是一个典型的例子。
Can this be shown to be done in C#, using LINQ's Aggregate
operator or some other higher-order method?
是否可以使用 LINQ 的Aggregate
运算符或其他一些高阶方法在 C# 中完成此操作?
采纳答案by Brian
LINQ's Aggregate()
is just for IEnumerables
. Catamorphisms in general refer to the pattern of folding for an arbitrary data type. So Aggregate()
is to IEnumerables
what FoldTree
(below) is to Trees
(below); both are catamorphisms for their respective data types.
LINQAggregate()
仅适用于IEnumerables
. Catamorphisms 通常是指任意数据类型的折叠模式。那么Aggregate()
是IEnumerables
什么FoldTree
(下)是Trees
(如下图); 两者都是各自数据类型的 catamorphisms。
I translated some of the code in part 4 of the seriesinto C#. The code is below. Note that the equivalent F# used three less-than characters (for generic type parameter annotations), whereas this C# code uses more than 60. This is evidence why no one writes such code in C# - there are too many type annotations. I present the code in case it helps people who know C# but not F# play with this. But the code is so dense in C#, it's very hard to make sense of.
我将本系列第 4部分中的一些代码翻译成 C#。代码如下。请注意,等效的 F# 使用三个小于字符(用于泛型类型参数注释),而此 C# 代码使用了 60 个以上。这就是为什么没有人在 C# 中编写此类代码的证据 - 类型注释太多。我展示了代码,以防它可以帮助了解 C# 但不了解 F# 的人玩这个。但是 C# 中的代码非常密集,很难理解。
Given the following definition for a binary tree:
给定以下二叉树的定义:
using System;
using System.Collections.Generic;
using System.Windows;
using System.Windows.Controls;
using System.Windows.Input;
using System.Windows.Media;
using System.Windows.Shapes;
class Tree<T> // use null for Leaf
{
public T Data { get; private set; }
public Tree<T> Left { get; private set; }
public Tree<T> Right { get; private set; }
public Tree(T data, Tree<T> left, Tree<T> rright)
{
this.Data = data;
this.Left = left;
this.Right = right;
}
public static Tree<T> Node<T>(T data, Tree<T> left, Tree<T> right)
{
return new Tree<T>(data, left, right);
}
}
One can fold trees and e.g. measure if two trees have different nodes:
可以折叠树,例如测量两棵树是否有不同的节点:
class Tree
{
public static Tree<int> Tree7 =
Node(4, Node(2, Node(1, null, null), Node(3, null, null)),
Node(6, Node(5, null, null), Node(7, null, null)));
public static R XFoldTree<A, R>(Func<A, R, R, Tree<A>, R> nodeF, Func<Tree<A>, R> leafV, Tree<A> tree)
{
return Loop(nodeF, leafV, tree, x => x);
}
public static R Loop<A, R>(Func<A, R, R, Tree<A>, R> nodeF, Func<Tree<A>, R> leafV, Tree<A> t, Func<R, R> cont)
{
if (t == null)
return cont(leafV(t));
else
return Loop(nodeF, leafV, t.Left, lacc =>
Loop(nodeF, leafV, t.Right, racc =>
cont(nodeF(t.Data, lacc, racc, t))));
}
public static R FoldTree<A, R>(Func<A, R, R, R> nodeF, R leafV, Tree<A> tree)
{
return XFoldTree((x, l, r, _) => nodeF(x, l, r), _ => leafV, tree);
}
public static Func<Tree<A>, Tree<A>> XNode<A>(A x, Tree<A> l, Tree<A> r)
{
return (Tree<A> t) => x.Equals(t.Data) && l == t.Left && r == t.Right ? t : Node(x, l, r);
}
// DiffTree: Tree<'a> * Tree<'a> -> Tree<'a * bool>
// return second tree with extra bool
// the bool signifies whether the Node "ReferenceEquals" the first tree
public static Tree<KeyValuePair<A, bool>> DiffTree<A>(Tree<A> tree, Tree<A> tree2)
{
return XFoldTree((A x, Func<Tree<A>, Tree<KeyValuePair<A, bool>>> l, Func<Tree<A>, Tree<KeyValuePair<A, bool>>> r, Tree<A> t) => (Tree<A> t2) =>
Node(new KeyValuePair<A, bool>(t2.Data, object.ReferenceEquals(t, t2)),
l(t2.Left), r(t2.Right)),
x => y => null, tree)(tree2);
}
}
In this second example, another tree is reconstructed differently:
在第二个示例中,另一棵树的重建方式不同:
class Example
{
// original version recreates entire tree, yuck
public static Tree<int> Change5to0(Tree<int> tree)
{
return Tree.FoldTree((int x, Tree<int> l, Tree<int> r) => Tree.Node(x == 5 ? 0 : x, l, r), null, tree);
}
// here it is with XFold - same as original, only with Xs
public static Tree<int> XChange5to0(Tree<int> tree)
{
return Tree.XFoldTree((int x, Tree<int> l, Tree<int> r, Tree<int> orig) =>
Tree.XNode(x == 5 ? 0 : x, l, r)(orig), _ => null, tree);
}
}
And in this third example, folding a tree is used for drawing:
在第三个示例中,折叠树用于绘图:
class MyWPFWindow : Window
{
void Draw(Canvas canvas, Tree<KeyValuePair<int, bool>> tree)
{
// assumes canvas is normalized to 1.0 x 1.0
Tree.FoldTree((KeyValuePair<int, bool> kvp, Func<Transform, Transform> l, Func<Transform, Transform> r) => trans =>
{
// current node in top half, centered left-to-right
var tb = new TextBox();
tb.Width = 100.0;
tb.Height = 100.0;
tb.FontSize = 70.0;
// the tree is a "diff tree" where the bool represents
// "ReferenceEquals" differences, so color diffs Red
tb.Foreground = (kvp.Value ? Brushes.Black : Brushes.Red);
tb.HorizontalContentAlignment = HorizontalAlignment.Center;
tb.VerticalContentAlignment = VerticalAlignment.Center;
tb.RenderTransform = AddT(trans, TranslateT(0.25, 0.0, ScaleT(0.005, 0.005, new TransformGroup())));
tb.Text = kvp.Key.ToString();
canvas.Children.Add(tb);
// left child in bottom-left quadrant
l(AddT(trans, TranslateT(0.0, 0.5, ScaleT(0.5, 0.5, new TransformGroup()))));
// right child in bottom-right quadrant
r(AddT(trans, TranslateT(0.5, 0.5, ScaleT(0.5, 0.5, new TransformGroup()))));
return null;
}, _ => null, tree)(new TransformGroup());
}
public MyWPFWindow(Tree<KeyValuePair<int, bool>> tree)
{
var canvas = new Canvas();
canvas.Width=1.0;
canvas.Height=1.0;
canvas.Background = Brushes.Blue;
canvas.LayoutTransform=new ScaleTransform(200.0, 200.0);
Draw(canvas, tree);
this.Content = canvas;
this.Title = "MyWPFWindow";
this.SizeToContent = SizeToContent.WidthAndHeight;
}
TransformGroup AddT(Transform t, TransformGroup tg) { tg.Children.Add(t); return tg; }
TransformGroup ScaleT(double x, double y, TransformGroup tg) { tg.Children.Add(new ScaleTransform(x,y)); return tg; }
TransformGroup TranslateT(double x, double y, TransformGroup tg) { tg.Children.Add(new TranslateTransform(x,y)); return tg; }
[STAThread]
static void Main(string[] args)
{
var app = new Application();
//app.Run(new MyWPFWindow(Tree.DiffTree(Tree.Tree7,Example.Change5to0(Tree.Tree7))));
app.Run(new MyWPFWindow(Tree.DiffTree(Tree.Tree7, Example.XChange5to0(Tree.Tree7))));
}
}
回答by Mark Cidade
I've been doing more reading, including a Micorosft Research paper on functional programming with catamorphisms ("bananas"), and it seems that catamorphismjust refers to any function that takes a list and typically breaks it down to a single value (IEnumerable<A> => B
), like Max()
, Min()
, and in the general case, Aggregate()
, would all be a catamorphisms for lists.
我一直在做更多的阅读,包括一篇关于使用 catamorphisms ("bananas")进行函数式编程的 Micorosft Research 论文,似乎catamorphism只是指任何采用列表并通常将其分解为单个值 ( IEnumerable<A> => B
) 的函数,像Max()
, Min()
,在一般情况下,Aggregate()
, 都是列表的变形。
I was previously under the impression that it refefred to a way of creating a function that can generalize different folds, so that it can fold a tree anda list. There may actually still be such a thing, some kind of functoror arrowmaybe but right now that's beyond my level of understanding.
我之前的印象是它指的是一种创建可以概括不同折叠的函数的方法,以便它可以折叠树和列表。实际上可能仍然存在这样的东西,某种函子或箭头,但现在这超出了我的理解水平。
回答by Thiagarajan
I understand that it's a generalization of folds (i.e., mapping a structure of many values to one value, including a list of values to another list).
我知道这是折叠的概括(即将许多值的结构映射到一个值,包括将值列表映射到另一个列表)。
I wouldn't say one value.It maps it into another structure.
我不会说一个值。它将它映射到另一个结构中。
Maybe an example would clarify.let's say summation over a list.
也许一个例子会澄清。让我们说一个列表的总和。
foldr (\x -> \y -> x + y) 0 [1,2,3,4,5]
foldr (\x -> \y -> x + y) 0 [1,2,3,4,5]
Now this would reduce to 15. But actually,it can be viewed mapping to a purely syntactic structure 1 + 2 + 3 + 4 + 5 + 0. It is just that the programming language(in the above case,haskell) knows how to reduce the above syntactic structure to 15.
现在这将减少到 15。但实际上,它可以被视为映射到一个纯粹的句法结构 1 + 2 + 3 + 4 + 5 + 0。只是编程语言(在上述情况下,haskell)知道如何将上述句法结构减少到 15。
Basically,a catamorphism replaces one data constructor with another one.In case of above list,
基本上,一个 catamorphism 用另一个数据构造函数替换一个数据构造函数。在上述列表的情况下,
[1,2,3,4,5] = 1:2:3:4:5:[] (: is the cons operator,[] is the nil element) the catamorphism above replaced : with + and [] with 0.
[1,2,3,4,5] = 1:2:3:4:5:[] (: 是 cons 操作符,[] 是 nil 元素) 上面的 catamorphism 替换为 : with + and [] with 0 .
It can be generalized to any recursive datatypes.
它可以推广到任何递归数据类型。
回答by Tuomas Hietanen
Brian had great series of posts in his blog. Also Channel9 had a nice video. There is no LINQ syntactic sugar for .Aggregate() so does it matter if it has the definition of LINQ Aggregate method or not? The idea is of course the same. Folding over trees... First we need a Node... maybe Tuple could be used, but this is more clear:
布赖恩在他的博客中有一系列很棒的帖子。Channel9 也有一个不错的视频。.Aggregate() 没有 LINQ 语法糖,所以它是否具有 LINQ Aggregate 方法的定义是否重要?想法当然是一样的。折叠树......首先我们需要一个节点......也许可以使用元组,但这更清楚:
public class Node<TData, TLeft, TRight>
{
public TLeft Left { get; private set; }
public TRight Right { get; private set; }
public TData Data { get; private set; }
public Node(TData x, TLeft l, TRight r){ Data = x; Left = l; Right = r; }
}
Then, in C# we canmake a recursive type, even this is unusual:
然后,在 C# 中我们可以创建一个递归类型,即使这是不寻常的:
public class Tree<T> : Node</* data: */ T, /* left: */ Tree<T>, /* right: */ Tree<T>>
{
// Normal node:
public Tree(T data, Tree<T> left, Tree<T> right): base(data, left, right){}
// No children:
public Tree(T data) : base(data, null, null) { }
}
Now, I will quote some of Brian's code, with slight LINQ-style modifications:
现在,我将引用 Brian 的一些代码,并稍作 LINQ 风格的修改:
- In C# Fold is called Aggregate
- LINQ methods are Extension methods that have the item as first parameter with "this"-keyword.
- Loop can be private
- 在 C# 中折叠被称为聚合
- LINQ 方法是将项目作为第一个参数并带有“this”关键字的扩展方法。
- 循环可以是私有的
...
...
public static class TreeExtensions
{
private static R Loop<A, R>(Func<A, R, R, Tree<A>, R> nodeF, Func<Tree<A>, R> leafV, Tree<A> t, Func<R, R> cont)
{
if (t == null) return cont(leafV(t));
return Loop(nodeF, leafV, t.Left, lacc =>
Loop(nodeF, leafV, t.Right, racc =>
cont(nodeF(t.Data, lacc, racc, t))));
}
public static R XAggregateTree<A, R>(this Tree<A> tree, Func<A, R, R, Tree<A>, R> nodeF, Func<Tree<A>, R> leafV)
{
return Loop(nodeF, leafV, tree, x => x);
}
public static R Aggregate<A, R>(this Tree<A> tree, Func<A, R, R, R> nodeF, R leafV)
{
return tree.XAggregateTree((x, l, r, _) => nodeF(x, l, r), _ => leafV);
}
}
Now, the usage is quite C#-style:
现在,用法相当 C# 风格:
[TestMethod] // or Console Application:
static void Main(string[] args)
{
// This is our tree:
// 4
// 2 6
// 1 3 5 7
var tree7 = new Tree<int>(4, new Tree<int>(2, new Tree<int>(1), new Tree<int>(3)),
new Tree<int>(6, new Tree<int>(5), new Tree<int>(7)));
var sumTree = tree7.Aggregate((x, l, r) => x + l + r, 0);
Console.WriteLine(sumTree); // 28
Console.ReadLine();
var inOrder = tree7.Aggregate((x, l, r) =>
{
var tmp = new List<int>(l) {x};
tmp.AddRange(r);
return tmp;
}, new List<int>());
inOrder.ForEach(Console.WriteLine); // 1 2 3 4 5 6 7
Console.ReadLine();
var heightTree = tree7.Aggregate((_, l, r) => 1 + (l>r?l:r), 0);
Console.WriteLine(heightTree); // 3
Console.ReadLine();
}
I still like F# more.
我还是更喜欢 F#。
回答by Polymer
Brian's answer in the first paragraph is correct. But his code example doesn't really reflect how one would solve similar problems in a C# style. Consider a simple class node
:
布赖恩在第一段中的回答是正确的。但是他的代码示例并没有真正反映如何以 C# 风格解决类似的问题。考虑一个简单的类node
:
class Node {
public Node Left;
public Node Right;
public int value;
public Node(int v = 0, Node left = null, Node right = null) {
value = v;
Left = left;
Right = right;
}
}
With this we can create a tree in main:
有了这个,我们可以在 main 中创建一棵树:
var Tree =
new Node(4,
new Node(2,
new Node(1),
new Node(3)
),
new Node(6,
new Node(5),
new Node(7)
)
);
We define a generic fold function in Node
's namespace:
我们在Node
的命名空间中定义了一个通用的 fold 函数:
public static R fold<R>(
Func<int, R, R, R> combine,
R leaf_value,
Node tree) {
if (tree == null) return leaf_value;
return
combine(
tree.value,
fold(combine, leaf_value, tree.Left),
fold(combine, leaf_value, tree.Right)
);
}
For catamorphisms we should specify the states of data, Nodes can be null, or have children. The generic parameters determine what we do in either case. Notice the iteration strategy(in this case recursion) is hidden inside the fold function.
对于 catamorphisms 我们应该指定数据的状态,节点可以为空,或者有孩子。通用参数决定了我们在任何一种情况下做什么。注意迭代策略(在这种情况下是递归)隐藏在 fold 函数中。
Now instead of writing:
现在而不是写:
public static int Sum_Tree(Node tree){
if (tree == null) return 0;
var accumulated = tree.value;
accumulated += Sum_Tree(tree.Left);
accumulated += Sum_Tree(tree.Right);
return accumulated;
}
We can write
我们可以写
public static int sum_tree_fold(Node tree) {
return Node.fold(
(x, l, r) => x + l + r,
0,
tree
);
}
Elegant, simple, type checked, maintainable, etc. Easy to use Console.WriteLine(Node.Sum_Tree(Tree));
.
优雅,简单,类型检查,可维护等。易于使用Console.WriteLine(Node.Sum_Tree(Tree));
。
It's easy to add new functionality:
添加新功能很容易:
public static List<int> In_Order_fold(Node tree) {
return Node.fold(
(x, l, r) => {
var tree_list = new List<int>();
tree_list.Add(x);
tree_list.InsertRange(0, l);
tree_list.AddRange(r);
return tree_list;
},
new List<int>(),
tree
);
}
public static int Height_fold(Node tree) {
return Node.fold(
(x, l, r) => 1 + Math.Max(l, r),
0,
tree
);
}
F# wins in the conciseness category for In_Order_fold
but that's to be expected when the language provides dedicated operators for constructing and using lists.
F# 在简洁性类别中获胜,In_Order_fold
但是当该语言提供用于构造和使用列表的专用运算符时,这是可以预料的。
The dramatic difference between C# and F# seems to be due to F#'s use of closures, to act as implicit data structures, for triggering the tail call optimization. The example in Brian's answer also takes in to account optimizations in F#, for dodging reconstructing the tree. I'm not sure C# supports the tail call optimization, and maybe In_Order_fold
could be written better, but neither of these points are relevant when discussing how expressive C# is when dealing with these Catamorphisms.
C# 和 F# 之间的巨大差异似乎是由于 F# 使用闭包作为隐式数据结构来触发尾调用优化。Brian 的回答中的示例还考虑了 F# 中的优化,用于躲避重建树。我不确定 C# 是否支持尾调用优化,也许In_Order_fold
可以写得更好,但是在讨论 C# 在处理这些 Catamorphisms 时的表现力时,这两点都无关紧要。
When translating code between languages, you need to understand the core idea of the technique, and then implement the idea in terms of the language's primitives.
在语言之间翻译代码时,需要理解该技术的核心思想,然后根据语言的原语来实现该思想。
Maybe now you'll be able to convince your C# co-workers to take folds more seriously.
也许现在您将能够说服您的 C# 同事更认真地对待折叠。