表达评估和使用多态性的树木漫步? (ala Steve Yegge)

时间:2020-03-05 18:39:55  来源:igfitidea点击:

今天早上,我在读史蒂夫·叶格的《当多态性失败时》,当时我遇到一个问题,他的同事曾经问过潜在的员工,当他们来亚马逊进行面试时。

As an example of polymorphism in
  action, let's look at the classic
  "eval" interview question, which (as
  far as I know) was brought to Amazon
  by Ron Braunstein. The question is
  quite a rich one, as it manages to
  probe a wide variety of important
  skills: OOP design, recursion, binary
  trees, polymorphism and runtime
  typing, general coding skills, and (if
  you want to make it extra hard)
  parsing theory.
  
  At some point, the candidate hopefully
  realizes that you can represent an
  arithmetic expression as a binary
  tree, assuming you're only using
  binary operators such as "+", "-",
  "*", "/". The leaf nodes are all
  numbers, and the internal nodes are
  all operators. Evaluating the
  expression means walking the tree. If
  the candidate doesn't realize this,
  you can gently lead them to it, or if
  necessary, just tell them.
  
  Even if you tell them, it's still an
  interesting problem.
  
  The first half of the question, which
  some people (whose names I will
  protect to my dying breath, but their
  initials are Willie Lewis) feel is a
  Job Requirement If You Want To Call
  Yourself A Developer And Work At
  Amazon, is actually kinda hard. The
  question is: how do you go from an
  arithmetic expression (e.g. in a
  string) such as "2 + (2)" to an
  expression tree. We may have an ADJ
  challenge on this question at some
  point.
  
  The second half is: let's say this is
  a 2-person project, and your partner,
  who we'll call "Willie", is
  responsible for transforming the
  string expression into a tree. You get
  the easy part: you need to decide what
  classes Willie is to construct the
  tree with. You can do it in any
  language, but make sure you pick one,
  or Willie will hand you assembly
  language. If he's feeling ornery, it
  will be for a processor that is no
  longer manufactured in production.
  
  You'd be amazed at how many candidates
  boff this one.
  
  I won't give away the answer, but a
  Standard Bad Solution involves the use
  of a switch or case statment (or just
  good old-fashioned cascaded-ifs). A
  Slightly Better Solution involves
  using a table of function pointers,
  and the Probably Best Solution
  involves using polymorphism. I
  encourage you to work through it
  sometime. Fun stuff!

因此,让我们尝试以三种方式解决该问题。我们如何使用级联if,函数指针表和/或者多态性从算术表达式(例如,字符串中)(例如" 2 +(2)")转换为表达式树?

随意解决一个,两个或者全部三个问题。

[更新:标题经过修改,可以更好地匹配大多数答案。]

解决方案

回答

String Tokenizer + LL(1)解析器将为我们提供一个表达式树...多态方法可能涉及一个带有" evaluate(a,b)"功能的抽象算术类,该类将为所涉及的每个运算符覆盖(加法,减去等)以返回适当的值,并且树包含Integers和Arithmetic运算符,可以通过对树进行post(?)遍历来对其求值。

回答

应该使用imo功能语言。树很难用OO语言表示和操纵。

回答

Or maybe this is the real question:
  how can you represent (2) as a BST?
  That is the part that is tripping me
  up.

递归。

回答

The problem, I think, is that we need to parse perentheses, and yet they are not a binary operator? Should we take (2) as a single token, that evaluates to 2?

括号不需要出现在表达式树中,但是会影响其形状。例如,(1 + 2)+3的树不同于1+(2 + 3):

+
   / \
  +   3
 / \
1   2

相对

+
   / \
  1   +
     / \
    2   3

括号是解析器的"提示"(例如,对于每个superjoe30,"递归地下降")

回答

回复:贾斯汀

我认为这棵树看起来像这样:

+
 / \
2  ( )
    |
    2

基本上,我们将有一个"评估"节点,该节点仅评估其下方的树。然后可以将其优化为:

+
 / \
2   2

在这种情况下,不需要括号并且不添加任何内容。他们在逻辑上没有添加任何内容,因此就走了。

回答

这进入了解析/编译器理论,这有点像兔子洞...《龙书》是编译器构造的标准文字,并将其运用于极端。在这种特殊情况下,我们想为基本算术构造一个无上下文语法,然后使用该语法解析出抽象语法树。然后,我们可以遍历树,从下至上减少它(这时我们将应用多态性/函数指针/ switch语句来减少树)。

我发现这些说明在编译器和解析理论方面非常有用。

回答

多态树行走,Python版本

#!/usr/bin/python

class Node:
    """base class, you should not process one of these"""
    def process(self):
        raise('you should not be processing a node')

class BinaryNode(Node):
    """base class for binary nodes"""
    def __init__(self, _left, _right):
        self.left = _left
        self.right = _right
    def process(self):
        raise('you should not be processing a binarynode')

class Plus(BinaryNode):
    def process(self):
        return self.left.process() + self.right.process()

class Minus(BinaryNode):
    def process(self):
        return self.left.process() - self.right.process()

class Mul(BinaryNode):
    def process(self):
        return self.left.process() * self.right.process()

class Div(BinaryNode):
    def process(self):
        return self.left.process() / self.right.process()

class Num(Node):
    def __init__(self, _value):
        self.value = _value
    def process(self):
        return self.value

def demo(n):
    print n.process()

demo(Num(2))                                       # 2
demo(Plus(Num(2),Num(5)))                          # 2 + 3
demo(Plus(Mul(Num(2),Num(3)),Div(Num(10),Num(5)))) # (2 * 3) + (10 / 2)

测试只是通过使用构造函数来构建二叉树。

程序结构:

抽象基类:Node

  • 所有节点都从此类继承

抽象基类:BinaryNode

  • 所有二进制运算符都从此类继承
  • 处理方法负责评估表达式并返回结果

二进制运算符类:Plus,Minus,Mul,Div

  • 两个子节点,每个子节点分别用于左侧和右侧子表达式

编号等级:Num

  • 拥有叶节点的数值,例如17或者42

回答

@贾斯汀:

看看我关于表示节点的注释。如果我们使用该方案,那么

2 + (2)

可以表示为

.
          / \
         2  ( )
             |
             2

回答

表示节点

如果要包含括号,则需要5种节点:

  • 二进制节点:添加减号Mul Div这些有两个子节点,左侧和右侧
+
    / \
node   node
  • 一个拥有值的节点:Val没有子节点,只是一个数值
  • 一个节点以保持对路径的了解:对子表达式进行单个子节点的对等
( )
     |
    node

对于多态解决方案,我们需要具有这种类关系:

  • 节点
  • BinaryNode:从Node继承
  • 加:从二进制节点继承
  • 减号:从二进制节点继承
  • Mul:从Binary Node继承
  • Div:从Binary Node继承
  • 值:从Node继承
  • Paren:从节点继承

所有节点都有一个称为eval()的虚函数。如果调用该函数,它将返回该子表达式的值。

回答

I won't give away the answer, but a
  Standard Bad Solution involves the use
  of a switch or case statment (or just
  good old-fashioned cascaded-ifs). A
  Slightly Better Solution involves
  using a table of function pointers,
  and the Probably Best Solution
  involves using polymorphism.

解释器的最后二十年的发展可以看作是将多态性(例如朴素的Smalltalk元圆解释器)转换成函数指针(朴素的Lisp实现,线程代码,C ++)再转换(朴素的字节码解释器),然后转向JIT等等需要非常大的类,或者(使用多态语言)(双态)双重调度,从而将多态性减少到一个典型的情况下,我们就回到了第一阶段。此处使用的"最佳"定义是什么?

对于简单的东西,多态解决方案是可以的,这是我之前提出的解决方案,但是如果要绘制一个具有数千个数据点的函数,那么堆栈和字节码/切换或者利用运行时的编译器通常会更好。

回答

我认为问题在于如何编写解析器,而不是评估器。或者更确切地说,如何从字符串创建表达式树。

返回基类的Case语句不完全计数。

"多态"解决方案的基本结构(换句话说,我不在乎我们用什么构建它,我只想通过重写尽可能少的代码来扩展它)就从一个对象序列反序列化了一个对象层次结构。具有一组(动态)已知类型的流。

实现多态解决方案的关键是要有一种从模式匹配器(可能是递归的)创建表达式对象的方法。即,将BNF或者类似语法映射到对象工厂。

回答

正如人们之前提到的,使用表达式树时不需要括号。当我们查看表达式树时,操作顺序变得不那么明显。括号是对解析器的提示。

虽然可接受的答案是解决一半问题的方法,但另一半实际上无法解析表达式的解决方案。通常,可以使用递归下降解析器解决这类问题。编写这样的解析器通常是一个有趣的练习,但是大多数用于语言解析的现代工具都会为我们提供抽象的工具。

如果在字符串中允许使用浮点数,则解析器的难度也将大大提高。我必须创建一个DFA才能接受C语言中的浮点数-这是一项非常艰苦而细致的任务。请记住,有效的浮点数包括:10、10、10.123、9.876e-5、1.0f,.025等。我认为对此进行了一些分配(有利于简洁)。

回答

嗯...我认为我们不能为此编写自上而下的解析器而无需回溯,因此它必须是某种移位减少解析器。 LR(1)甚至LALR当然可以在以下(临时)语言定义中正常工作:

开始-> E1
E1-> E1 + E1 | E1-E1
E1-> E2 * E2 | E2 / E2 | E2
E2->数字| (E1)

为了将*和/优先于+和-,将其分为E1和E2是必要的。

但这是我必须手工编写解析器的方式:

  • 两个堆栈,一个存储树的节点作为操作数,一个存储运算符
  • 从左到右读取输入,使数字成为叶节点,然后将其推入操作数堆栈。
  • 如果堆栈上有> = 2个操作数,请弹出2,将它们与操作员堆栈中最顶层的操作符组合在一起,然后将此结构推回到操作数树中,除非
  • 下一个运算符的优先级高于当前堆栈顶部的那个运算符。

这给我们留下了处理括号的问题。一种优雅的解决方案(我认为)是将每个运算符的优先级作为数字存储在变量中。所以最初

  • int plus,减= 1;
  • int mul,div = 2;

现在,每当我们看到一个左括号时,将所有这些变量加2,并且每当我们看到一个右括号时,将所有变量减2.

这将确保3 (4 + 5)中的+优先于,并且3 * 4不会被压入堆栈。相反,它将等待5,按4 + 5,然后按3 *(4 + 5)。

回答

我已经用一些基本技术编写了这样的解析器,例如
中缀-> RPN和
调车场和
树遍历。
这是我想出的实现。
它是用C ++编写的,并且可以在Linux和Windows上进行编译。
任何建议/问题都受到欢迎。

So, let's try to tackle the problem all three ways. How do you go from an arithmetic  expression (e.g. in a string) such as "2 + (2)" to an expression tree using cascaded-if's, a table of function pointers, and/or polymorphism?

这很有趣,但是我不认为这属于面向对象编程的领域...我认为它与解析技术有更多关系。

回答

我有点喜欢这个cconsole应用程序,只是有点概念上的证明。感觉可能会好得多(GetNode中的switch语句有点笨拙(在这里,我试图将类名映射到运算符时碰到了空白))任何有关如何改进它的建议都非常欢迎。

using System;

class Program
{
    static void Main(string[] args)
    {
        string expression = "(((3.5 * 4.5) / (1 + 2)) + 5)";
        Console.WriteLine(string.Format("{0} = {1}", expression, new Expression.ExpressionTree(expression).Value));
        Console.WriteLine("\nShow's over folks, press a key to exit");
        Console.ReadKey(false);
    }
}

namespace Expression
{
    // -------------------------------------------------------

    abstract class NodeBase
    {
        public abstract double Value { get; }
    }

    // -------------------------------------------------------

    class ValueNode : NodeBase
    {
        public ValueNode(double value)
        {
            _double = value;
        }

        private double _double;
        public override double Value
        {
            get
            {
                return _double;
            }
        }
    }

    // -------------------------------------------------------

    abstract class ExpressionNodeBase : NodeBase
    {
        protected NodeBase GetNode(string expression)
        {
            // Remove parenthesis
            expression = RemoveParenthesis(expression);

            // Is expression just a number?
            double value = 0;
            if (double.TryParse(expression, out value))
            {
                return new ValueNode(value);
            }
            else
            {
                int pos = ParseExpression(expression);
                if (pos > 0)
                {
                    string leftExpression = expression.Substring(0, pos - 1).Trim();
                    string rightExpression = expression.Substring(pos).Trim();

                    switch (expression.Substring(pos - 1, 1))
                    {
                        case "+":
                            return new Add(leftExpression, rightExpression);
                        case "-":
                            return new Subtract(leftExpression, rightExpression);
                        case "*":
                            return new Multiply(leftExpression, rightExpression);
                        case "/":
                            return new Divide(leftExpression, rightExpression);
                        default:
                            throw new Exception("Unknown operator");
                    }
                }
                else
                {
                    throw new Exception("Unable to parse expression");
                }
            }
        }

        private string RemoveParenthesis(string expression)
        {
            if (expression.Contains("("))
            {
                expression = expression.Trim();

                int level = 0;
                int pos = 0;

                foreach (char token in expression.ToCharArray())
                {
                    pos++;
                    switch (token)
                    {
                        case '(':
                            level++;
                            break;
                        case ')':
                            level--;
                            break;
                    }

                    if (level == 0)
                    {
                        break;
                    }
                }

                if (level == 0 && pos == expression.Length)
                {
                    expression = expression.Substring(1, expression.Length - 2);
                    expression = RemoveParenthesis(expression);
                }
            }
            return expression;
        }

        private int ParseExpression(string expression)
        {
            int winningLevel = 0;
            byte winningTokenWeight = 0;
            int winningPos = 0;

            int level = 0;
            int pos = 0;

            foreach (char token in expression.ToCharArray())
            {
                pos++;

                switch (token)
                {
                    case '(':
                        level++;
                        break;
                    case ')':
                        level--;
                        break;
                }

                if (level <= winningLevel)
                {
                    if (OperatorWeight(token) > winningTokenWeight)
                    {
                        winningLevel = level;
                        winningTokenWeight = OperatorWeight(token);
                        winningPos = pos;
                    }
                }
            }
            return winningPos;
        }

        private byte OperatorWeight(char value)
        {
            switch (value)
            {
                case '+':
                case '-':
                    return 3;
                case '*':
                    return 2;
                case '/':
                    return 1;
                default:
                    return 0;
            }
        }
    }

    // -------------------------------------------------------

    class ExpressionTree : ExpressionNodeBase
    {
        protected NodeBase _rootNode;

        public ExpressionTree(string expression)
        {
            _rootNode = GetNode(expression);
        }

        public override double Value
        {
            get
            {
                return _rootNode.Value;
            }
        }
    }

    // -------------------------------------------------------

    abstract class OperatorNodeBase : ExpressionNodeBase
    {
        protected NodeBase _leftNode;
        protected NodeBase _rightNode;

        public OperatorNodeBase(string leftExpression, string rightExpression)
        {
            _leftNode = GetNode(leftExpression);
            _rightNode = GetNode(rightExpression);

        }
    }

    // -------------------------------------------------------

    class Add : OperatorNodeBase
    {
        public Add(string leftExpression, string rightExpression)
            : base(leftExpression, rightExpression)
        {
        }

        public override double Value
        {
            get
            {
                return _leftNode.Value + _rightNode.Value;
            }
        }
    }

    // -------------------------------------------------------

    class Subtract : OperatorNodeBase
    {
        public Subtract(string leftExpression, string rightExpression)
            : base(leftExpression, rightExpression)
        {
        }

        public override double Value
        {
            get
            {
                return _leftNode.Value - _rightNode.Value;
            }
        }
    }

    // -------------------------------------------------------

    class Divide : OperatorNodeBase
    {
        public Divide(string leftExpression, string rightExpression)
            : base(leftExpression, rightExpression)
        {
        }

        public override double Value
        {
            get
            {
                return _leftNode.Value / _rightNode.Value;
            }
        }
    }

    // -------------------------------------------------------

    class Multiply : OperatorNodeBase
    {
        public Multiply(string leftExpression, string rightExpression)
            : base(leftExpression, rightExpression)
        {
        }

        public override double Value
        {
            get
            {
                return _leftNode.Value * _rightNode.Value;
            }
        }
    }
}