在 C# 中行走 ANTLR AST 的教程?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/887205/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Tutorial for walking ANTLR ASTs in C#?
提问by kpozin
Is anyone aware of tutorials for walking ANTLR-generated ASTs in C#? The closest I was able to find is this, but it's not terribly helpful.
有没有人知道在 C# 中行走 ANTLR 生成的 AST 的教程?我能找到的最接近的是this,但这并不是很有帮助。
My goal is to walk through trees that I'm generating based on a domain-specific language that I'm working on, and to use the trees to output generated C# code.
我的目标是遍历我基于我正在研究的特定领域语言生成的树,并使用这些树输出生成的 C# 代码。
A Java-based tutorial would be helpful, too -- anything that provides clear examples of how to traverse ANTLR ASTs.
一个基于 Java 的教程也会有帮助——任何提供如何遍历 ANTLR AST 的清晰示例的东西。
采纳答案by kpozin
I managed to figure this out by adapting the example at the end of Manuel Abadia's article.
我设法通过改编Manuel Abadia 文章末尾的例子来解决这个问题。
Here's my version, which I happen to be using to convert parsed code to C#. These are the steps:
这是我的版本,我碰巧用来将解析的代码转换为 C#。这些是步骤:
- Instantiate an ANTLRStringStreamor subclass with your input (it can be a file or string).
- Instantiate your generated lexer, passing in that string stream.
- Instantiate a token stream with the lexer.
- Instantiate your parser with that token stream.
- Get the top-level value from your parser, and turn it into a
CommonTree
. - Traverse the tree:
- 使用您的输入实例化ANTLRStringStream或子类(它可以是文件或字符串)。
- 实例化您生成的词法分析器,传入该字符串流。
- 使用词法分析器实例化令牌流。
- 使用该令牌流实例化您的解析器。
- 从解析器中获取顶级值,并将其转换为
CommonTree
. - 遍历树:
To get the literal text of a node, use node.Text
.
To get the token name of a node, use node.Token.Text
.
要获取节点的文字文本,请使用node.Text
. 要获取节点的令牌名称,请使用node.Token.Text
.
Note that node.Token.Text
will only give you the actual name of your token if it's an imaginary token with no corresponding string. If it's a real token, then node.Token.Text
will return its string.
请注意,node.Token.Text
如果它是一个没有相应字符串的虚构令牌,它只会为您提供令牌的实际名称。如果它是一个真正的令牌,node.Token.Text
则将返回其字符串。
For example, if you had the following in your grammar:
例如,如果您的语法中有以下内容:
tokens { PROGRAM, FUNCDEC }
EQUALS : '==';
ASSIGN : '=';
Then you'll get "PROGRAM"
, "FUNCDEC"
, "=="
, and "="
from the corresponding accesses of node.Token.Text
.
然后您将获得"PROGRAM"
, "FUNCDEC"
, "=="
, 和"="
的相应访问node.Token.Text
。
You can see part of my example below, or you can browse the full version.
你可以在下面看到我的例子的一部分,或者你可以浏览完整版。
public static string Convert(string input)
{
ANTLRStringStream sStream = new ANTLRStringStream(input);
MyGrammarLexer lexer = new MyGrammarLexer(sStream);
CommonTokenStream tStream = new CommonTokenStream(lexer);
MyGrammarParser parser = new MyGrammarParser (tStream);
MyGrammarParser.program_return parserResult = parser.program();
CommonTree ast = (CommonTree)parserResult.Tree;
Print(ast);
string output = header + body + footer;
return output;
}
public static void PrintChildren(CT ast)
{
PrintChildren(ast, " ", true);
}
public static void PrintChildren(CT ast, string delim, bool final)
{
if (ast.Children == null)
{
return;
}
int num = ast.Children.Count;
for (int i = 0; i < num; ++i)
{
CT d = (CT)(ast.Children[i]);
Print(d);
if (final || i < num - 1)
{
body += delim;
}
}
}
public static void Print(CommonTree ast)
{
switch (ast.Token.Text)
{
case "PROGRAM":
//body += header;
PrintChildren(ast);
//body += footer;
break;
case "GLOBALS":
body += "\r\n\r\n// GLOBALS\r\n";
PrintChildren(ast);
break;
case "GLOBAL":
body += "public static ";
PrintChildren(ast);
body += ";\r\n";
break;
....
}
}
回答by Barry Kelly
Normally you walk ASTs with recursion, and perform different actions based on the kind of the node. If you're using polymorphic tree nodes (i.e. different subclasses for different nodes in the tree), then double-dispatch in the Visitor pattern may be appropriate; however, that's usually not very convenient with Antlr.
通常,您使用递归遍历 AST,并根据节点的类型执行不同的操作。如果您使用的是多态树节点(即树中不同节点的不同子类),那么访问者模式中的双调度可能是合适的;然而,这对于 Antlr 来说通常不是很方便。
In pseudocode, walking usually looks somewhat like this:
在伪代码中,步行通常看起来有点像这样:
func processTree(t)
case t.Type of
FOO: processFoo t
BAR: processBar t
end
// a post-order process
func processFoo(foo)
// visit children
for (i = 0; i < foo.ChildCount; ++i)
processTree(foo.GetChild(i))
// visit node
do_stuff(foo.getText())
// a pre-order process
func processBoo(bar)
// visit node
do_stuff(bar.getText())
// visit children
for (i = 0; i < foo.ChildCount; ++i)
processTree(foo.GetChild(i))
The kinds of processing are highly dependent on the semantics of the language. For example, handling an IF
statement, with structure (IF <predicate> <if-true> [<if-false>])
, when generating code for a stack machine like the JVM or CLR, might look somewhat like this:
处理的种类高度依赖于语言的语义。例如,在为 JVM 或 CLR 等堆栈机器生成代码时,处理IF
具有结构的语句(IF <predicate> <if-true> [<if-false>])
可能看起来像这样:
func processIf(n)
predicate = n.GetChild(0)
processExpr(predicate) // get predicate value on stack
falseLabel = createLabel()
genCode(JUMP_IF_FALSE, falseLabel) // JUMP_IF_FALSE is called brfalse in CLR,
// ifeq in JVM
if_true = n.GetChild(1)
processStmt(if_true)
if_false = n.ChildCount > 2 ? n.GetChild(2) : null
if (if_false != null)
doneLabel = createLabel()
genCode(JUMP, doneLabel)
markLabel(falseLabel)
if (if_false != null)
processStmt(if_false) // if-false branch
markLabel(doneLabel)
Generally everything is done recursively depending on the type of the current node, etc.
通常,一切都是根据当前节点的类型等递归完成的。
回答by Scott Stanchfield
You should look into writing a TreeParser; it can make the job of interpreting the tree much simpler.
您应该考虑编写一个 TreeParser;它可以使解释树的工作变得更加简单。
For ANTLR 2.x see http://www.antlr2.org/doc/sor.htmlFor ANTLR 3.x see http://www.antlr.org/wiki/display/ANTLR3/Tree+construction(java-based parser and tree parser example)
对于 ANTLR 2.x,请参见http://www.antlr2.org/doc/sor.html对于 ANTLR 3.x,请参见http://www.antlr.org/wiki/display/ANTLR3/Tree+construction(基于 java解析器和树解析器示例)
回答by Jasper Floor
I did something similar (but not really) and I ended up with a TreeParser.
我做了类似的事情(但不是真的),最后我得到了一个 TreeParser。
I also suggest buying the ANTLR book. I found it to be more valuable than any web resource. It may not have all the answers but it sure helps with the basics.
我还建议购买 ANTLR 书。我发现它比任何网络资源都更有价值。它可能没有所有的答案,但它肯定有助于基础知识。