C语言 如何输出使用ANTLR构建的AST?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/4931346/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-02 07:44:20  来源:igfitidea点击:

How to output the AST built using ANTLR?

cantlrstatic-analysisabstract-syntax-tree

提问by Raphael

I'm making a static analyzer for C. I have done the lexer and parser using ANTLR in which generates Java code.

我正在为 C 制作一个静态分析器。我已经使用 ANTLR 完成了词法分析器和解析器,其中生成了 Java 代码。

Does ANTLR build the AST for us automatically by options {output=AST;}? Or do I have to make the tree myself? If it does, then how to spit out the nodes on that AST?

ANTLR 会自动为我们构建 ASToptions {output=AST;}吗?还是我必须自己做树?如果是,那么如何吐出该 AST 上的节点?

I am currently thinking that the nodes on that AST will be used for making SSA, followed by data flow analysis in order to make the static analyzer. Am I on the right path?

我目前认为该 AST 上的节点将用于制作 SSA,然后进行数据流分析以制作静态分析器。我在正确的道路上吗?

回答by Bart Kiers

Raphael wrote:

Does antlr build the AST for us automatically by option{output=AST;}? Or do I have to make the tree myself? If it does, then how to spit out the nodes on that AST?

拉斐尔写道:

antlr 是否通过 option{output=AST;} 自动为我们构建 AST?还是我必须自己做树?如果是,那么如何吐出该 AST 上的节点?

No, the parser does not know what you want as root and as leaves for each parser rule, so you'll have to do a bit more than just put options { output=AST; }in your grammar.

不,解析器不知道你想要什么作为每个解析器规则的根和叶,所以你必须做的不仅仅是options { output=AST; }输入你的语法。

For example, when parsing the source "true && (false || true && (true || false))"using the parser generated from the grammar:

例如,当"true && (false || true && (true || false))"使用从语法生成的解析器解析源时:

grammar ASTDemo;

options { 
  output=AST; 
}

parse
  :  orExp
  ;

orExp
  :  andExp ('||' andExp)*
  ;

andExp
  :  atom ('&&' atom)*
  ;

atom
  :  'true'
  |  'false'
  |  '(' orExp ')'
  ;

// ignore white space characters
Space
  :  (' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;}
  ;

the following parse treeis generated:

生成以下解析树

enter image description here

在此处输入图片说明

(i.e. just a flat, 1 dimensional list of tokens)

(即只是一个扁平的一维令牌列表)

You'll want to tell ANTLR which tokens in your grammar become root, leaves, or simply left out of the tree.

您需要告诉 ANTLR 语法中的哪些标记成为树的根、叶或简单地被排除在树之外。

Creating AST's can be done in two ways:

可以通过两种方式创建 AST:

  1. use rewrite rules which look like this: foo : A B C D -> ^(D A B);, where foois a parser rule that matches the tokens A B C D. So everything after the ->is the actual rewrite rule. As you can see, the token Cis not used in the rewrite rule, which means it is omitted from the AST. The token placed directly after the ^(will become the root of the tree;
  2. use the tree-operators ^and !aftera token inside your parser rules where ^will make a token the root, and !will delete a token from the tree. The equivalent for foo : A B C D -> ^(D A B);would be foo : A B C! D^;
  1. 使用如下所示的重写规则:foo : A B C D -> ^(D A B);,其中foo是匹配标记的解析器规则A B C D。所以后面的一切->都是实际的重写规则。如您所见,C重写规则中未使用该令牌,这意味着它从 AST 中被省略了。紧接在 之后的令牌^(将成为树的根;
  2. 使用树操作符^和解析器规则中的标记!之后^将标记作为根,并!从树中删除标记。等效于foo : A B C D -> ^(D A B);foo : A B C! D^;

Both foo : A B C D -> ^(D A B);and foo : A B C! D^;will produce the following AST:

双方foo : A B C D -> ^(D A B);foo : A B C! D^;会产生以下的AST:

enter image description here

在此处输入图片说明

Now, you could rewrite the grammar as follows:

现在,您可以按如下方式重写语法:

grammar ASTDemo;

options { 
  output=AST; 
}

parse
  :  orExp
  ;

orExp
  :  andExp ('||'^ andExp)* // Make `||` root
  ;

andExp
  :  atom ('&&'^ atom)* // Make `&&` root
  ;

atom
  :  'true'
  |  'false'
  |  '(' orExp ')' -> orExp // Just a single token, no need to do `^(...)`, 
                            // we're removing the parenthesis. Note that
                            // `'('! orExp ')'!` will do exactly the same.
  ;

// ignore white space characters
Space
  :  (' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;}
  ;

which will create the following ASTfrom the source "true && (false || true && (true || false))":

这将从源创建以下 AST"true && (false || true && (true || false))"

enter image description here

在此处输入图片说明

Related ANTLR wiki links:

相关的 ANTLR 维基链接:

Raphael wrote:

I am currently thinking that the nodes on that AST will be used for making SSA, followed by data flow analysis in order to make the static analyzer. Am I on the right path?

拉斐尔写道:

我目前认为该 AST 上的节点将用于制作 SSA,然后进行数据流分析以制作静态分析器。我在正确的道路上吗?

Never did anything like that, but IMO the first thing you'd want is an AST from the source, so yeah, I guess your on the right path! :)

从来没有做过这样的事情,但是 IMO 你想要的第一件事是来自源头的 AST,所以是的,我猜你走在正确的道路上!:)

EDIT

编辑

Here's how you can use the generated lexer and parser:

以下是使用生成的词法分析器和解析器的方法:

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
  public static void main(String[] args) throws Exception {
    String src = "true && (false || true && (true || false))";
    ASTDemoLexer lexer = new ASTDemoLexer(new ANTLRStringStream(src));
    ASTDemoParser parser = new ASTDemoParser(new CommonTokenStream(lexer));
    CommonTree tree = (CommonTree)parser.parse().getTree();
    DOTTreeGenerator gen = new DOTTreeGenerator();
    StringTemplate st = gen.toDOT(tree);
    System.out.println(st);
  }
}