C语言如何输出使用ANTLR构建的AST？

Question

提问by Raphael

I'm making a static analyzer for C. I have done the lexer and parser using ANTLR in which generates Java code.

我正在为 C 制作一个静态分析器。我已经使用 ANTLR 完成了词法分析器和解析器，其中生成了 Java 代码。

Does ANTLR build the AST for us automatically by options {output=AST;}? Or do I have to make the tree myself? If it does, then how to spit out the nodes on that AST?

ANTLR 会自动为我们构建 ASToptions {output=AST;}吗？还是我必须自己做树？如果是，那么如何吐出该 AST 上的节点？

I am currently thinking that the nodes on that AST will be used for making SSA, followed by data flow analysis in order to make the static analyzer. Am I on the right path?

我目前认为该 AST 上的节点将用于制作 SSA，然后进行数据流分析以制作静态分析器。我在正确的道路上吗？

Answer 1

回答by Bart Kiers

Raphael wrote:
Does antlr build the AST for us automatically by option{output=AST;}? Or do I have to make the tree myself? If it does, then how to spit out the nodes on that AST?

拉斐尔写道：
antlr 是否通过 option{output=AST;} 自动为我们构建 AST？还是我必须自己做树？如果是，那么如何吐出该 AST 上的节点？

No, the parser does not know what you want as root and as leaves for each parser rule, so you'll have to do a bit more than just put options { output=AST; }in your grammar.

不，解析器不知道你想要什么作为每个解析器规则的根和叶，所以你必须做的不仅仅是options { output=AST; }输入你的语法。

For example, when parsing the source "true && (false || true && (true || false))"using the parser generated from the grammar:

例如，当"true && (false || true && (true || false))"使用从语法生成的解析器解析源时：

grammar ASTDemo;

options { 
  output=AST; 
}

parse
  :  orExp
  ;

orExp
  :  andExp ('||' andExp)*
  ;

andExp
  :  atom ('&&' atom)*
  ;

atom
  :  'true'
  |  'false'
  |  '(' orExp ')'
  ;

// ignore white space characters
Space
  :  (' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;}
  ;

the following parse treeis generated:

生成以下解析树：

enter image description here

在此处输入图片说明

(i.e. just a flat, 1 dimensional list of tokens)

（即只是一个扁平的一维令牌列表）

You'll want to tell ANTLR which tokens in your grammar become root, leaves, or simply left out of the tree.

您需要告诉 ANTLR 语法中的哪些标记成为树的根、叶或简单地被排除在树之外。

Creating AST's can be done in two ways:

可以通过两种方式创建 AST：

use rewrite rules which look like this: foo : A B C D -> ^(D A B);, where foois a parser rule that matches the tokens A B C D. So everything after the ->is the actual rewrite rule. As you can see, the token Cis not used in the rewrite rule, which means it is omitted from the AST. The token placed directly after the ^(will become the root of the tree;
use the tree-operators ^and !aftera token inside your parser rules where ^will make a token the root, and !will delete a token from the tree. The equivalent for foo : A B C D -> ^(D A B);would be foo : A B C! D^;

使用如下所示的重写规则：foo : A B C D -> ^(D A B);，其中foo是匹配标记的解析器规则A B C D。所以后面的一切->都是实际的重写规则。如您所见，C重写规则中未使用该令牌，这意味着它从 AST 中被省略了。紧接在之后的令牌^(将成为树的根；
使用树操作符^和解析器规则中的标记!之后，^将标记作为根，并!从树中删除标记。等效于foo : A B C D -> ^(D A B);将foo : A B C! D^;

Both foo : A B C D -> ^(D A B);and foo : A B C! D^;will produce the following AST:

双方foo : A B C D -> ^(D A B);并foo : A B C! D^;会产生以下的AST：

enter image description here

在此处输入图片说明

Now, you could rewrite the grammar as follows:

现在，您可以按如下方式重写语法：

grammar ASTDemo;

options { 
  output=AST; 
}

parse
  :  orExp
  ;

orExp
  :  andExp ('||'^ andExp)* // Make `||` root
  ;

andExp
  :  atom ('&&'^ atom)* // Make `&&` root
  ;

atom
  :  'true'
  |  'false'
  |  '(' orExp ')' -> orExp // Just a single token, no need to do `^(...)`, 
                            // we're removing the parenthesis. Note that
                            // `'('! orExp ')'!` will do exactly the same.
  ;

// ignore white space characters
Space
  :  (' ' | '\t' | '\r' | '\n') {$channel=HIDDEN;}
  ;

which will create the following ASTfrom the source "true && (false || true && (true || false))":

这将从源创建以下 AST"true && (false || true && (true || false))"：

enter image description here

在此处输入图片说明

EDIT

编辑

Here's how you can use the generated lexer and parser:

以下是使用生成的词法分析器和解析器的方法：

import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;

public class Main {
  public static void main(String[] args) throws Exception {
    String src = "true && (false || true && (true || false))";
    ASTDemoLexer lexer = new ASTDemoLexer(new ANTLRStringStream(src));
    ASTDemoParser parser = new ASTDemoParser(new CommonTokenStream(lexer));
    CommonTree tree = (CommonTree)parser.parse().getTree();
    DOTTreeGenerator gen = new DOTTreeGenerator();
    StringTemplate st = gen.toDOT(tree);
    System.out.println(st);
  }
}

C语言如何输出使用ANTLR构建的AST？

提问by Raphael

回答by Bart Kiers

EDIT

编辑

相关推荐

最近更新

标签

C语言 如何输出使用ANTLR构建的AST？

提问by Raphael

回答by Bart Kiers

EDIT

编辑

相关推荐

C语言 unsigned long long 类型以十六进制格式打印

C语言 C中printf函数的代码

C语言 “阻塞系统调用”是什么意思？

C语言 如何在C中暂停？

相关推荐

最近更新

标签

C语言如何输出使用ANTLR构建的AST？

C语言如何在C中暂停？