Java: How to implement JJTree on a grammar
Original question: http://stackoverflow.com/questions/13902239/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How to implement JJTree on grammar
Asked by Aimee Jones
I have an assignment to use JavaCC to make a top-down parser with semantic analysis for a language supplied by the lecturer. I have the production rules written out with no errors. I'm completely stuck on how to use JJTree with my code, and my hours of scouring the internet for tutorials haven't gotten me anywhere. Could anyone take some time to explain how to implement JJTree in the code? Or if there's a hidden step-by-step tutorial out there somewhere, that would be a great help!
Here are some of my production rules in case they help. Thanks in advance!
void program() : {}
{
    (decl())* (function())* main_prog()
}

void decl() #void : {}
{
    (
        var_decl() | const_decl()
    )
}

void var_decl() #void : {}
{
    <VAR> ident_list() <COLON> type()
    (<COMMA> ident_list() <COLON> type())* <SEMIC>
}

void const_decl() #void : {}
{
    <CONSTANT> identifier() <COLON> type() <EQUAL> expression()
    ( <COMMA> identifier() <COLON> type() <EQUAL > expression())* <SEMIC>
}

void function() #void : {}
{
    type() identifier() <LBR> param_list() <RBR>
    <CBL>
    (decl())*
    (statement() <SEMIC> )*
    returnRule() (expression() | {} )<SEMIC>
    <CBR>
}
Answered by Bart Kiers
Creating an AST using JavaCC looks a lot like creating a "normal" parser (defined in a jj file). If you already have a working grammar, it's (relatively) easy :)
Here are the steps needed to create an AST:
- rename your jj grammar file to jjt
- decorate it with root-labels (the italic words are my own terminology...)
- invoke jjtree on your jjt grammar, which will generate a jj file for you
- invoke javacc on your generated jj grammar
- compile the generated java source files
- test it
Here's a quick step-by-step tutorial, assuming you're using MacOS or *nix, have the javacc.jar file in the same directory as your grammar file(s), and java and javac are on your system's PATH:
1
Assuming your jj grammar file is called TestParser.jj, rename it:
mv TestParser.jj TestParser.jjt
2
Now the tricky part: decorating your grammar so that the proper AST structure is created. You decorate an AST (or node, or production rule (all the same)) by adding a # followed by an identifier after it (and before the :). In your original question, you have #void on a lot of productions. In JJTree, #void means that no node is created for that production, so those rules leave no trace in the tree: this is not what you want.
If you don't decorate your production, the name of the production is used as the type of the node (so, you can remove the #void):
void decl() :
{}
{
    var_decl()
  | const_decl()
}
Now the rule creates a decl node whose only child is whatever AST the rule var_decl() or const_decl() produced.
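To make the difference between an undecorated production and a #void production concrete, here is a minimal Java sketch that only simulates the node stack as I understand JJTree to use it; it does not call any generated code. The class NodeScopeSketch, its Node class, and the production() helper are all made up for illustration. A named production adopts every node that was pushed while it ran, while a null name stands in for #void and leaves the children on the stack for the enclosing rule.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class NodeScopeSketch {

    /** Hypothetical stand-in for a tree node: just a name and its children. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) { this.name = name; }

        @Override
        public String toString() {
            return children.isEmpty() ? name : name + children;
        }
    }

    // Stand-in for the parser's node stack (an assumption, not the generated class).
    private final Deque<Node> stack = new ArrayDeque<>();

    /**
     * Runs one "production". A non-null name creates a node that adopts every
     * node pushed while the body ran. A null name mimics #void: no node is
     * created, so the children simply stay on the stack.
     */
    void production(String name, Runnable body) {
        int mark = stack.size();
        body.run();
        if (name == null) {
            return;
        }
        Node node = new Node(name);
        while (stack.size() > mark) {
            node.children.add(0, stack.pop()); // keep source order
        }
        stack.push(node);
    }

    /** Simulates program -> decl -> var_decl, with decl named declName (null = #void). */
    static String parse(String declName) {
        NodeScopeSketch s = new NodeScopeSketch();
        s.production("PROGRAM", () ->
            s.production(declName, () ->
                s.production("VAR", () -> {
                    s.production("ID", () -> {});
                    s.production("ID", () -> {});
                    s.production("EXPR", () -> {});
                })));
        return s.stack.peek().toString();
    }

    public static void main(String[] args) {
        System.out.println(parse("decl")); // PROGRAM[decl[VAR[ID, ID, EXPR]]]
        System.out.println(parse(null));   // PROGRAM[VAR[ID, ID, EXPR]]
    }
}
```

So if you would rather not have the intermediate decl level in your tree, #void on decl alone is a reasonable choice; it only becomes a problem when every production is #void.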
Let's now have a look at the (simplified) var_decl rule:
void var_decl() #VAR :
{}
{
    <VAR> id() <COL> id() <EQ> expr() <SCOL>
}

void id() #ID :
{}
{
    <ID>
}

void expr() #EXPR :
{}
{
    <ID>
}
which I decorated with the #VAR type. This now means that this rule will return the following tree structure:
      VAR
     / | \
    /  |  \
  ID  ID  EXPR
As you can see, the terminals are discarded from the AST! This also means that the id and expr rules lose the text their <ID> terminal matched. Of course, this is not what you want. For the rules that need to keep the inner text the terminal matched, you need to explicitly set the .value of the tree to the .image of the matched terminal:
void id() #ID :
{Token t;}
{
    t=<ID> {jjtThis.value = t.image;}
}

void expr() #EXPR :
{Token t;}
{
    t=<ID> {jjtThis.value = t.image;}
}
causing the input "var x : int = i;" to look like this:
            VAR
             |
       .-----+------.
      /      |       \
     /       |        \
 ID["x"]  ID["int"]  EXPR["i"]
This is how you create a proper structure for your AST. Below follows a small grammar that is a very simple version of your own grammar, including a small main method to test it all:
// TestParser.jjt
PARSER_BEGIN(TestParser)

public class TestParser {
    public static void main(String[] args) throws ParseException {
        TestParser parser = new TestParser(new java.io.StringReader(args[0]));
        SimpleNode root = parser.program();
        root.dump("");
    }
}

PARSER_END(TestParser)

TOKEN :
{
    < OPAR : "(" >
  | < CPAR : ")" >
  | < OBR : "{" >
  | < CBR : "}" >
  | < COL : ":" >
  | < SCOL : ";" >
  | < COMMA : "," >
  | < VAR : "var" >
  | < EQ : "=" >
  | < CONST : "const" >
  | < ID : ("_" | <LETTER>) ("_" | <ALPHANUM>)* >
}

TOKEN :
{
    < #DIGIT : ["0"-"9"] >
  | < #LETTER : ["a"-"z","A"-"Z"] >
  | < #ALPHANUM : <LETTER> | <DIGIT> >
}

SKIP : { " " | "\t" | "\r" | "\n" }

SimpleNode program() #PROGRAM :
{}
{
    (decl())* (function())* <EOF> {return jjtThis;}
}

void decl() :
{}
{
    var_decl()
  | const_decl()
}

void var_decl() #VAR :
{}
{
    <VAR> id() <COL> id() <EQ> expr() <SCOL>
}

void const_decl() #CONST :
{}
{
    <CONST> id() <COL> id() <EQ> expr() <SCOL>
}

void function() #FUNCTION :
{}
{
    type() id() <OPAR> params() <CPAR> <OBR> /* ... */ <CBR>
}

void type() #TYPE :
{Token t;}
{
    t=<ID> {jjtThis.value = t.image;}
}

void id() #ID :
{Token t;}
{
    t=<ID> {jjtThis.value = t.image;}
}

void params() #PARAMS :
{}
{
    (param() (<COMMA> param())*)?
}

void param() #PARAM :
{Token t;}
{
    t=<ID> {jjtThis.value = t.image;}
}

void expr() #EXPR :
{Token t;}
{
    t=<ID> {jjtThis.value = t.image;}
}
3
Let the jjtree class (included in javacc.jar) create a jj file for you:
java -cp javacc.jar jjtree TestParser.jjt
4
The previous step has created the file TestParser.jj (if everything went okay). Let javacc (also present in javacc.jar) process it:
java -cp javacc.jar javacc TestParser.jj
5
To compile all source files, do:
javac -cp .:javacc.jar *.java
(on Windows, do: javac -cp .;javacc.jar *.java)
6
The moment of truth has arrived: let's see if everything actually works! To let the parser process the input:
var n : int = I;
const x : bool = B;
double f(a,b,c)
{
}
execute the following:
java -cp . TestParser "var n : int = I; const x : bool = B; double f(a,b,c) { }"
and you should see the following being printed on your console:
PROGRAM
 decl
  VAR
   ID
   ID
   EXPR
 decl
  CONST
   ID
   ID
   EXPR
 FUNCTION
  TYPE
  ID
  PARAMS
   PARAM
   PARAM
   PARAM
Note that you don't see the text the IDs matched, but believe me, it's there. The method dump() simply does not show it.
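If you want to see the stored text, one option is to write your own dump that prints the value next to the node name. The sketch below is self-contained and uses a made-up Node class as a stand-in for the generated SimpleNode (the names ValueDumpSketch, Node and dumpWithValues are mine, not part of JavaCC), so treat it as an illustration of the idea only.

```java
import java.util.ArrayList;
import java.util.List;

public class ValueDumpSketch {

    /** Hypothetical stand-in for the generated SimpleNode: name, optional value, children. */
    static final class Node {
        final String name;
        final Object value;
        final List<Node> children = new ArrayList<>();

        Node(String name, Object value) {
            this.name = name;
            this.value = value;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    /** Same shape as dump(prefix), but appends the value in brackets when one was set. */
    static String dumpWithValues(Node node, String prefix) {
        StringBuilder out = new StringBuilder(prefix).append(node.name);
        if (node.value != null) {
            out.append("[\"").append(node.value).append("\"]");
        }
        out.append('\n');
        for (Node child : node.children) {
            out.append(dumpWithValues(child, prefix + " "));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hand-built tree for the input "var x : int = i;"
        Node var = new Node("VAR", null)
            .add(new Node("ID", "x"))
            .add(new Node("ID", "int"))
            .add(new Node("EXPR", "i"));
        System.out.print(dumpWithValues(var, ""));
    }
}
```

With the real generated classes the same walk should be possible through jjtGetNumChildren() and jjtGetChild(i) from the Node interface. As far as I know, recent JavaCC versions also generate a jjtGetValue() on SimpleNode, but check your generated SimpleNode.java before relying on that.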
HTH
EDIT
For a working grammar including expressions, you could have a look at the following expression evaluator of mine: https://github.com/bkiers/Curta (the grammar is in src/grammar). You might want to have a look at how to create root-nodes in case of binary expressions.
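As a rough idea of what creating a root node for a binary expression means: JJTree lets you write a node descriptor with a child count, such as #Add(2), which as far as I understand pops that many nodes off the node stack and makes them children of a new node. The Java sketch below only simulates that behaviour; DefiniteNodeSketch, leaf(), definite() and the grammar fragment in the comment are illustrative assumptions, not code taken from Curta.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class DefiniteNodeSketch {

    /** Hypothetical stand-in for a tree node: just a name and its children. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) { this.name = name; }

        @Override
        public String toString() {
            return children.isEmpty() ? name : name + children;
        }
    }

    // Stand-in for the parser's node stack (an assumption, not the generated class).
    private final Deque<Node> stack = new ArrayDeque<>();

    /** Pushes a leaf, as an operand rule would. */
    void leaf(String name) {
        stack.push(new Node(name));
    }

    /** Mimics a definite node such as #Add(2): pop count nodes, adopt them, push the parent. */
    void definite(String name, int count) {
        Node parent = new Node(name);
        for (int i = 0; i < count; i++) {
            parent.children.add(0, stack.pop()); // keep left-to-right order
        }
        stack.push(parent);
    }

    /**
     * Simulates a hypothetical rule shaped like:
     *   void add() #void : {} { atom() ( "+" atom() #Add(2) )* }
     */
    static String parseSum(String... operands) {
        DefiniteNodeSketch s = new DefiniteNodeSketch();
        s.leaf(operands[0]);
        for (int i = 1; i < operands.length; i++) {
            s.leaf(operands[i]);
            s.definite("Add", 2);
        }
        return s.stack.peek().toString();
    }

    public static void main(String[] args) {
        System.out.println(parseSum("a", "b", "c")); // Add[Add[a, b], c]
    }
}
```

The result is left-associative: each new Add node takes the tree built so far as its left child and the latest operand as its right child.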
Answered by Anand Rajasekar
Here is an example that uses JJTree: http://anandsekar.github.io/writing-an-interpretter-using-javacc/