java 解析类似 SQL 的语法、设计模式
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/10379956/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Parsing SQL like syntax, design pattern
提问by Hyman_carver
I am trying mock sql syntax to build a simple sql like interface to a key-value storage. The values are essentially POJOs
我正在尝试模拟 sql 语法来构建一个简单的类似于 sql 的键值存储接口。这些值本质上是 POJO
An example would be
一个例子是
select A.B.C from OBJ_POOL where A.B.X = 45 AND A.B.Y > '88' AND A.B.Z != 'abc';
OBJ_POOL is just a list of POJOs of the same class. In this example A would be the base class.
OBJ_POOL 只是同一类 POJO 的列表。在这个例子中,A 是基类。
Class A
Class B
String C
Integer X
String Y
String Z
Now A.B.C is equivalent A.getB().getC()
现在 ABC 等价于 A.getB().getC()
I am using Antlr to parse the above statement to get an AST, and then hoping
to use Apache BeanUtils to reflectively get/set the field names.
我使用 Antlr 来解析上述语句以获取 AST,然后hoping
使用 Apache BeanUtils 反射地获取/设置字段名称。
I wrote the grammar thats builds an AST
Now I am facing two problems
我写了构建 AST 的语法
现在我面临两个问题
- How should the visitor be implemented for the where clause ? A.B.X = 45 implies all objects having field X as 45, how should the filtering happen is there any nice way to do this ?
- Is there any way to traverse the generated AST without cluttering the visitor code with custom logic (storage access,property getters/setters etc..)
- 对于 where 子句,应该如何实现访问者?ABX = 45 意味着所有对象的字段 X 为 45,过滤应该如何发生,有什么好的方法可以做到这一点?
- 有没有什么方法可以遍历生成的 AST,而不会用自定义逻辑(存储访问、属性 getter/setter 等)弄乱访问者代码。
The second problem is more worrying since there might be many things that the statement might do.
第二个问题更令人担忧,因为该语句可能会做很多事情。
In a nutshell any suggestions/links/design-patterns to nicely parse a small subset of the sql select statment would be greatly appreciated
简而言之,任何可以很好地解析 sql select 语句的一小部分的建议/链接/设计模式将不胜感激
Thanks
谢谢
回答by Bart Kiers
You can do this much like how I demonstrated in my blog posts(and since I know you read those, I won't go in much detail). The only difference in this case is that each of your rows of data has its own scope. An easy way to pass this scope along is by providing it as a parameter to the eval(...)
method.
你可以像我在我的博客文章中展示的那样做这件事(因为我知道你读过那些,我不会详细介绍)。在这种情况下唯一的区别是您的每一行数据都有自己的范围。传递此范围的一种简单方法是将其作为参数提供给eval(...)
方法。
Below is a quick demo of how this could be implemented. Note that I quickly hacked this together based on my blog posts: not all functionality is available (see the many TODO
's, and there are likely (small) bugs in it as well. Use at your own risk!).
下面是如何实现这一点的快速演示。请注意,我根据我的博客文章迅速将其合并:并非所有功能都可用(请参阅许多功能TODO
,并且其中也可能存在(小)错误。使用风险自负!)。
Besides ANTLR v3.3, you need the following 3 files for this demo:
除了 ANTLR v3.3,本演示还需要以下 3 个文件:
Select.g
选择.g
grammar Select;
options {
output=AST;
}
tokens {
// imaginary tokens
ROOT;
ATTR_LIST;
UNARY_MINUS;
// literal tokens
Eq = '=';
NEq = '!=';
LT = '<';
LTEq = '<=';
GT = '>';
GTEq = '>=';
Minus = '-';
Not = '!';
Select = 'select';
From = 'from';
Where = 'where';
And = 'AND';
Or = 'OR';
}
parse
: select_stat EOF -> ^(ROOT select_stat)
;
select_stat
: Select attr_list From Id where_stat ';' -> ^(Select attr_list Id where_stat)
;
attr_list
: Id (',' Id)* -> ^(ATTR_LIST Id+)
;
where_stat
: Where expr -> expr
| -> ^(Eq Int["1"] Int["1"])
// no 'where', insert '1=1' which is always true
;
expr
: or_expr
;
or_expr
: and_expr (Or^ and_expr)*
;
and_expr
: eq_expr (And^ eq_expr)*
;
eq_expr
: rel_expr ((Eq | NEq)^ rel_expr)*
;
rel_expr
: unary_expr ((LT | LTEq | GT | GTEq)^ unary_expr)?
;
unary_expr
: Minus atom -> ^(UNARY_MINUS atom)
| Not atom -> ^(Not atom)
| atom
;
atom
: Str
| Int
| Id
| '(' expr ')' -> expr
;
Id : ('a'..'z' | 'A'..'Z' | '_') ('a'..'z' | 'A'..'Z' | '_' | Digit)*;
Str : '\'' ('\'\'' | ~('\'' | '\r' | '\n'))* '\''
{
// strip the surrounding quotes and replace '' with '
setText($text.substring(1, $text.length() - 1).replace("''", "'"));
}
;
Int : Digit+;
Space : (' ' | '\t' | '\r' | '\n') {skip();};
fragment Digit : '0'..'9';
SelectWalker.g
选择Walker.g
tree grammar SelectWalker;
options {
tokenVocab=Select;
ASTLabelType=CommonTree;
}
@header {
import java.util.List;
import java.util.Map;
import java.util.Set;
}
@members {
private Map<String, List<B>> dataPool;
public SelectWalker(CommonTreeNodeStream nodes, Map<String, List<B>> data) {
super(nodes);
dataPool = data;
}
}
query returns [List<List<Object>> result]
: ^(ROOT select_stat) {$result = (List<List<Object>>)$select_stat.node.eval(null);}
;
select_stat returns [Node node]
: ^(Select attr_list Id expr)
{$node = new SelectNode($attr_list.attributes, dataPool.get($Id.text), $expr.node);}
;
attr_list returns [List<String> attributes]
@init{$attributes = new ArrayList<String>();}
: ^(ATTR_LIST (Id {$attributes.add($Id.text);})+)
;
expr returns [Node node]
: ^(Or a=expr b=expr) {$node = null; /* TODO */}
| ^(And a=expr b=expr) {$node = new AndNode($a.node, $b.node);}
| ^(Eq a=expr b=expr) {$node = new EqNode($a.node, $b.node);}
| ^(NEq a=expr b=expr) {$node = new NEqNode($a.node, $b.node);}
| ^(LT a=expr b=expr) {$node = null; /* TODO */}
| ^(LTEq a=expr b=expr) {$node = null; /* TODO */}
| ^(GT a=expr b=expr) {$node = new GTNode($a.node, $b.node);}
| ^(GTEq a=expr b=expr) {$node = null; /* TODO */}
| ^(UNARY_MINUS a=expr) {$node = null; /* TODO */}
| ^(Not a=expr) {$node = null; /* TODO */}
| Str {$node = new AtomNode($Str.text);}
| Int {$node = new AtomNode(Integer.valueOf($Int.text));}
| Id {$node = new IdNode($Id.text);}
;
Main.java
主程序
(yes, stick all these Java classes in the same file: Main.java
)
(是的,坚持所有这些Java类在同一个文件:Main.java
)
import org.antlr.runtime.*;
import org.antlr.runtime.tree.*;
import org.antlr.stringtemplate.*;
import java.util.*;
public class Main {
static Map<String, List<B>> getData() {
Map<String, List<B>> map = new HashMap<String, List<B>>();
List<B> data = new ArrayList<B>();
data.add(new B("id_1", 345, "89", "abd"));
data.add(new B("id_2", 45, "89", "abd"));
data.add(new B("id_3", 1, "89", "abd"));
data.add(new B("id_4", 45, "8", "abd"));
data.add(new B("id_5", 45, "89", "abc"));
data.add(new B("id_6", 45, "99", "abC"));
map.put("poolX", data);
return map;
}
public static void main(String[] args) throws Exception {
String src = "select C, Y from poolX where X = 45 AND Y > '88' AND Z != 'abc';";
SelectLexer lexer = new SelectLexer(new ANTLRStringStream(src));
SelectParser parser = new SelectParser(new CommonTokenStream(lexer));
CommonTree tree = (CommonTree)parser.parse().getTree();
SelectWalker walker = new SelectWalker(new CommonTreeNodeStream(tree), getData());
List<List<Object>> result = walker.query();
for(List<Object> row : result) {
System.out.println(row);
}
}
}
class B {
String C;
Integer X;
String Y;
String Z;
B(String c, Integer x, String y, String z) {
C = c;
X = x;
Y = y;
Z = z;
}
Object getAttribute(String attribute) {
if(attribute.equals("C")) return C;
if(attribute.equals("X")) return X;
if(attribute.equals("Y")) return Y;
if(attribute.equals("Z")) return Z;
throw new RuntimeException("Unknown attribute: B." + attribute);
// or use your Apache Bean-util API, or even reflection here instead of the above...
}
}
interface Node {
Object eval(B b);
}
class AtomNode implements Node {
final Object value;
AtomNode(Object v) {
value = v;
}
public Object eval(B b) {
return value;
}
}
abstract class BinNode implements Node {
final Node left;
final Node right;
BinNode(Node l, Node r) {
left = l;
right = r;
}
public abstract Object eval(B b);
}
class AndNode extends BinNode {
AndNode(Node l, Node r) {
super(l, r);
}
@Override
public Object eval(B b) {
return (Boolean)super.left.eval(b) && (Boolean)super.right.eval(b);
}
}
class EqNode extends BinNode {
EqNode(Node l, Node r) {
super(l, r);
}
@Override
public Object eval(B b) {
return super.left.eval(b).equals(super.right.eval(b));
}
}
class NEqNode extends BinNode {
NEqNode(Node l, Node r) {
super(l, r);
}
@Override
public Object eval(B b) {
return !super.left.eval(b).equals(super.right.eval(b));
}
}
class GTNode extends BinNode {
GTNode(Node l, Node r) {
super(l, r);
}
@Override
public Object eval(B b) {
return ((Comparable)super.left.eval(b)).compareTo((Comparable)super.right.eval(b)) > 0;
}
}
class IdNode implements Node {
final String id;
IdNode(String i) {
id = i;
}
@Override
public Object eval(B b) {
return b.getAttribute(id);
}
}
class SelectNode implements Node {
final List<String> attributes;
final List<B> data;
final Node expression;
SelectNode(List<String> a, List<B> d, Node e) {
attributes = a;
data = d;
expression = e;
}
@Override
public Object eval(B ignored) {
List<List<Object>> result = new ArrayList<List<Object>>();
for(B b : data) {
if((Boolean)expression.eval(b)) {
// 'b' passed, check which attributes to include
List<Object> row = new ArrayList<Object>();
for(String attr : attributes) {
row.add(b.getAttribute(attr));
}
result.add(row);
}
}
return result;
}
}
If you now generate the lexer, parser and tree walker and run the Main class:
如果您现在生成词法分析器、解析器和树遍历器并运行 Main 类:
java -cp antlr-3.3.jar org.antlr.Tool Select.g
java -cp antlr-3.3.jar org.antlr.Tool SelectWalker.g
javac -cp antlr-3.3.jar *.java
java -cp .:antlr-3.3.jar Main
you will see that the output for the query:
您将看到查询的输出:
select C, Y from poolX where X = 45 AND Y > '88' AND Z != 'abc';
with input:
有输入:
C X Y Z
"id_1" 345 "89" "abd"
"id_2" 45 "89" "abd"
"id_3" 1 "89" "abd"
"id_4 45 "8" "abd"
"id_5" 45 "89" "abc"
"id_6" 45 "99" "abC"
is:
是:
[id_2, 89]
[id_6, 99]
And note that if the where
statement is omitted, the expression 1 = 1
is automatically inserted, causing the query:
并注意,如果where
省略该语句,则表达式1 = 1
会自动插入,导致查询:
select C, Y from poolX;
to print the following:
打印以下内容:
[id_1, 89]
[id_2, 89]
[id_3, 89]
[id_4, 8]
[id_5, 89]
[id_6, 99]