Java: regular expression to extract SQL queries
Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/16672539/
Regular expression to extract SQL query
Asked by Velth
Is there a regex which extracts SQL queries from a string? I'm NOT interested in validating any SQL syntax, only in extracting a selection of SQL commands. The goal is to parse a given SQL file/string in a flexible manner.
Given the following SQL file/string example:
SELECT
*
FROM
test_table
WHERE
test_row = 'Testing ; semicolon';
SELECT * FROM another_test_table;
INSERT INTO
table_name
VALUES
(value1,'value which contains semicolon ;;;;',value3,...);
A pseudocode example would be: ^(UPDATE|SELECT|INSERT INTO)(.*)(;)$. In the future I'm looking to extend this with all (possible) commands.
- Look for a starting match with one of: (UPDATE|SELECT|INSERT|INTO)
- Zero or more of any character (including whitespace and newlines)
- Stop at ;, which delimits the SQL query.
If this is possible via a regex, the following Java code would be able to extract all SQL commands:
final String regex = "LOOKING_FOR_THIS_ONE";
final Pattern p = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = p.matcher(content);
while (matcher.find()) {
// matcher.group() now contains the full SQL command
}
Thanks in advance!
Answered by melwil
I'll start off by saying that this is not a good way of doing it, and I strongly urge you to find another method, preferably tagging the statements properly where they are made, so you don't end up in this situation.
That being said, SQL requires a statement to start with one of the following: DELETE, SELECT, WITH, UPDATE or INSERT INTO. It also requires that the input ends with ;.
We can use this to grab all sequences matching SQL with the following:
final String regex = "^(INSERT INTO|UPDATE|SELECT|WITH|DELETE)(?:[^;']|(?:'[^']+'))+;\\s*$";
final Pattern p = Pattern.compile(regex, Pattern.MULTILINE | Pattern.DOTALL);
Group 1 now holds the operating word, in case you wish to filter valid SQL on UPDATE or SELECT.
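To make the behaviour of this pattern concrete, here is a minimal sketch that runs it over the sample from the question. The class name SqlRegexExtractor and the method extract are hypothetical names chosen for illustration, and the trailing whitespace that \s* may capture on the last statement is trimmed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; only the regex itself comes from the answer.
class SqlRegexExtractor {
    // Backslashes are doubled because this is a Java string literal.
    static final Pattern SQL = Pattern.compile(
        "^(INSERT INTO|UPDATE|SELECT|WITH|DELETE)(?:[^;']|(?:'[^']+'))+;\\s*$",
        Pattern.MULTILINE | Pattern.DOTALL);

    // Returns every match of the pattern, trimmed of surrounding whitespace.
    static List<String> extract(String content) {
        List<String> statements = new ArrayList<>();
        Matcher m = SQL.matcher(content);
        while (m.find()) {
            statements.add(m.group().trim());
        }
        return statements;
    }

    public static void main(String[] args) {
        String content =
            "SELECT\n*\nFROM\ntest_table\nWHERE\ntest_row = 'Testing ; semicolon';\n"
            + "SELECT * FROM another_test_table;\n"
            + "INSERT INTO\ntable_name\nVALUES\n"
            + "(value1,'value which contains semicolon ;;;;',value3,...);\n";
        for (String statement : extract(content)) {
            System.out.println("--- statement ---");
            System.out.println(statement);
        }
    }
}
```

On this sample the sketch finds the three statements. One limitation worth knowing: the quoted part '[^']+' needs at least one character between the quotes, so a statement containing an empty literal '' would not be matched by this pattern.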
See the regex in action, as well as a caveat, here:
Answered by Bart
You can match it "properly" as long as the semicolon is the last non-whitespace character on that line.
final String regex = "^(SELECT|UPDATE|INSERT)[\\s\\S]+?\\;\\s*?$";
final Pattern p = Pattern.compile(regex, Pattern.MULTILINE);
final Matcher matcher = p.matcher(content);
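The condition about the semicolon being the last non-whitespace character on its line can be seen with a small sketch. The class name LineEndSemicolonDemo and the second input string are assumptions made for illustration; the pattern is the one from this answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class built around the pattern from the answer.
class LineEndSemicolonDemo {
    static final Pattern SQL = Pattern.compile(
        "^(SELECT|UPDATE|INSERT)[\\s\\S]+?\\;\\s*?$", Pattern.MULTILINE);

    // Collects every match of the pattern in the given content.
    static List<String> extract(String content) {
        List<String> statements = new ArrayList<>();
        Matcher m = SQL.matcher(content);
        while (m.find()) {
            statements.add(m.group());
        }
        return statements;
    }

    public static void main(String[] args) {
        // Semicolons inside literals are skipped here because they are
        // not the last non-whitespace character on their line.
        String ok = "SELECT *\nFROM t\nWHERE c = 'a ; b';\nSELECT * FROM u;\n";
        System.out.println(extract(ok).size()); // 2

        // A literal that ends a line with a semicolon cuts the statement short.
        String tricky = "SELECT 'a;\nb' FROM t;\n";
        System.out.println(extract(tricky)); // [SELECT 'a;]
    }
}
```

The second input shows where the approach stops being reliable: the pattern has no notion of being inside a string literal, it only looks at line endings.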
Answered by Jerome
SQL is complicated enough that you will need context to find all statements, meaning that you can't do this with a regular expression.
For example:
SELECT Model FROM Product
WHERE ManufacturerID IN (SELECT ManufacturerID FROM Manufacturer
WHERE Manufacturer = 'Dell')
(The example comes from http://www.sql-tutorial.com/sql-nested-queries-sql-tutorial/). Nested queries can be nested multiple times, start with different values, etc. If you could write a regular expression for the subset you are interested in, it would be unreadable.
ANTLR has a SQL 2003 grammar available (I haven't tried it).
Answered by Walls
(?m)^(UPDATE|SELECT|INSERT INTO).*;$
should work. This would extend the pattern to match over newlines. It should be able to loop through and find all your SQL.
Looking at the example you provided, it will match your commands until the ;. You can see the example used for testing here.
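It is worth checking how the inline flags behave in java.util.regex specifically: there, (?m) only changes where ^ and $ match, while letting . match line terminators needs (?s). The sketch below counts matches for three variants on the sample from the question. The class name RegexFlagCheck and the two extra variants are assumptions added for comparison, not part of the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo comparing inline flag combinations in java.util.regex.
class RegexFlagCheck {
    static final String SAMPLE =
        "SELECT\n*\nFROM\ntest_table\nWHERE\ntest_row = 'Testing ; semicolon';\n"
        + "SELECT * FROM another_test_table;\n"
        + "INSERT INTO\ntable_name\nVALUES\n"
        + "(value1,'value which contains semicolon ;;;;',value3,...);\n";

    // Counts how many times the pattern is found in the content.
    static int countMatches(String regex, String content) {
        Matcher m = Pattern.compile(regex).matcher(content);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // (?m) alone: '.' stops at line breaks, so only the one-line statement matches.
        System.out.println(countMatches("(?m)^(UPDATE|SELECT|INSERT INTO).*;$", SAMPLE)); // 1
        // (?ms) with greedy .*: one match that swallows all three statements.
        System.out.println(countMatches("(?ms)^(UPDATE|SELECT|INSERT INTO).*;$", SAMPLE)); // 1
        // (?ms) with lazy .*?: stops at the first ';' that ends a line.
        System.out.println(countMatches("(?ms)^(UPDATE|SELECT|INSERT INTO).*?;$", SAMPLE)); // 3
    }
}
```

Under these assumptions the lazy (?ms) variant is the one that separates the three statements, and it still depends on each terminating semicolon being at the end of a line.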
Answered by Mzn
If you're dealing with a language, create a lexer that tokenizes your string. Use JFlex, which is a lexical analyzer generator. It generates a Java class that splits a string into tokens based on a grammar specified in a special file. Take the relevant grammar rules from this file.
Parsing is a separate process from tokenization (or lexical analysis). You might want to use a parser generator after lexical analysis, if lexical analysis is not enough.
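As a rough idea of what the lexical approach buys you, here is a hand-written scanner that tracks whether it is inside a single-quoted literal and splits on semicolons only outside of one. This is not JFlex output and not a SQL grammar; the class name SqlStatementSplitter is hypothetical, and comments, double-quoted identifiers and dialect-specific quoting are deliberately ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-rolled scanner; a generated lexer would replace this.
class SqlStatementSplitter {
    // Splits content into statements at semicolons outside single-quoted literals.
    static List<String> split(String content) {
        List<String> statements = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inString = false;
        for (int i = 0; i < content.length(); i++) {
            char c = content.charAt(i);
            current.append(c);
            if (c == '\'') {
                // A doubled quote ('') toggles twice, so the state
                // after both characters is unchanged.
                inString = !inString;
            } else if (c == ';' && !inString) {
                statements.add(current.toString().trim());
                current.setLength(0);
            }
        }
        // Keep any trailing text that has no terminating semicolon.
        String rest = current.toString().trim();
        if (!rest.isEmpty()) {
            statements.add(rest);
        }
        return statements;
    }

    public static void main(String[] args) {
        String content = "SELECT 'a;\nb' FROM t;\nSELECT 'it''s; ok'; UPDATE t SET x = 1";
        for (String statement : split(content)) {
            System.out.println("--- statement ---");
            System.out.println(statement);
        }
    }
}
```

Because the scanner carries state across characters, a semicolon at the end of a line inside a literal no longer ends the statement, which is the case that line-based regexes cannot tell apart.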