Java: make a regex ignore newlines and match across an entire large string?

Question

Asked by Zombies

I have this string here:

CREATE UNIQUE INDEX index555 ON
SOME_TABLE
(
    SOME_PK          ASC
);

I want to match across multiple lines and capture each SQL statement (there will be many in one large string). I tried something like the pattern below, but I only get a match on CREATE UNIQUE INDEX index555 ON:

(CREATE\s.+;)

Note: I am trying to do this in Java, if it matters.

Answer 1

You need to use DOTALL and MULTILINE flags when compiling a regular expression. Here is a Java code example:

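import java.util.regex.*;

public class Test
{
    public static void main(String[] args)
    {
        // Two sample statements in one string, separated by newlines.
        String s =
            "CREATE UNIQUE INDEX index555 ON\nSOME_TABLE\n(\n    SOME_PK          ASC\n);\nCREATE UNIQUE INDEX index666 ON\nOTHER_TABLE\n(\n    OTHER_PK          ASC\n);\n";

        // Match everything up to and including the next ';' (allowing for
        // single-quoted strings), plus any trailing whitespace.
        // The backslash is doubled because this is a Java string literal.
        Pattern p = Pattern.compile("([^;]*?('.*?')?)*?;\\s*", Pattern.CASE_INSENSITIVE | Pattern.DOTALL | Pattern.MULTILINE);

        Matcher m = p.matcher(s);

        while (m.find())
        {
            System.out.println("--- Statement ---");
            System.out.println(m.group());
        }
    }
}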

The output will be:

--- Statement ---
CREATE UNIQUE INDEX index555 ON
SOME_TABLE
(
    SOME_PK          ASC
);

--- Statement ---
CREATE UNIQUE INDEX index666 ON
OTHER_TABLE
(
    OTHER_PK          ASC
);

Answer 2

Answered by lowercase

Check this

The regular expression . matches any character except a line terminator unless the DOTALL flag is specified

So you need to do something like this:

Pattern p = Pattern.compile("your pattern", Pattern.DOTALL);
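
To see the effect of the flag in isolation, here is a minimal sketch. The class name DotallDemo and the helper firstMatch are made up for illustration, and the pattern is a simplified version of the one in the question (without the trailing semicolon), so treat it as a demonstration rather than a finished solution.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DotallDemo {
    // Hypothetical helper: returns the first match of regex in input,
    // or null when nothing matches.
    public static String firstMatch(String regex, int flags, String input) {
        Matcher m = Pattern.compile(regex, flags).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String sql = "CREATE UNIQUE INDEX index555 ON\nSOME_TABLE\n(\n    SOME_PK          ASC\n);";

        // No flags: '.' stops at the first line terminator,
        // so only the first line is matched.
        System.out.println(firstMatch("CREATE\\s.+", 0, sql));

        System.out.println("-----");

        // DOTALL: '.' also matches '\n', so the greedy .+ runs to the end of the input.
        System.out.println(firstMatch("CREATE\\s.+", Pattern.DOTALL, sql));
    }
}
```

Note that the greedy .+ swallows everything up to the end of the input once DOTALL is on, which matters as soon as the string holds more than one statement.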

Answer 3

Answered by Alan Moore

The DOTALL flag lets the . match newlines, but if you simply apply it to your existing regex, you'll end up matching everything from the first CREATE to the last ; in one go. If you want to match the statements individually, you'll need to do more. One option is to use a non-greedy quantifier:

Pattern p = Pattern.compile("^CREATE\\b.+?;",
    Pattern.DOTALL | Pattern.MULTILINE | Pattern.CASE_INSENSITIVE);
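
The difference between the greedy and the non-greedy quantifier can be checked with a short sketch. The class GreedyVsLazy, the helper findAll, and the sample input are invented for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyVsLazy {
    // Hypothetical helper: collects every match of regex in input, in order.
    public static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String sql = "CREATE INDEX a ON t1 (\n  c1\n);\nCREATE INDEX b ON t2 (\n  c2\n);\n";

        // Greedy .+ runs to the last ';', so both statements come back as one match.
        System.out.println("greedy: " + findAll("(?smi)^CREATE\\b.+;", sql).size());

        // Non-greedy .+? stops at the first ';' it can, giving one match per statement.
        System.out.println("lazy:   " + findAll("(?smi)^CREATE\\b.+?;", sql).size());
    }
}
```

With this input the greedy pattern yields a single match and the non-greedy one yields two.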

I also used the MULTILINE flag to let the ^ anchor match after newlines, and CASE_INSENSITIVE because SQL is case-insensitive (at least, every flavor I've heard of). Note that all three flags have "inline" forms that you can use in the regex itself:

Pattern p = Pattern.compile("(?smi)^CREATE\\b.+?;");

(The inline form of DOTALL is s for historical reasons; it was called "single-line" mode in Perl, where it originated.) Another option is to use a negated character class:

Pattern p = Pattern.compile("(?mi)^CREATE\\b[^;]+;");

[^;]+ matches one or more of any character except ; (and that includes newlines), so the s flag isn't needed.
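
As a rough sketch of the negated-character-class approach, the example below pulls statements out of a string without DOTALL. The class NegatedClassDemo and the method statements are hypothetical names. The second input illustrates a limitation that applies to the non-greedy version as well: a semicolon inside a quoted string ends the match early.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NegatedClassDemo {
    // No 's' flag: [^;] already matches line terminators.
    private static final Pattern STATEMENT = Pattern.compile("(?mi)^CREATE\\b[^;]+;");

    // Hypothetical helper: returns each CREATE statement found in input.
    public static List<String> statements(String input) {
        List<String> out = new ArrayList<>();
        Matcher m = STATEMENT.matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String sql = "create index a on t1 (\n  c1\n);\nCREATE INDEX b ON t2 (c2);\n";
        for (String stmt : statements(sql)) {
            System.out.println("--- Statement ---");
            System.out.println(stmt);
        }

        // Limitation: the ';' inside the string literal cuts the match short.
        System.out.println(statements("CREATE TABLE t (c VARCHAR(10) DEFAULT 'a;b');"));
    }
}
```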

So far, I've assumed that every statement starts at the beginning of a line and ends with a semicolon, as in your example. I don't think either of those things is required by the SQL standard, but I expect you'll know if you can count on them in this instance. You might want to start matching at a word boundary instead of a line boundary:

Pattern p = Pattern.compile("(?i)\\bCREATE\\b[^;]+;");
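
A small sketch of when the word-boundary version matters: statements that are indented or share a line. The class WordBoundaryDemo, the helper findAll, and the sample input are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordBoundaryDemo {
    // Hypothetical helper: collects every match of regex in input, in order.
    public static List<String> findAll(String regex, String input) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // One indented statement and a second one on the same line.
        String sql = "  create index a on t1 (created_at); CREATE INDEX b ON t2 (c2);";

        // Line-anchored: neither statement starts at the beginning of a line.
        System.out.println(findAll("(?mi)^CREATE\\b[^;]+;", sql).size());

        // Word-boundary: both statements are found.
        System.out.println(findAll("(?i)\\bCREATE\\b[^;]+;", sql).size());
    }
}
```

For this input the line-anchored pattern finds nothing, while the word-boundary pattern finds both statements.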

Finally, if you're thinking about doing anything more complicated with regexes and SQL, don't. Parsing SQL with regexes is a fool's game--it's an even worse fit than HTML and regexes.

Answer 4

Answered by Don Kirkby

Check out the various flags that can be passed to Pattern.compile. I think DOTALL is the one you need.

Answer 5

Answered by Kibbee

You'll want to use the Pattern.DOTALL flag to match across lines.

