Regular Expressions in Java - Java Regex Examples

Date: 2020-02-23 14:41:43  Source: igfitidea

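Welcome to regular expressions in Java, also known as regex.
When I started programming, Java regular expressions were a nightmare for me.
This tutorial aims to help you master regular expressions in Java.
I will also come back here to refresh my own Java regex knowledge.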

Java Regular Expression

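A regular expression in Java defines a pattern for a String.
Regular expressions can be used to search, edit, or manipulate text.
Regular expressions are not language specific, but they differ slightly from one language to another.
Regular expressions in Java are most similar to those in Perl.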

The Java regex classes live in the java.util.regex package, which contains three classes:

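Pattern: A Pattern object is the compiled version of a regular expression.
The Pattern class has no public constructor; we create a Pattern object by passing the regular expression to its public static method compile.
Matcher: Matcher is the Java regex engine object that matches the input String against the Pattern object created.
The Matcher class has no public constructor; we obtain a Matcher object from the Pattern object's matcher method, which takes the input String as its argument.
We then use the matches method, which returns a boolean depending on whether the input String matches the regex pattern.
PatternSyntaxException: A PatternSyntaxException is thrown if the regular expression syntax is incorrect.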

Let's look at a Java regex example program.

package com.theitroad.util;

import java.util.regex.*;

public class PatternExample {

	public static void main(String[] args) {
		Pattern pattern = Pattern.compile(".xx.");
		Matcher matcher = pattern.matcher("MxxY");
		System.out.println("Input String matches regex - "+matcher.matches());
		//bad regular expression
		pattern = Pattern.compile("*xx*");

	}

}

When we run this Java regex example program, we get the following output.

Input String matches regex - true
Exception in thread "main" java.util.regex.PatternSyntaxException: Dangling meta character '*' near index 0
*xx*
^
	at java.util.regex.Pattern.error(Pattern.java:1924)
	at java.util.regex.Pattern.sequence(Pattern.java:2090)
	at java.util.regex.Pattern.expr(Pattern.java:1964)
	at java.util.regex.Pattern.compile(Pattern.java:1665)
	at java.util.regex.Pattern.<init>(Pattern.java:1337)
	at java.util.regex.Pattern.compile(Pattern.java:1022)
	at com.theitroad.util.PatternExample.main(PatternExample.java:13)
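The program above stops at the invalid pattern. When a regex comes from user input or configuration, it may be preferable to catch the exception instead. The following is a minimal sketch of that idea; the class name SafeCompileExample and the helper describeError are hypothetical names chosen for illustration, while getDescription(), getIndex() and getPattern() are accessors on PatternSyntaxException.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class SafeCompileExample {

	// Hypothetical helper: returns null when the regex compiles,
	// otherwise a short summary of what went wrong.
	static String describeError(String regex) {
		try {
			Pattern.compile(regex);
			return null;
		} catch (PatternSyntaxException e) {
			// The exception carries the error description, the error index
			// and the offending pattern separately.
			return e.getDescription() + " at index " + e.getIndex() + " in " + e.getPattern();
		}
	}

	public static void main(String[] args) {
		System.out.println("Valid regex: " + describeError(".xx."));
		System.out.println("Invalid regex: " + describeError("*xx*"));
	}
}
```

PatternSyntaxException is an unchecked exception, so the compiler does not force this try/catch; it is a design choice for code that cannot trust its regex source.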

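Since Java regular expressions revolve around String, the String class was extended in Java 1.4 with a matches method that performs regex pattern matching.
Internally it uses the Pattern and Matcher Java regex classes to do the processing, but it obviously reduces the lines of code.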

The Pattern class also contains a matches method that takes the regex and the input String as arguments and returns a boolean result after matching them.

So the code below works fine for matching an input String against a regular expression in Java.

String str = "bbb";
System.out.println("Using String matches method: "+str.matches(".bb"));
System.out.println("Using Pattern matches method: "+Pattern.matches(".bb", str));

So if you only need to check whether the input String matches a pattern, you can save time and lines of code by using the simple String matches method.

You should use the Pattern and Matcher classes only when you need to manipulate the input String or reuse the pattern.
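To illustrate the reuse case, here is a minimal sketch that compiles the ".bb" pattern once and applies it to several inputs. The class name ReusePatternExample, the constant name and the sample inputs are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class ReusePatternExample {

	// Compiled once; each String.matches call would compile the regex again.
	private static final Pattern ANY_CHAR_THEN_BB = Pattern.compile(".bb");

	// Counts the inputs that match the pattern in full.
	static int countMatches(List<String> inputs) {
		int count = 0;
		for (String input : inputs) {
			if (ANY_CHAR_THEN_BB.matcher(input).matches()) {
				count++;
			}
		}
		return count;
	}

	public static void main(String[] args) {
		List<String> inputs = Arrays.asList("bbb", "abb", "bb", "abbb");
		System.out.println("Matching inputs: " + countMatches(inputs));
	}
}
```

Only "bbb" and "abb" match here, because matches requires the whole input to be exactly three characters ending in "bb".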

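Note that the pattern defined by a regular expression is applied to a String from left to right, and once a source character has been used in a match, it cannot be reused.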

For example, the regex "121" matches "31212142121" only twice, as "_121____121".
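The non-overlapping behaviour can be checked with Matcher.find(), which resumes searching after the end of the previous match. This is a small sketch; the class name NonOverlappingExample and the helper countMatches are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NonOverlappingExample {

	// Counts how many times find() succeeds on the input.
	static int countMatches(String regex, String input) {
		Matcher matcher = Pattern.compile(regex).matcher(input);
		int count = 0;
		while (matcher.find()) {
			count++;
		}
		return count;
	}

	public static void main(String[] args) {
		Matcher matcher = Pattern.compile("121").matcher("31212142121");
		while (matcher.find()) {
			System.out.println("Match at index " + matcher.start());
		}
		System.out.println("Total: " + countMatches("121", "31212142121"));
	}
}
```

The matches start at indexes 1 and 8. The "121" that begins at index 3 is skipped because its first character was already consumed by the first match.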

Java Regex Metacharacters

Java regular expressions include some metacharacters that act as shorthand for common matching patterns.

Regular Expression	Description
\d	Any digit, short for [0-9]
\D	Any non-digit, short for [^0-9]
\s	Any whitespace character, short for [ \t\n\x0B\f\r]
\S	Any non-whitespace character, short for [^\s]
\w	Any word character, short for [a-zA-Z_0-9]
\W	Any non-word character, short for [^\w]
\b	A word boundary
\B	A non word boundary
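The metacharacters in the table can be tried out with a short sketch like the one below. Note that each backslash must be doubled inside a Java string literal, so the regex \d is written as "\\d". The class name MetacharExample and both helper methods are hypothetical names for illustration.

```java
import java.util.regex.Pattern;

public class MetacharExample {

	// True when the whole input matches the regex.
	static boolean matchesWhole(String regex, String input) {
		return Pattern.matches(regex, input);
	}

	// True when the regex matches anywhere inside the input.
	static boolean containsMatch(String regex, String input) {
		return Pattern.compile(regex).matcher(input).find();
	}

	public static void main(String[] args) {
		System.out.println(matchesWhole("\\d+", "2020"));               // digits only
		System.out.println(matchesWhole("\\w+", "a_1"));                // letters, underscore, digits
		System.out.println(matchesWhole("\\w\\s\\w", "a b"));           // word char, whitespace, word char
		System.out.println(containsMatch("\\bcat\\b", "a cat sat"));    // "cat" as a whole word
		System.out.println(containsMatch("\\bcat\\b", "concatenate"));  // "cat" inside a longer word
	}
}
```

The first four calls print true. The last prints false because the "cat" inside "concatenate" has word characters on both sides, so no word boundary is found there.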

There are two ways to use a metacharacter as an ordinary character in a regular expression.

Precede the metacharacter with a backslash (\).
Keep the metacharacter within \Q (which starts the quote) and \E (which ends it).
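Both escaping approaches can be sketched as follows, using the input "1+1", where + would otherwise act as a quantifier. The class name EscapeExample and the helper matches are hypothetical. The last call uses Pattern.quote, which is not mentioned in the text above; it is a standard method that returns a literal pattern String for its argument.

```java
import java.util.regex.Pattern;

public class EscapeExample {

	// True when the whole input matches the regex.
	static boolean matches(String regex, String input) {
		return Pattern.matches(regex, input);
	}

	public static void main(String[] args) {
		// Unescaped: + is a quantifier, so this means "one or more 1s, then a 1".
		System.out.println(matches("1+1", "1+1"));
		// Escaped with a backslash (doubled in the Java string literal).
		System.out.println(matches("1\\+1", "1+1"));
		// Quoted with \Q and \E.
		System.out.println(matches("\\Q1+1\\E", "1+1"));
		// Pattern.quote builds a literal pattern for arbitrary text.
		System.out.println(matches(Pattern.quote("1+1"), "1+1"));
	}
}
```

The first call prints false and the remaining three print true.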

Regular Expressions in Java - Quantifiers

Java regex quantifiers specify the number of occurrences of a character to match.

Regular Expression	Description
X?	X occurs once or not at all
X*	X occurs zero or more times
X+	X occurs one or more times
X{n}	X occurs exactly n times
X{n,}	X occurs n or more times
X{n,m}	X occurs at least n times but not more than m times

Java regex quantifiers can also be used with character classes and capturing groups.

For example, [abc]+ means a, b, or c, one or more times.

(abc)+ means the group "abc" one or more times.
We will discuss capturing groups now.
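Before moving on, the quantifiers listed above, including their use with a character class and a group, can be checked with a short sketch. The class name QuantifierExample, the helper matches and all sample inputs are assumptions chosen for illustration.

```java
public class QuantifierExample {

	// True when the whole input matches the regex.
	static boolean matches(String regex, String input) {
		return input.matches(regex);
	}

	public static void main(String[] args) {
		System.out.println(matches("colou?r", "color"));  // u occurs zero times: true
		System.out.println(matches("ab*", "a"));          // b occurs zero times: true
		System.out.println(matches("ab+", "a"));          // b must occur at least once: false
		System.out.println(matches("a{3}", "aaa"));       // exactly three: true
		System.out.println(matches("a{2,}", "a"));        // fewer than two: false
		System.out.println(matches("a{2,3}", "aaaa"));    // more than three: false
		System.out.println(matches("[abc]+", "cabba"));   // any mix of a, b, c: true
		System.out.println(matches("(abc)+", "abcabc"));  // the group twice: true
		System.out.println(matches("(abc)+", "abcab"));   // incomplete second group: false
	}
}
```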

Regular Expressions in Java - Capturing Groups

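Capturing groups in Java regular expressions are used to treat multiple characters as a single unit.
You can create a group using ().
The portion of the input String that matches a capturing group is saved in memory and can be recalled using a backreference.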

You can use the matcher.groupCount method to find the number of capturing groups in a Java regex pattern.
For example, ((a)(bc)) contains 3 capturing groups: ((a)(bc)), (a), and (bc).
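The group numbering can be inspected with a sketch like the one below. Groups are numbered by the position of their opening parenthesis, and group(0) always denotes the whole match, which groupCount() does not include. The class name GroupCountExample and the helper groups are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupCountExample {

	// Returns group(0) through group(groupCount()) for a full match,
	// or an empty list when the input does not match.
	static List<String> groups(String regex, String input) {
		Matcher matcher = Pattern.compile(regex).matcher(input);
		List<String> result = new ArrayList<>();
		if (matcher.matches()) {
			for (int i = 0; i <= matcher.groupCount(); i++) {
				result.add(matcher.group(i));
			}
		}
		return result;
	}

	public static void main(String[] args) {
		Matcher matcher = Pattern.compile("((a)(bc))").matcher("abc");
		System.out.println("Group count: " + matcher.groupCount());
		System.out.println("Groups: " + groups("((a)(bc))", "abc"));
	}
}
```

For the input "abc" the list holds "abc" (the whole match), then "abc", "a" and "bc" for groups 1 to 3.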

You can use a backreference in a regular expression by writing a backslash (\) followed by the number of the group to be recalled.

Capturing groups and backreferences can be confusing, so let's understand them with an example.

// \\1 and \\2 are backreferences to groups 1 and 2 (backslashes are doubled in Java string literals)
System.out.println(Pattern.matches("(\\w\\d)\\1", "a2a2")); //true
System.out.println(Pattern.matches("(\\w\\d)\\1", "a2b2")); //false
System.out.println(Pattern.matches("(AB)(B\\d)\\2\\1", "ABB2B2AB")); //true
System.out.println(Pattern.matches("(AB)(B\\d)\\2\\1", "ABB2B3AB")); //false

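In the first example, at runtime the first capturing group is (\w\d), which evaluates to "a2" when matched against the input String "a2a2" and is saved in memory.
So \1 refers to "a2", and the statement returns true.
For the same reason, the second statement prints false.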
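As a further illustration of backreferences, the sketch below looks for a word that is immediately repeated, such as "is is". The regex \b(\w+)\s+\1\b captures a word, skips whitespace, and then requires the same text again. The class name RepeatedWordExample, the helper firstRepeatedWord and the sample sentences are assumptions for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedWordExample {

	// A word, one or more whitespace characters, then the same word again.
	private static final Pattern REPEATED_WORD = Pattern.compile("\\b(\\w+)\\s+\\1\\b");

	// Returns the first immediately repeated word, or null when there is none.
	static String firstRepeatedWord(String text) {
		Matcher matcher = REPEATED_WORD.matcher(text);
		return matcher.find() ? matcher.group(1) : null;
	}

	public static void main(String[] args) {
		System.out.println(firstRepeatedWord("this is is a test"));
		System.out.println(firstRepeatedWord("no repeats here"));
	}
}
```

The first call finds "is"; the second returns null. The word boundaries keep the "is" at the end of "this" from counting as the first half of a repeat.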

Now we will look at some important methods of the Pattern and Matcher classes.

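We can create a Pattern object with flags.
For example, Pattern.CASE_INSENSITIVE enables case-insensitive matching.
The Pattern class also provides a split(String) method that is similar to the String class split() method.
The Pattern class toString() method returns the regular expression String from which the pattern was compiled.
The Matcher class has start() and end() index methods that show precisely where a match was found in the input String.
The Matcher class also provides the String manipulation methods replaceAll(String replacement) and replaceFirst(String replacement).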

Let's see these Java regex methods in a simple example program.

package com.theitroad.util;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexExamples {

	public static void main(String[] args) {
		//using pattern with flags
		Pattern pattern = Pattern.compile("ab", Pattern.CASE_INSENSITIVE);
		Matcher matcher = pattern.matcher("ABcabdAb");
		//using Matcher find(), group(), start() and end() methods
		while (matcher.find()) {
			System.out.println("Found the text \"" + matcher.group()
					+ "\" starting at " + matcher.start()
					+ " index and ending at index " + matcher.end());
		}

		//using Pattern split() method
		pattern = Pattern.compile("\\W");
		String[] words = pattern.split("one@two#three:four$five");
		for (String s : words) {
			System.out.println("Split using Pattern.split(): " + s);
		}

		//using Matcher.replaceFirst() and replaceAll() methods
		pattern = Pattern.compile("1*2");
		matcher = pattern.matcher("11234512678");
		System.out.println("Using replaceAll: " + matcher.replaceAll("_"));
		System.out.println("Using replaceFirst: " + matcher.replaceFirst("_"));
	}

}

The output of the above Java regex example program is:

Found the text "AB" starting at 0 index and ending at index 2
Found the text "ab" starting at 3 index and ending at index 5
Found the text "Ab" starting at 6 index and ending at index 8
Split using Pattern.split(): one
Split using Pattern.split(): two
Split using Pattern.split(): three
Split using Pattern.split(): four
Split using Pattern.split(): five
Using replaceAll: _345_678
Using replaceFirst: _34512678
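The program above does not exercise Pattern.toString(), and it does not compare Pattern.split() with String.split(). A minimal sketch of both follows; the class name PatternUtilityExample, the helper splitWithPattern and the sample input are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

public class PatternUtilityExample {

	// Splits the input around matches of the regex using Pattern.split().
	static String[] splitWithPattern(String regex, String input) {
		return Pattern.compile(regex).split(input);
	}

	public static void main(String[] args) {
		Pattern pattern = Pattern.compile("\\W");
		// toString() returns the regex source the pattern was compiled from.
		System.out.println("Pattern source: " + pattern.toString());

		String input = "one@two#three";
		// Both calls split around non-word characters and give the same parts.
		System.out.println(Arrays.toString(splitWithPattern("\\W", input)));
		System.out.println(Arrays.toString(input.split("\\W")));
	}
}
```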
