Java regex for UUID

Question

Asked by Aqura

I want to parse a String that contains a UUID in the format below:

"&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;"

I tried parsing it as shown below. It works, but I think it would be slow:

private static final String re1 = ".*?";
private static final String re2 = "([A-Z0-9]{8}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{4}-[A-Z0-9]{12})";
private static final Pattern splitter = Pattern.compile(re1 + re2, Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

I am looking for a faster way and tried the following, but it fails to match:

private static final Pattern URN_UUID_PATTERN = Pattern.compile("^< urn:uuid:([^&])+&gt");

I am new to regex. Any help is appreciated.

Aqura

Answer 1

Answered by dlamblin

Your example of a faster regex uses a literal < where the input has &lt; so that's confusing.

Regarding speed: first, your UUID is hexadecimal, so don't match with A-Z but rather a-f. Second, you give no indication that the case is mixed, so don't use case-insensitive matching; write the correct case in the range instead.

You don't explain whether you need the part preceding the UUID. If not, don't include .*?, and you may as well write the literals for re1 and re2 together in your final Pattern. There's no indication you need DOTALL either.

private static final Pattern splitter =
  Pattern.compile("([a-f0-9]{8}(-[a-f0-9]{4}){4}[a-f0-9]{8})");
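
A minimal sketch of how the pattern above might be applied, assuming the input uses lowercase hex digits as in the question. The class name UuidRegexDemo and the method extractUuid are made up for illustration and are not part of the original answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UuidRegexDemo {
    // 8 hex digits, four "-xxxx" groups, then 8 more hex digits.
    // The fourth "-xxxx" group plus the trailing 8 digits together
    // make up the final 12-digit block of the UUID.
    private static final Pattern UUID_PATTERN =
        Pattern.compile("([a-f0-9]{8}(-[a-f0-9]{4}){4}[a-f0-9]{8})");

    /** Returns the first UUID-shaped substring, or null if there is none. */
    static String extractUuid(String input) {
        Matcher m = UUID_PATTERN.matcher(input);
        // find() scans the input, so no leading ".*?" is needed.
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;";
        System.out.println(extractUuid(input));
    }
}
```

Using Matcher.find() rather than matches() is what makes the leading .*? unnecessary here.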

Alternatively, if you measure your regular expression and find it too slow, you might try another approach. For example: is each UUID preceded by "uuid:" as in your example? If so, you can

1. find the first index of "uuid:" as i, then
2. substring 0 to i+5 [assuming you needed it at all], and
3. substring i+5 to i+41, if I counted that right (36 characters in length).
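
The steps above could be sketched roughly as follows. The class name UuidIndexOfDemo, the method extractUuid, and the choice to return null on failure are assumptions made for illustration; the part before the UUID is skipped here since the question does not say it is needed.

```java
final class UuidIndexOfDemo {
    private static final String MARKER = "uuid:";
    private static final int UUID_LENGTH = 36; // 32 hex digits + 4 hyphens

    /** Returns the 36 characters after the first "uuid:", or null if unavailable. */
    static String extractUuid(String input) {
        int i = input.indexOf(MARKER);
        if (i < 0) {
            return null; // marker not present
        }
        int start = i + MARKER.length(); // i + 5
        int end = start + UUID_LENGTH;   // i + 41
        if (end > input.length()) {
            return null; // not enough characters left for a full UUID
        }
        return input.substring(start, end);
    }

    public static void main(String[] args) {
        String input = "&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;";
        System.out.println(extractUuid(input));
    }
}
```

Note that this only slices by position; it does not check that the 36 characters are actually a well-formed UUID.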

Along similar lines, your faster regex could be:

private static final Pattern URN_UUID_PATTERN =
    Pattern.compile("^&lt;urn:uuid:(.{36})&gt;");
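
For illustration, here is one way the anchored pattern might be combined with java.util.UUID, assuming the input literally begins with the escaped text &lt;urn:uuid: as in the question. The class name UrnUuidDemo, the method parse, and the decision to throw on a non-match are hypothetical.

```java
import java.util.UUID;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class UrnUuidDemo {
    private static final Pattern URN_UUID_PATTERN =
        Pattern.compile("^&lt;urn:uuid:(.{36})&gt;");

    /** Parses the UUID out of an escaped urn:uuid string. */
    static UUID parse(String input) {
        Matcher m = URN_UUID_PATTERN.matcher(input);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a urn:uuid string: " + input);
        }
        // The regex only checks the length of the capture;
        // UUID.fromString does the actual UUID parsing.
        return UUID.fromString(m.group(1));
    }

    public static void main(String[] args) {
        String input = "&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;";
        System.out.println(parse(input));
    }
}
```

Because the pattern is anchored with ^ and captures a fixed 36 characters, the matcher has very little searching to do.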

On the other hand, if all your input strings are going to start with those exact characters, there is no need to do step 1 in the previous suggestion; just use input.substring(13, 49);

Answer 2

Answered by Alexander du Sautoy

If this format is not going to change, I think a faster way is to use the String.substring() method. Example:

String val = "&lt;urn:uuid:4324e9d5-8d1f-442c-96a4-6146640da7ce&gt;";
String sUuid = val.substring(13, 49);
UUID uuid =  UUID.fromString(sUuid);

Internally, the String class (java.lang.String) uses a char array to store its data:

public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
    // ...
    /** The value is used for character storage. */
    private final char value[];
    // ...
}

The method 'String substring(int beginIndex, int endIndex)' copies the array elements from the begin index to the end index and creates a new String based on the new array. Copying an array is a very fast operation.
