Java: Creating a UUID from a string with no dashes

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/18986712/

Date: 2020-08-12 13:06:31  Source: igfitidea

Creating a UUID from a string with no dashes

Tags: java, string, clojure, uuid

Asked by yayitswei

How would I create a java.util.UUID from a string with no dashes?

"5231b533ba17478798a3f2df37de2aD7" => #uuid "5231b533-ba17-4787-98a3-f2df37de2aD7"

Accepted answer by Jared314

Clojure's #uuid tagged literal is a pass-through to java.util.UUID/fromString, and fromString splits its input on the "-" and converts the pieces into two Long values. (The format for a UUID is standardized as 8-4-4-4-12 hex digits, but the "-" are really only there for validation and visual identification.)
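
To make the "two Long values" point concrete, here is a minimal Java sketch. The class and method names are made up for illustration and are not part of the answer. It parses the hyphenated form and returns each 64-bit half as hex, which shows that the hyphens do not survive into the stored value.

```java
import java.util.UUID;

class UuidHalvesSketch {
    // Returns the two 64-bit halves of a UUID as zero-padded, 16-digit hex strings.
    static String[] halves(String hyphenated) {
        UUID uuid = UUID.fromString(hyphenated);
        return new String[] {
            String.format("%016x", uuid.getMostSignificantBits()),
            String.format("%016x", uuid.getLeastSignificantBits())
        };
    }

    public static void main(String[] args) {
        String[] h = halves("5231b533-ba17-4787-98a3-f2df37de2aD7");
        System.out.println(h[0]); // 5231b533ba174787
        System.out.println(h[1]); // 98a3f2df37de2ad7
    }
}
```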

The straightforward solution is to reinsert the "-" and use java.util.UUID/fromString.

(defn uuid-from-string [data]
  (java.util.UUID/fromString
   (clojure.string/replace data
                           #"(\w{8})(\w{4})(\w{4})(\w{4})(\w{12})"
                           "$1-$2-$3-$4-$5")))

If you want something without regular expressions, you can use a ByteBuffer and DatatypeConverter.

(defn uuid-from-string [data]
  (let [buffer (java.nio.ByteBuffer/wrap 
                 (javax.xml.bind.DatatypeConverter/parseHexBinary data))]
    (java.util.UUID. (.getLong buffer) (.getLong buffer))))
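
For Java readers, the same ByteBuffer idea might look roughly like the sketch below. It is not from the answer: the class and method names are hypothetical, and it decodes the hex by hand with Character.digit instead of DatatypeConverter, on the assumption that you would rather not depend on javax.xml.bind (which is not bundled with the JDK from Java 11 onward).

```java
import java.nio.ByteBuffer;
import java.util.UUID;

class ByteBufferUuidSketch {
    // Decodes 32 hex digits into 16 bytes, then reads them as two big-endian longs.
    static UUID fromHex32(String hex) {
        if (hex.length() != 32) {
            throw new IllegalArgumentException("Expected 32 hex digits, got " + hex.length());
        }
        byte[] bytes = new byte[16];
        for (int i = 0; i < 16; i++) {
            int high = Character.digit(hex.charAt(2 * i), 16);
            int low = Character.digit(hex.charAt(2 * i + 1), 16);
            if (high < 0 || low < 0) {
                throw new IllegalArgumentException("Non-hex character near index " + (2 * i));
            }
            bytes[i] = (byte) ((high << 4) | low);
        }
        // ByteBuffer is big-endian by default; the first getLong() is the most significant half.
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        return new UUID(buffer.getLong(), buffer.getLong());
    }

    public static void main(String[] args) {
        System.out.println(fromHex32("5231b533ba17478798a3f2df37de2aD7"));
        // 5231b533-ba17-4787-98a3-f2df37de2ad7
    }
}
```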

Answer by maerics

You could do a goofy regular expression replacement:

String digits = "5231b533ba17478798a3f2df37de2aD7";
String uuid = digits.replaceAll(
    "(\\w{8})(\\w{4})(\\w{4})(\\w{4})(\\w{12})",
    "$1-$2-$3-$4-$5");
System.out.println(uuid); // => 5231b533-ba17-4787-98a3-f2df37de2aD7

Answer by Basil Bourque

tl;dr

java.util.UUID.fromString(
    "5231b533ba17478798a3f2df37de2aD7"
    .replaceFirst(
        "(\\p{XDigit}{8})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}+)", "$1-$2-$3-$4-$5"
    )
).toString()

5231b533-ba17-4787-98a3-f2df37de2ad7

Bits, Not Text

A UUID is a 128-bit value. A UUID is not actually made up of letters and digits; it is made up of bits. You can think of it as describing a very, very large number.

We could display those bits as one hundred twenty-eight 0 and 1 characters.

0111 0100 1101 0010 0101 0001 0101 0110 0110 0000 1110 0110 0100 0100 0100 1100 1010 0001 0111 0111 1010 1001 0110 1110 0110 0111 1110 1100 1111 1100 0101 1111

Humans do not easily read bits, so for convenience we usually represent the 128-bit value as a hexadecimal string made up of letters and digits.

74d25156-60e6-444c-a177-a96e67ecfc5f

Such a hex string is not the UUID itself, only a human-friendly representation. The hyphens are added per the UUID spec as canonical formatting, but are optional.

74d2515660e6444ca177a96e67ecfc5f
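
As a rough illustration of the "bits, not text" idea, the sketch below renders one UUID value both as 128 binary digits and as a hyphen-free hex string. The helper names are made up for this example; it only relies on getMostSignificantBits, getLeastSignificantBits, and Long.toBinaryString.

```java
import java.util.UUID;

class UuidBitsSketch {
    // Renders a 64-bit value as exactly 64 '0'/'1' characters, left-padded with zeros.
    static String bits64(long value) {
        String s = Long.toBinaryString(value);
        StringBuilder sb = new StringBuilder(64);
        for (int i = s.length(); i < 64; i++) {
            sb.append('0');
        }
        return sb.append(s).toString();
    }

    // All 128 bits of the UUID, most significant bit first.
    static String toBits(UUID uuid) {
        return bits64(uuid.getMostSignificantBits()) + bits64(uuid.getLeastSignificantBits());
    }

    // The same value as 32 hex digits with the canonical hyphens stripped.
    static String toHexNoHyphens(UUID uuid) {
        return uuid.toString().replace("-", "");
    }

    public static void main(String[] args) {
        UUID uuid = UUID.fromString("74d25156-60e6-444c-a177-a96e67ecfc5f");
        System.out.println(toBits(uuid));
        System.out.println(toHexNoHyphens(uuid)); // 74d2515660e6444ca177a96e67ecfc5f
    }
}
```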

By the way, the UUID spec clearly states that lowercase letters must be used when generating the hex string, while uppercase should be tolerated as input. Unfortunately, many implementations violate that lowercase-generation rule, including those from Apple, Microsoft, and others. See my blog post.

The following refers to Java, not Clojure.

In Java 7 (and earlier), you may use the java.util.UUID class to instantiate a UUID based on a hex string with hyphens as input. Example:

java.util.UUID uuidFromHyphens = java.util.UUID.fromString("6f34f25e-0b0d-4426-8ece-a8b3f27f4b63");
System.out.println( "UUID from string with hyphens: " + uuidFromHyphens );

However, that UUID class fails when given a hex string without hyphens. This failure is unfortunate, as the UUID spec does not require the hyphens in a hex string representation. This fails:

java.util.UUID uuidFromNoHyphens = java.util.UUID.fromString("6f34f25e0b0d44268ecea8b3f27f4b63");
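
To see the failure for yourself, a small sketch like the following should do. The class and method names are hypothetical. As far as I know, fromString signals the problem with an IllegalArgumentException, which is what this code catches.

```java
import java.util.UUID;

class HyphenlessFailureSketch {
    // Returns true if UUID.fromString rejects the given text.
    static boolean isRejected(String text) {
        try {
            UUID.fromString(text);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(isRejected("6f34f25e0b0d44268ecea8b3f27f4b63"));     // true
        System.out.println(isRejected("6f34f25e-0b0d-4426-8ece-a8b3f27f4b63")); // false
    }
}
```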

Regex

One workaround is to format the hex string to add the canonical hyphens. Here's my attempt at using regex to format the hex string. Beware… This code works, but I'm no regex expert. You should make this code more robust, say checking that the length of the string is 32 characters before formatting and 36 after.

// -----|  With Hyphens  |----------------------
java.util.UUID uuidFromHyphens = java.util.UUID.fromString( "6f34f25e-0b0d-4426-8ece-a8b3f27f4b63" );
System.out.println( "UUID from string with hyphens: " + uuidFromHyphens );
System.out.println();

// -----|  Without Hyphens  |----------------------
String hexStringWithoutHyphens = "6f34f25e0b0d44268ecea8b3f27f4b63";
// Use regex to format the hex string by inserting hyphens in the canonical format: 8-4-4-4-12
String hexStringWithInsertedHyphens =  hexStringWithoutHyphens.replaceFirst( "([0-9a-fA-F]{8})([0-9a-fA-F]{4})([0-9a-fA-F]{4})([0-9a-fA-F]{4})([0-9a-fA-F]+)", "$1-$2-$3-$4-$5" );
System.out.println( "hexStringWithInsertedHyphens: " + hexStringWithInsertedHyphens );
java.util.UUID myUuid = java.util.UUID.fromString( hexStringWithInsertedHyphens );
System.out.println( "myUuid: " + myUuid );
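
One way to add the robustness mentioned above (32 characters before formatting, 36 after) is sketched here. The class name, method name, and error messages are my own invention; the pattern is a stricter variant of the regex in this answer, requiring exactly 12 hex digits in the last group and matching the whole input.

```java
import java.util.UUID;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HyphenlessUuidParser {
    // Exactly 32 hex digits, grouped 8-4-4-4-12.
    private static final Pattern HEX_32 = Pattern.compile(
        "(\\p{XDigit}{8})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}{12})");

    static UUID parse(String hex) {
        if (hex == null || hex.length() != 32) {
            throw new IllegalArgumentException("Expected 32 hex digits: " + hex);
        }
        Matcher m = HEX_32.matcher(hex);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a hex string: " + hex);
        }
        String hyphenated = m.replaceFirst("$1-$2-$3-$4-$5");
        if (hyphenated.length() != 36) {
            // Should be unreachable after the checks above; kept as the "36 after" sanity check.
            throw new IllegalStateException("Unexpected length: " + hyphenated.length());
        }
        return UUID.fromString(hyphenated);
    }

    public static void main(String[] args) {
        System.out.println(parse("6f34f25e0b0d44268ecea8b3f27f4b63"));
        // 6f34f25e-0b0d-4426-8ece-a8b3f27f4b63
    }
}
```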

Posix Notation

You might find this alternative syntax more readable, using Posix notation within the regex, where \\p{XDigit} takes the place of [0-9a-fA-F] (see the Pattern doc):

String hexStringWithInsertedHyphens =  hexStringWithoutHyphens.replaceFirst( "(\\p{XDigit}{8})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}+)", "$1-$2-$3-$4-$5" );

Complete example.

java.util.UUID uuid =
        java.util.UUID.fromString (
                "5231b533ba17478798a3f2df37de2aD7"
                        .replaceFirst (
                                "(\\p{XDigit}{8})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}{4})(\\p{XDigit}+)",
                                "$1-$2-$3-$4-$5"
                        )
        );

System.out.println ( "uuid.toString(): " + uuid );

uuid.toString(): 5231b533-ba17-4787-98a3-f2df37de2ad7

Answer by Paweł Woźniak

A regexp solution is probably faster, but you can also look at this :)

// Needs: import java.math.BigInteger; import java.util.UUID;
String withoutDashes = "44e128a5-ac7a-4c9a-be4c-224b6bf81b20".replaceAll("-", "");
// Parse each 16-digit half as hex; longValue() keeps the low 64 bits
BigInteger bi1 = new BigInteger(withoutDashes.substring(0, 16), 16);
BigInteger bi2 = new BigInteger(withoutDashes.substring(16, 32), 16);
UUID uuid = new UUID(bi1.longValue(), bi2.longValue());
String withDashes = uuid.toString();

By the way, here is a conversion from 16 binary bytes to a UUID:

  InputStream is = ..binary input..;
  byte[] bytes = IOUtils.toByteArray(is); // IOUtils: presumably Apache Commons IO
  ByteBuffer bb = ByteBuffer.wrap(bytes);
  UUID uuidWithDashesObj = new UUID(bb.getLong(), bb.getLong());
  String uuidWithDashes = uuidWithDashesObj.toString();

Answer by Brad Knox

public static String addUUIDDashes(String idNoDashes) {
    StringBuilder idBuff = new StringBuilder(idNoDashes);
    idBuff.insert(20, '-');
    idBuff.insert(16, '-');
    idBuff.insert(12, '-');
    idBuff.insert(8, '-');
    return idBuff.toString();
}

Maybe someone else can comment on the computational efficiency of this approach. (It wasn't a concern for my application.)

Answer by toxi

A much (~ 900%) faster solution compared to using regexps and string manipulation is to just parse the hex string into 2 longs and create the UUID instance from those:

(defn uuid-from-string
  "Converts a 32-digit hex string into a java.util.UUID"
  [hex]
  (java.util.UUID.
    (Long/parseUnsignedLong (subs hex 0 16) 16)
    (Long/parseUnsignedLong (subs hex 16) 16)))
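
For those on the Java side, the same two-longs approach could be written roughly as follows. This is a sketch rather than a benchmarked implementation: the class and method names are made up, it assumes Java 8 or later for Long.parseUnsignedLong, and the length check is my addition.

```java
import java.util.UUID;

class TwoLongsSketch {
    // Parses the first and last 16 hex digits as unsigned 64-bit values.
    static UUID fromHex32(String hex) {
        if (hex.length() != 32) {
            throw new IllegalArgumentException("Expected 32 hex digits, got " + hex.length());
        }
        long hi = Long.parseUnsignedLong(hex.substring(0, 16), 16);
        long lo = Long.parseUnsignedLong(hex.substring(16), 16);
        return new UUID(hi, lo);
    }

    public static void main(String[] args) {
        System.out.println(fromHex32("5231b533ba17478798a3f2df37de2aD7"));
        // 5231b533-ba17-4787-98a3-f2df37de2ad7
    }
}
```

Note that parseUnsignedLong throws NumberFormatException (a subclass of IllegalArgumentException) on non-hex input, so callers can catch a single exception type.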

Answer by foozbar

Another solution would be something similar to Pawel's solution, but without creating new Strings and only solving the question's problem. If performance is a concern, avoid regex/split/replaceAll and UUID.fromString like the plague.

String hyphenlessUuid = in.nextString();
BigInteger bigInteger = new BigInteger(hyphenlessUuid, 16);
UUID uuid = new UUID(bigInteger.shiftRight(64).longValue(), bigInteger.longValue());

Answer by Adam Gent

I believe the following is the fastest in terms of performance. It is even slightly faster than the Long.parseUnsignedLong version. It is slightly altered code that comes from java-uuid-generator.

public static UUID from32(String id) {
    if (id == null) {
        throw new NullPointerException();
    }
    if (id.length() != 32) {
        throw new NumberFormatException("UUID has to be 32 char with no hyphens");
    }

    long lo, hi;
    lo = hi = 0;

    for (int i = 0, j = 0; i < 32; ++j) {
        int curr;
        char c = id.charAt(i);

        if (c >= '0' && c <= '9') {
            curr = (c - '0');
        }
        else if (c >= 'a' && c <= 'f') {
            curr = (c - 'a' + 10);
        }
        else if (c >= 'A' && c <= 'F') {
            curr = (c - 'A' + 10);
        }
        else {
            throw new NumberFormatException(
                    "Non-hex character at #" + i + ": '" + c + "' (value 0x" + Integer.toHexString(c) + ")");
        }
        curr = (curr << 4);

        c = id.charAt(++i);

        if (c >= '0' && c <= '9') {
            curr |= (c - '0');
        }
        else if (c >= 'a' && c <= 'f') {
            curr |= (c - 'a' + 10);
        }
        else if (c >= 'A' && c <= 'F') {
            curr |= (c - 'A' + 10);
        }
        else {
            throw new NumberFormatException(
                    "Non-hex character at #" + i + ": '" + c + "' (value 0x" + Integer.toHexString(c) + ")");
        }
        if (j < 8) {
            hi = (hi << 8) | curr;
        }
        else {
            lo = (lo << 8) | curr;
        }
        ++i;
    }
    return new UUID(hi, lo);
}

Answer by vahapt

Optimized version of @maerics's answer:

    String[] digitsList= {
            "daa70a7ffa904841bf9a81a67bdfdb45",
            "529737c950e6428f80c0bac104668b54",
            "5673c26e2e8f4c129906c74ec634b807",
            "dd5a5ee3a3c44e4fb53d2e947eceeda5",
            "faacc25d264d4e9498ade7a994dc612e",
            "9a1d322dc70349c996dc1d5b76b44a0a",
            "5fcfa683af5148a99c1bd900f57ea69c",
            "fd9eae8272394dfd8fd42d2bc2933579",
            "4b14d571dd4a4c9690796da318fc0c3a",
            "d0c88286f24147f4a5d38e6198ee2d18"
    };

    // Use a compiled pattern to improve performance of bulk operations
    Pattern pattern = Pattern.compile("(\\w{8})(\\w{4})(\\w{4})(\\w{4})(\\w{12})");

    for (int i = 0; i < digitsList.length; i++)
    {
        String uuid = pattern.matcher(digitsList[i]).replaceAll("$1-$2-$3-$4-$5");
        System.out.println(uuid);
    }