How should I store GUID in MySQL tables?

Notice: this page is a Chinese-English translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/412341/

Tags: mysql, guid, uuid

Asked by CDR

Do I use varchar(36) or are there any better ways to do it?

Accepted answer by thaBadDawg

When I asked my DBA about the best way to store GUIDs for my objects, he asked why I needed to store 16 bytes when I could do the same thing in 4 bytes with an integer. Since he put that challenge out there to me, I thought now was a good time to mention it. That being said...

You can store a GUID as a CHAR(16) binary if you want to make the most efficient use of storage space.
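
A minimal sketch of that compact layout. The table and column names are hypothetical, and declaring the column as BINARY(16) is my assumption about what the answer intends. The built-in UNHEX() and HEX() functions do the conversion between the 32 hex digits and the 16 stored bytes.

```sql
-- Hypothetical table; the names are for illustration only.
CREATE TABLE objects (
    id   BINARY(16)   NOT NULL PRIMARY KEY,  -- 16 raw bytes instead of 36 characters
    name VARCHAR(100) NOT NULL
);

-- Strip the dashes, then decode the 32 hex digits into 16 bytes.
INSERT INTO objects (id, name)
VALUES (UNHEX(REPLACE('6eec5eb6-9755-11e4-b981-feb7b39d48d6', '-', '')), 'example');

-- HEX() turns the stored bytes back into readable text.
SELECT HEX(id) AS id_hex, name FROM objects;
```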

Answer by Brian Fisher

I would store it as a char(36).

Answer by KCD

Adding to the answer by ThaBadDawg, use these handy functions (thanks to a wiser colleague of mine) to convert a 36-character string to a 16-byte array and back.

DELIMITER $$

CREATE FUNCTION `GuidToBinary`(
    $Data VARCHAR(36)
) RETURNS binary(16)
DETERMINISTIC
NO SQL
BEGIN
    DECLARE $Result BINARY(16) DEFAULT NULL;
    IF $Data IS NOT NULL THEN
        SET $Data = REPLACE($Data,'-','');
        SET $Result =
            CONCAT( UNHEX(SUBSTRING($Data,7,2)), UNHEX(SUBSTRING($Data,5,2)),
                    UNHEX(SUBSTRING($Data,3,2)), UNHEX(SUBSTRING($Data,1,2)),
                    UNHEX(SUBSTRING($Data,11,2)),UNHEX(SUBSTRING($Data,9,2)),
                    UNHEX(SUBSTRING($Data,15,2)),UNHEX(SUBSTRING($Data,13,2)),
                    UNHEX(SUBSTRING($Data,17,16)));
    END IF;
    RETURN $Result;
END

$$

CREATE FUNCTION `ToGuid`(
    $Data BINARY(16)
) RETURNS char(36) CHARSET utf8
DETERMINISTIC
NO SQL
BEGIN
    DECLARE $Result CHAR(36) DEFAULT NULL;
    IF $Data IS NOT NULL THEN
        SET $Result =
            CONCAT(
                HEX(SUBSTRING($Data,4,1)), HEX(SUBSTRING($Data,3,1)),
                HEX(SUBSTRING($Data,2,1)), HEX(SUBSTRING($Data,1,1)), '-', 
                HEX(SUBSTRING($Data,6,1)), HEX(SUBSTRING($Data,5,1)), '-',
                HEX(SUBSTRING($Data,8,1)), HEX(SUBSTRING($Data,7,1)), '-',
                HEX(SUBSTRING($Data,9,2)), '-', HEX(SUBSTRING($Data,11,6)));
    END IF;
    RETURN $Result;
END
$$

DELIMITER ;

CHAR(16) is actually a BINARY(16); choose your preferred flavour.

To follow the code better, take the digit-ordered GUID below as an example. (Illegal characters are used for illustrative purposes: each position holds a unique character.) The functions reorder the bytes to achieve a bit order for superior index clustering. The reordered GUID is shown below the example.

12345678-9ABC-DEFG-HIJK-LMNOPQRSTUVW
78563412-BC9A-FGDE-HIJK-LMNOPQRSTUVW

Dashes removed:

123456789ABCDEFGHIJKLMNOPQRSTUVW
78563412BC9AFGDEHIJKLMNOPQRSTUVW
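
As a quick check of the two functions above, a round trip through a syntactically valid GUID should return the original text. The GUID below is made up for illustration and is not taken from the answer.

```sql
-- Assumes GuidToBinary and ToGuid from above have already been created.
SELECT HEX(GuidToBinary('12345678-9ABC-DEF0-1234-56789ABCDEF0')) AS stored_bytes;
-- 78563412BC9AF0DE123456789ABCDEF0 (first three groups byte-swapped)

SELECT ToGuid(GuidToBinary('12345678-9ABC-DEF0-1234-56789ABCDEF0')) AS round_trip;
-- 12345678-9ABC-DEF0-1234-56789ABCDEF0
```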

Answer by Learning

char(36) would be a good choice. MySQL's UUID() function returns the same 36-character text format (hex digits with hyphens), so its output can be stored directly and used to retrieve such IDs from the database.
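
A small sketch of the text-based approach, assuming a hypothetical table named `objects_text`. UUID() supplies new IDs, and lookups compare against the plain 36-character string.

```sql
-- Hypothetical table; the names are for illustration only.
CREATE TABLE objects_text (
    id   CHAR(36)     NOT NULL PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

-- Let MySQL generate an ID, and insert one known ID for the lookup below.
INSERT INTO objects_text (id, name) VALUES (UUID(), 'generated');
INSERT INTO objects_text (id, name)
VALUES ('6eec5eb6-9755-11e4-b981-feb7b39d48d6', 'known');

-- No conversion functions are needed to read or filter.
SELECT name FROM objects_text WHERE id = '6eec5eb6-9755-11e4-b981-feb7b39d48d6';
```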

Answer by candu

"Better" depends on what you're optimizing for.

How much do you care about storage size/performance vs. ease of development? More importantly - are you generating enough GUIDs, or fetching them frequently enough, that it matters?

If the answer is "no", char(36) is more than good enough, and it makes storing/fetching GUIDs dead-simple. Otherwise, binary(16) is reasonable, but you'll have to lean on MySQL and/or your programming language of choice to convert back and forth from the usual string representation.
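
To put a number on the storage side of that trade-off, the query below compares the byte length of the two representations for one sample UUID. It only measures the value itself, not index or row overhead.

```sql
SELECT
    LENGTH('6eec5eb6-9755-11e4-b981-feb7b39d48d6') AS text_bytes,       -- 36
    LENGTH(UNHEX(REPLACE('6eec5eb6-9755-11e4-b981-feb7b39d48d6', '-', '')))
                                                   AS binary_bytes;     -- 16
```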

Answer by Onkar Janwa

BINARY(16) would be fine, and better than using varchar(32).

Answer by bigh_29

The GuidToBinary routine posted by KCD should be tweaked to account for the bit layout of the timestamp in the GUID string. If the string represents a version 1 UUID, like those returned by MySQL's uuid() routine, then the time components are embedded in characters 1 through G, excluding the D.

12345678-9ABC-DEFG-HIJK-LMNOPQRSTUVW
12345678 = least significant 4 bytes of the timestamp in big endian order
9ABC     = middle 2 timestamp bytes in big endian
D        = 1 to signify a version 1 UUID
EFG      = most significant 12 bits of the timestamp in big endian

When you convert to binary, the best order for indexing would be: EFG9ABC12345678D + the rest.

You don't want to swap 12345678 to 78563412 because big endian already yields the best binary index byte order. However, you do want the most significant bytes moved in front of the lower bytes. Hence, EFG go first, followed by the middle bits and lower bits. Generate a dozen or so UUIDs with uuid() over the course of a minute and you should see how this order yields the correct rank.

select uuid(), 0
union 
select uuid(), sleep(.001)
union 
select uuid(), sleep(.010)
union 
select uuid(), sleep(.100)
union 
select uuid(), sleep(1)
union 
select uuid(), sleep(10)
union
select uuid(), 0;

/* output */
6eec5eb6-9755-11e4-b981-feb7b39d48d6
6eec5f10-9755-11e4-b981-feb7b39d48d6
6eec8ddc-9755-11e4-b981-feb7b39d48d6
6eee30d0-9755-11e4-b981-feb7b39d48d6
6efda038-9755-11e4-b981-feb7b39d48d6
6f9641bf-9755-11e4-b981-feb7b39d48d6
758c3e3e-9755-11e4-b981-feb7b39d48d6 

The first two UUIDs were generated closest in time. They differ only in the last 3 nibbles of the first block. These are the least significant bits of the timestamp, which means we want to push them to the right when we convert this to an indexable byte array. As a counterexample, the last ID is the most recent, but KCD's swapping algorithm would put it before the 3rd ID (3e sorts before dc; these are the last bytes of the first block).

The correct order for indexing would be:

1e497556eec5eb6... 
1e497556eec5f10... 
1e497556eec8ddc... 
1e497556eee30d0... 
1e497556efda038... 
1e497556f9641bf... 
1e49755758c3e3e... 

See this article for supporting information: http://mysql.rjweb.org/doc.php/uuid

*** Note that I don't split the version nibble from the high 12 bits of the timestamp. This is the D nibble from your example. I just throw it in front, so my binary sequence ends up being DEFG9ABC and so on. This implies that all my indexed UUIDs start with the same nibble. The article does the same thing.
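
The reordering described in this answer could be sketched as a single-expression function. The name `uuid_v1_to_ordered_bin` is hypothetical and the code is my reading of the DEFG9ABC12345678 layout from the note above, not something posted in the original answer. It only makes sense for version 1 UUIDs.

```sql
DELIMITER $$

-- Hypothetical helper: puts the most significant timestamp field first.
CREATE FUNCTION uuid_v1_to_ordered_bin(s CHAR(36))
RETURNS BINARY(16) DETERMINISTIC
RETURN UNHEX(CONCAT(
    SUBSTRING(s, 15, 4),   -- DEFG: version nibble + high 12 bits of the timestamp
    SUBSTRING(s, 10, 4),   -- 9ABC: middle 2 bytes of the timestamp
    SUBSTRING(s, 1, 8),    -- 12345678: low 4 bytes of the timestamp
    SUBSTRING(s, 20, 4),   -- HIJK: left in place
    SUBSTRING(s, 25, 12)   -- LMNOPQRSTUVW: left in place
))
$$

DELIMITER ;

SELECT HEX(uuid_v1_to_ordered_bin('6eec5eb6-9755-11e4-b981-feb7b39d48d6'));
-- 11E497556EEC5EB6B981FEB7B39D48D6
```

Because the text groups are already big endian, moving whole groups is enough; no bytes inside a group are swapped.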

Answer by sleepycal

For those just stumbling across this, there is now a much better alternative as per research by Percona.

It consists of reorganising the UUID chunks for optimal indexing, then converting into binary for reduced storage.

Read the full article here
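
The article itself is not reproduced here, so the following is only a sketch of the same reorder-then-pack idea, not necessarily the article's exact method. It assumes MySQL 8.0 or later, where the built-in UUID_TO_BIN() and BIN_TO_UUID() functions take an optional swap flag that exchanges the time-low and time-high groups.

```sql
-- Requires MySQL 8.0+; older servers do not have these built-ins.
SET @u = '6eec5eb6-9755-11e4-b981-feb7b39d48d6';

-- Swap flag 1 moves the third group in front of the first before packing.
SELECT HEX(UUID_TO_BIN(@u, 1)) AS swapped_bytes;
-- 11E497556EEC5EB6B981FEB7B39D48D6

-- The same flag must be passed when converting back to text.
SELECT BIN_TO_UUID(UUID_TO_BIN(@u, 1), 1) AS round_trip;
-- 6eec5eb6-9755-11e4-b981-feb7b39d48d6
```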

Answer by vsdev

I would suggest using the functions below, since the ones mentioned by @bigh_29 transform my GUIDs into new ones (for reasons I don't understand). Also, these were a little bit faster in the tests I did on my tables. https://gist.github.com/damienb/159151

DELIMITER |

CREATE FUNCTION uuid_from_bin(b BINARY(16))
RETURNS CHAR(36) DETERMINISTIC
BEGIN
  DECLARE hex CHAR(32);
  SET hex = HEX(b);
  RETURN LOWER(CONCAT(LEFT(hex, 8), '-', MID(hex, 9,4), '-', MID(hex, 13,4), '-', MID(hex, 17,4), '-', RIGHT(hex, 12)));
END
|

CREATE FUNCTION uuid_to_bin(s CHAR(36))
RETURNS BINARY(16) DETERMINISTIC
RETURN UNHEX(CONCAT(LEFT(s, 8), MID(s, 10, 4), MID(s, 15, 4), MID(s, 20, 4), RIGHT(s, 12)))
|

DELIMITER ;
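
A short usage check for the two functions above, using a sample UUID. These functions keep the byte order of the text form, so the stored bytes are simply the hex digits without dashes.

```sql
-- Assumes uuid_from_bin and uuid_to_bin from above have been created.
SELECT HEX(uuid_to_bin('6eec5eb6-9755-11e4-b981-feb7b39d48d6')) AS stored_bytes;
-- 6EEC5EB6975511E4B981FEB7B39D48D6

SELECT uuid_from_bin(uuid_to_bin('6eec5eb6-9755-11e4-b981-feb7b39d48d6')) AS round_trip;
-- 6eec5eb6-9755-11e4-b981-feb7b39d48d6
```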

Answer by George Hazan

If you have a char/varchar value formatted as the standard GUID, you can simply store it as BINARY(16) using the simple CAST(MyString AS BINARY(16)), without all those mind-boggling sequences of CONCAT + SUBSTR.

BINARY(16) fields are compared/sorted/indexed much faster than strings, and also take less than half the space in the database (16 bytes instead of 36).
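
One thing worth checking before relying on the CAST approach: as far as I can tell, CAST(... AS BINARY(16)) in MySQL does not decode hexadecimal digits. It keeps the first 16 bytes of the text, so two GUIDs that share their first 16 characters would end up with the same stored value. The comparison below is a sketch for verifying this on your own server, with UNHEX(REPLACE(...)) shown as the hex-decoding alternative.

```sql
SET @s = '6eec5eb6-9755-11e4-b981-feb7b39d48d6';

SELECT
    -- Raw bytes of the first 16 characters of the text, dashes included.
    HEX(CAST(@s AS BINARY(16)))      AS cast_result,
    -- The 16 bytes that the 32 hex digits actually encode.
    HEX(UNHEX(REPLACE(@s, '-', ''))) AS unhex_result;
-- unhex_result: 6EEC5EB6975511E4B981FEB7B39D48D6
```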