Notice: this page is a translated reproduction of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/5949/

Time: 2020-09-08 06:52:18  Source: igfitidea

What's your opinion on using UUIDs as database row identifiers, particularly in web apps?

Tags: database, web-applications, uuid

Asked by yukondude

I've always preferred to use long integers as primary keys in databases, for simplicity and (assumed) speed. But when using a REST or Rails-like URL scheme for object instances, I then end up with URLs like this:

http://example.com/user/783

And then the assumption is that there are also users with IDs of 782, 781, ..., 2, and 1. Assuming that the web app in question is secure enough to prevent people entering other numbers to view other users without authorization, a simple sequentially-assigned surrogate key also "leaks" the total number of instances (older than this one), in this case users, which might be privileged information. (For instance, I am user #726 in stackoverflow.)

Would a UUID/GUID be a better solution? Then I could set up URLs like this:

http://example.com/user/035a46e0-6550-11dd-ad8b-0800200c9a66

Not exactly succinct, but there's less implied information about users on display. Sure, it smacks of "security through obscurity" which is no substitute for proper security, but it seems at least a little more secure.

Is that benefit worth the cost and complexity of implementing UUIDs for web-addressable object instances? I think that I'd still want to use integer columns as database PKs just to speed up joins.

There's also the question of in-database representation of UUIDs. I know MySQL stores them as 36-character strings. Postgres seems to have a more efficient internal representation (128 bits?) but I haven't tried it myself. Anyone have any experience with this?

Update: for those who asked about just using the user name in the URL (e.g., http://example.com/user/yukondude), that works fine for object instances with names that are unique, but what about the zillions of web app objects that can really only be identified by number? Orders, transactions, invoices, duplicate image names, stackoverflow questions, ...

Accepted answer by Douglas Tosi

I can't speak to the web side of your question, but UUIDs are great for n-tier applications. PK generation can be decentralized: each client generates its own PK with a negligible risk of collision. And the speed difference is generally small.

Make sure your database supports an efficient storage datatype (16 bytes, 128 bits). At the very least, you can base64-encode the UUID's 16 raw bytes (22 characters once the padding is dropped) and use char(22).
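
As a rough illustration of the sizes involved, here is a minimal Python sketch using only the standard `uuid` and `base64` modules. The helper names `uuid_to_b64` and `b64_to_uuid` are made up for this example, and the URL-safe alphabet is an assumption (the answer only says "base64"). The sample value is the UUID from the question.

```python
import base64
import uuid


def uuid_to_b64(u: uuid.UUID) -> str:
    """Encode the 16 raw bytes as URL-safe base64 and drop the '==' padding."""
    return base64.urlsafe_b64encode(u.bytes).rstrip(b"=").decode("ascii")


def b64_to_uuid(s: str) -> uuid.UUID:
    """Restore the two padding characters and decode back to a UUID."""
    return uuid.UUID(bytes=base64.urlsafe_b64decode(s + "=="))


u = uuid.UUID("035a46e0-6550-11dd-ad8b-0800200c9a66")
encoded = uuid_to_b64(u)

# Text form, raw binary form, and unpadded base64 form.
print(len(str(u)), len(u.bytes), len(encoded))  # 36 16 22
```

The 22-character form is what fits in a char(22) column. A native 16-byte type, where the database offers one, is still smaller.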

I've used them extensively with Firebird and do recommend them.

Answer by Adam Tuttle

For what it's worth, I've seen a long-running stored procedure (9+ seconds) drop to just a few hundred milliseconds of run time simply by switching from GUID primary keys to integers. That's not to say displaying a GUID is a bad idea, but as others have pointed out, joining on them and indexing them is, by definition, not going to be anywhere near as fast as with integers.

Answer by SQLMenace

I can tell you that in SQL Server, if you use a uniqueidentifier (GUID) datatype and use the NEWID() function to create values, you will get horrible fragmentation because of page splits. The reason is that the values NEWID() generates are not sequential. SQL Server 2005 added the NEWSEQUENTIALID() function to remedy that.

One way to still use both a GUID and an int is to have a GUID and an int in a table so that the GUID maps to the int. The GUID is used externally, but the int is used internally in the DB.

For example:

457180FB-C2EA-48DF-8BEF-458573DA1C10    1
9A70FF3C-B7DA-4593-93AE-4A8945943C8A    2

1 and 2 will be used in joins, and the GUIDs in the web app. This table will be pretty narrow and should be pretty fast to query.
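
A minimal sketch of that mapping, assuming SQLite through Python's built-in `sqlite3` module rather than SQL Server. The table and column names (`user_guid_map`, `orders`, `user_id`) and the helper `items_for_guid` are hypothetical. The GUID is only matched in the narrow mapping table, and the join itself runs on integers.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Narrow mapping table: external GUID <-> internal integer key.
    CREATE TABLE user_guid_map (
        user_id INTEGER PRIMARY KEY,
        guid    TEXT NOT NULL UNIQUE
    );
    -- Other tables reference only the integer key.
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        user_id  INTEGER NOT NULL REFERENCES user_guid_map(user_id),
        item     TEXT NOT NULL
    );
""")

# The GUID is what the web app would put in URLs.
guid = str(uuid.uuid4()).upper()
user_id = conn.execute(
    "INSERT INTO user_guid_map (guid) VALUES (?)", (guid,)
).lastrowid
conn.execute("INSERT INTO orders (user_id, item) VALUES (?, ?)", (user_id, "widget"))


def items_for_guid(conn, guid):
    """Look up orders by the external GUID; the join uses the integer key."""
    rows = conn.execute(
        """
        SELECT o.item
        FROM user_guid_map AS m
        JOIN orders AS o ON o.user_id = m.user_id
        WHERE m.guid = ?
        """,
        (guid,),
    ).fetchall()
    return [row[0] for row in rows]


print(items_for_guid(conn, guid))  # ['widget']
```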

Answer by Jonathan Arkell

Why couple your primary key with your URI?

Why not have your URI key be human-readable (or unguessable, depending on your needs) and your primary index integer-based? That way you get the best of both worlds. A lot of blog software does that: the entry is exposed through a 'slug', and the numeric id is hidden away inside the system.

The added benefit here is that you now have a really nice URL structure, which is good for SEO. Obviously for a transaction this is not a good thing, but for something like stackoverflow, it is important (see URL up top...). Getting uniqueness isn't that difficult. If you are really concerned, store a hash of the slug inside a table somewhere, and do a lookup before insertion.
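
To make the slug idea concrete, here is a small Python sketch. The function names `slugify` and `unique_slug` are invented for this example, and the in-memory `taken` set merely stands in for the "lookup before insertion" against a table that the answer mentions. The numeric suffix is one possible way to resolve a clash, not something the answer prescribes.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Lower-case the title and collapse every non-alphanumeric run into '-'."""
    ascii_title = (
        unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


def unique_slug(title: str, taken: set) -> str:
    """Return a slug not yet in `taken`, appending -2, -3, ... on a clash."""
    base = slugify(title)
    slug, n = base, 2
    while slug in taken:
        slug = f"{base}-{n}"
        n += 1
    taken.add(slug)
    return slug


taken = set()
print(unique_slug("Hello, World!", taken))  # hello-world
print(unique_slug("Hello, World!", taken))  # hello-world-2
```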

Edit: Stackoverflow doesn't quite use the system I describe; see Guy's comment below.

Answer by Marius

We use GUIDs as primary keys for all our tables, as they double as the RowGUID for MS SQL Server replication. That makes it very easy when the client suddenly opens an office in another part of the world...

Answer by Josh

Rather than URLs like this:

http://example.com/user/783

Why not have:

http://example.com/user/yukondude

Which is friendlier to humans and doesn't leak that tiny bit of information?

Answer by Andrea Bertani

You could use an integer which is related to the row number but is not sequential. For example, you could take the 32 bits of the sequential ID and rearrange them with a fixed scheme (for example, bit 1 becomes bit 6, bit 2 becomes bit 15, etc.).
This is a reversible, one-to-one mapping, so two different IDs will always map to two different values.
It would obviously be easy to decode if someone takes the time to generate enough IDs and work out the scheme, but if I understand your problem correctly, you just want to not give away information too easily.
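
The bit rearrangement can be sketched in Python as follows. The particular scheme (source bit i moves to bit (13 * i + 5) mod 32) and the names `scramble` and `unscramble` are assumptions chosen for illustration, not the mapping from the answer. As the answer itself notes, this is obfuscation, not cryptographic protection.

```python
BITS = 32

# Fixed scheme: source bit i moves to bit (13 * i + 5) mod 32.
# 13 and 32 share no common factor, so every destination bit is used exactly once.
FORWARD = [(13 * i + 5) % BITS for i in range(BITS)]
BACKWARD = [0] * BITS
for src, dst in enumerate(FORWARD):
    BACKWARD[dst] = src


def _move_bits(value: int, table: list) -> int:
    """Copy each set bit of `value` to the position given by `table`."""
    out = 0
    for src, dst in enumerate(table):
        if (value >> src) & 1:
            out |= 1 << dst
    return out


def scramble(row_id: int) -> int:
    """Map a sequential 32-bit row id to its non-sequential public form."""
    if not 0 <= row_id < 2 ** BITS:
        raise ValueError("row_id must fit in 32 bits")
    return _move_bits(row_id, FORWARD)


def unscramble(public_id: int) -> int:
    """Invert scramble() to recover the original row id."""
    return _move_bits(public_id, BACKWARD)


# Neighbouring row ids no longer look like neighbours.
for row_id in (781, 782, 783):
    print(row_id, scramble(row_id))
```

Because the mapping is a permutation of bit positions, the number of set bits is preserved, which is one of the ways an observer could start to work out the scheme.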

Answer by Brian Lyttle

I don't think a GUID gives you many benefits. Users hate long, incomprehensible URLs.

Create a shorter ID that you can map to the URL, or enforce a unique user name convention (http://example.com/user/brianly). The guys at 37Signals would probably mock you for worrying about something like this when it comes to a web app.

Incidentally you can force your database to start creating integer IDs from a base value.
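
How the base value is set depends on the database. As a small sketch, assuming SQLite through Python's `sqlite3` module and a hypothetical `users` table, an AUTOINCREMENT counter can be seeded through SQLite's internal `sqlite_sequence` table. The base value 1000 is arbitrary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT NOT NULL)"
)

# SQLite keeps the last-used AUTOINCREMENT value per table in sqlite_sequence.
# Seeding it with 999 before any row exists makes the next generated id 1000.
conn.execute("INSERT INTO sqlite_sequence (name, seq) VALUES ('users', 999)")

first_id = conn.execute("INSERT INTO users (name) VALUES ('yukondude')").lastrowid
second_id = conn.execute("INSERT INTO users (name) VALUES ('brianly')").lastrowid
print(first_id, second_id)
```

This hides how many rows exist in total, but consecutive IDs are still guessable.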

Answer by Michael Barker

It also depends on what you care about for your application. For n-tier apps, GUIDs/UUIDs are simpler to implement and easier to port between different databases. To produce integer keys, some databases support a sequence object natively and some require custom construction of a sequence table.

Integer keys probably (I don't have numbers) provide an advantage for query and indexing performance as well as space usage. Direct DB querying is also much easier with numeric keys: they are easier to remember, so there is less copy/paste.

Answer by Daniel Alexiuc

I've tried both in real web apps.

My opinion is that it is preferable to use integers and have short, comprehensible URLs.

As a developer, it feels a little bit awful seeing sequential integers and knowing that some information about total record count is leaking out, but honestly - most people probably don't care, and that information has never really been critical to my businesses.

Having long, ugly UUID URLs seems to me like much more of a turn-off to normal users.
