GUID vs INT IDENTITY
Notice: this page is taken from a Chinese-English parallel translation of a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, link to the original, and attribute it to the original authors (not me): Stack Overflow
Original URL: http://stackoverflow.com/questions/829284/
Asked by CodeMonkey1313
Possible Duplicate:
How do you like your primary keys?
I'm aware of the benefits of using a GUID, as well as the benefits of using an INT as a PK in a database. Considering that a GUID is in essence a 128-bit INT and a normal INT is 32-bit, the INT is a space saver (though this point is generally moot in most modern systems).
In the end, in what circumstances would you see yourself using an INT as a PK versus a GUID?
Accepted answer by Ronald Wildenberg
Kimberley Tripp (SQLSkills.com) has an article on using GUIDs as primary keys. She advises against it because of the unnecessary overhead.
Answer by Nordes
To answer your question: In the end, in what circumstances would you see yourself using an INT as a PK versus a GUID?
I would use a GUID if my system had an online/offline mode in which data saved in the offline version is transferred back to the server later, during a sync. That way, you can be sure that you won't end up with the same key twice in your database.
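A minimal sketch of that offline/sync scenario, assuming Python's standard `uuid` module stands in for whatever GUID generator the clients would really use. The names `new_record`, `client_a`, `client_b` and the dict `server` are hypothetical, made up for illustration. Note that random GUIDs make a collision extremely unlikely rather than strictly impossible.

```python
import uuid

def new_record(payload):
    # Each client assigns the key locally, with no round trip to the server.
    return {"id": uuid.uuid4(), "payload": payload}

# Two hypothetical offline clients create rows independently.
client_a = [new_record(f"a-{i}") for i in range(1000)]
client_b = [new_record(f"b-{i}") for i in range(1000)]

# On sync, the server inserts both batches as they are, keyed by id.
server = {}
collisions = 0
for row in client_a + client_b:
    if row["id"] in server:
        collisions += 1
    server[row["id"]] = row["payload"]

print(len(server), collisions)
```

With auto-increment INTs, both clients would have started at 1 and every row would need a new key on sync.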
Answer by Pontus Gagge
Apart from being a poor choice when you need to synchronize several database instances, INTs have one drawback I haven't seen mentioned: inserts always occur at one end of the index tree. This increases lock contention when you have a table with a lot of movement (since the same index pages have to be modified by concurrent inserts, whereas GUIDs will be inserted all over the index). The index may also have to be rebalanced more often if a B* tree or similar data structure is used.
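To make the "one end of the index" point concrete, here is a rough sketch. It assumes a sorted Python list is an acceptable stand-in for an ordered index (it is not a B-tree and models no locking), and `insert_positions` is a hypothetical helper. It only shows where new keys land, not how a real engine behaves.

```python
import bisect
import uuid

def insert_positions(keys):
    """For each key, return the relative position (0.0 to 1.0) at which it
    lands in a sorted list standing in for an ordered index."""
    index = []
    positions = []
    for key in keys:
        pos = bisect.bisect_left(index, key)
        # An insert into an empty index is counted as "at the end".
        positions.append(pos / len(index) if index else 1.0)
        index.insert(pos, key)
    return positions

n = 2000
int_positions = insert_positions(range(1, n + 1))
guid_positions = insert_positions([uuid.uuid4().int for _ in range(n)])

# Sequential keys: every insert lands at the right-hand end.
print(min(int_positions))
# Random keys: inserts are spread out, so the mean position is near the middle.
print(sum(guid_positions) / n)
```

Whether the hot right-hand end actually hurts is the measurement question; the sketch says nothing about that.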
Of course, INTs are easier on the eye when doing manual queries and report construction, and space consumption may add up through FK usages.
I'd be interested to see any measurements of how well e.g. SQL Server actually handles insert-heavy tables with IDENTITY PKs.
Answer by User
We use GUIDs everywhere in our very complex enterprise software. It works smoothly.
I believe Guids are semantically more suitable to serve as identifiers. There is also no point in unnecessarily worrying about performance until you are faced with that problem. Beware premature optimization.
There is also an advantage with database migration of any sort. With GUIDs you will have no collisions. If you attempt to merge several DBs where INTs are used for identity, you will have to replace their values. If these old values were used in URLs, the URLs will now be different, and an SEO hit will follow.
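A small sketch of what "you will have to replace their values" looks like in practice, assuming each database is modelled as a plain `{id: row}` dict. `merge_with_int_keys`, `site_a` and `site_b` are hypothetical names, and a real merge would also have to rewrite every foreign key using the returned mapping.

```python
def merge_with_int_keys(db_a, db_b):
    """Merge two {id: row} tables keyed by auto-increment ints.

    Rows from db_b whose ids are already taken get fresh ids. Returns the
    merged table and the old-id -> new-id mapping needed to fix references.
    """
    merged = dict(db_a)
    next_id = max(merged, default=0) + 1
    remap = {}
    for old_id, row in sorted(db_b.items()):
        if old_id in merged:
            remap[old_id] = next_id
            merged[next_id] = row
            next_id += 1
        else:
            merged[old_id] = row
            next_id = max(next_id, old_id + 1)
    return merged, remap

site_a = {1: "alice", 2: "bob"}
site_b = {1: "carol", 2: "dave", 3: "erin"}
merged, remap = merge_with_int_keys(site_a, site_b)
print(remap)  # {1: 3, 2: 4, 3: 5}
```

Here a URL that used to end in id 1 on the second site would now have to end in id 3, which is the URL breakage described above.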
Answer by Michael Borgwardt
the INT is a space saver (though this point is generally moot in most modern systems).
Not so. It may seem so at first glance, but note that the primary key of each table will be repeated multiple times throughout the database, in indexes and as a foreign key in other tables. And it will be involved in nearly any query containing its table - and very intensively when it's a foreign key used for a join.
Furthermore, remember that modern CPUs are very, very fast, but RAM speeds have not kept up. Cache behaviour therefore becomes increasingly important. And the best way to get good cache behaviour is to have smaller data sets. So the seemingly irrelevant difference between 4 and 16 bytes may well result in a noticeable difference in speed. Not necessarily always - but it's something to consider.
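A back-of-the-envelope version of the 4-versus-16-byte argument. The workload numbers are invented, `key_bytes` is a hypothetical helper, and the model assumes every secondary index entry carries a copy of the primary key (true of some engines, not all). Page and row overhead are ignored, so treat the result as a lower bound.

```python
INT_BYTES = 4    # 32-bit INT
GUID_BYTES = 16  # 128-bit GUID

def key_bytes(key_size, rows, secondary_indexes, fk_rows):
    """Rough bytes spent on copies of one table's primary key.

    One copy per row in the table itself, one per row in each secondary
    index (assumption: index entries carry the primary key), and one per
    referencing row in other tables.
    """
    copies = rows * (1 + secondary_indexes) + fk_rows
    return copies * key_size

# Hypothetical workload: 10 million rows, 3 secondary indexes,
# 50 million rows in other tables referencing this key.
workload = dict(rows=10_000_000, secondary_indexes=3, fk_rows=50_000_000)
int_total = key_bytes(INT_BYTES, **workload)
guid_total = key_bytes(GUID_BYTES, **workload)

print(int_total)               # 360000000
print(guid_total)              # 1440000000
print(guid_total - int_total)  # 1080000000
```

Under these made-up numbers the key alone costs roughly 1 GB more as a GUID, which is data that has to travel through the same caches as everything else.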
Answer by kemiller2002
When comparing values, as in a primary-to-foreign-key relationship, the INT will be faster. If the tables are indexed properly and are small, you might not see much of a slowdown, but you'd have to try it to be sure. INTs are also easier to read and to communicate to other people. It's a lot simpler to say, "Can you look at record 1234?" instead of "Can you look at record 031E9502-E283-4F87-9049-CE0E5C76B658?"
Answer by Marco van de Voort
Some OSes no longer generate GUIDs based on unique hardware features (CPUID, MAC) because it made tracing users too easy (privacy concerns). This means that GUID uniqueness is often no longer as universal as many people think.
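The hardware-based versus random distinction can be seen with Python's standard `uuid` module, used here only as an illustration (it is not what any particular OS or database does internally). `FAKE_NODE` is a made-up placeholder, passed explicitly so the example does not read a real MAC address.

```python
import uuid

# Version 1: timestamp plus a 48-bit node id, traditionally the MAC address.
FAKE_NODE = 0x001122334455  # hypothetical value, not a real interface
v1 = uuid.uuid1(node=FAKE_NODE)
print(v1.version, hex(v1.node))

# Version 4: random bits, nothing derived from the machine.
v4 = uuid.uuid4()
print(v4.version)
```

A version 1 value leaks the node id to anyone who sees it, which is the privacy concern; a version 4 value relies on randomness alone for its uniqueness.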
If you use some auto-id function of your database, the database could in theory make absolutely sure that there is no duplication.
Answer by Nicholas Piasecki
If the data lives in a single database (as most data for the applications that we write in general does), then I use an IDENTITY. It's easy, intended to be used that way, doesn't fragment the clustered index, and is more than enough. You'll run out of room at 2 billion-some records (~4 billion if you use negative values), but you'd be toast anyway if you had that many records in one table, and then you have a data warehousing problem.
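For reference, the "2 billion-some" and "~4 billion" figures follow directly from the width of a signed 32-bit INT. A quick check, with variable names chosen just for this sketch:

```python
INT_BITS = 32

# Largest positive value of a signed 32-bit integer.
max_positive = 2 ** (INT_BITS - 1) - 1
# Number of distinct values if negatives are used as well.
total_values = 2 ** INT_BITS

print(max_positive)  # 2147483647
print(total_values)  # 4294967296
```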
If the data lives in multiple, independent databases or interfaces with a third-party service, then I'll use the GUID that was likely already generated. A good example would be a UserProfiles table in the database that maps users in Active Directory to their user profiles in the application via the objectGUID that Active Directory assigned to them.
Answer by Craig
If you are planning on merging databases at some stage, i.e. for a multi-site replication type setup, GUIDs will save a lot of pain. But other than that I find INTs easier.
Answer by kevchadders
I always think PKs should be numeric where possible. Don't forget that having GUIDs as a PK will probably mean they are also used in other tables as foreign keys, so paging, indexes, etc. will be greater.