SQL: When to use an auto-incremented primary key and when not to?

Notice: this page is a Chinese-English side-by-side translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/2186260/

Time: 2020-09-01 05:15:58  Source: igfitidea

When to use an auto-incremented primary key and when not to?

sql database-design

Asked by jamieb

I'm trying to figure out the "best practices" for deciding whether or not to add an auto-incrementing integer as the primary key to a table.

Let's say I have a table containing data about the chemical elements. The atomic number of each element is unique and will never change. So rather than adding an auto-incrementing integer column, it would probably make more sense to just use the atomic number as the key, correct?

Would the same be true if I had a table of books? Should I use the ISBN or an auto-incrementing integer for the primary key? Or a table of employees containing each person's SSN?

Accepted answer by Pablo Santa Cruz

There are a lot of already addressed questions on Stack Overflow that can help you with your questions. See here, here, here and here.

The term you should be looking for: surrogate keys.

Hope it helps.

Answer by Jay

This is a highly debated question, with lots of emotion on both sides.

In my humble opinion, if there's a good, usable natural key available -- like an ISBN -- I use it. I'm going to store it in the database anyway. Yes, a natural key is usually bigger than an integer auto-increment key, but I think this issue is overblown. Disk space is cheap today. I'd worry more about it taking longer to process. If you were talking about an 80-byte text field as a primary key, I'd say no. But if you're thinking of using a 10-byte ISBN instead of an 8-byte big integer, I can't imagine that brings much of a performance penalty.

Sometimes there's a performance advantage to natural keys. Suppose, for example, I want to find how many copies of a given book have been sold. I don't care about any of the data from the Book master record. If the primary key is ISBN, I could simply write "select count(*) from sale where isbn='143573338X'". If I used an auto-increment key, I would have to do a join to look up the ISBN, and the query becomes more complex and slower, like "select count(*) from book join sale using (bookid) where isbn='143573338X'". (And I can assure you that, as this particular ISBN is for my book, the number of sale records is very small, so doing the join and reading one extra record is a big percentage difference!)
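
To make the two queries concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (book_a, sale_a, book_b, sale_b, saleid), the placeholder title, and the two sample sales are assumptions made up for illustration; only the ISBN and the shape of the two queries come from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Design A (assumed schema): the ISBN is the primary key,
    -- so each sale row stores it directly.
    CREATE TABLE book_a (isbn TEXT NOT NULL PRIMARY KEY, title TEXT);
    CREATE TABLE sale_a (saleid INTEGER PRIMARY KEY,
                         isbn TEXT REFERENCES book_a(isbn));

    -- Design B (assumed schema): an auto-increment surrogate key;
    -- sale rows only know the bookid.
    CREATE TABLE book_b (bookid INTEGER PRIMARY KEY AUTOINCREMENT,
                         isbn TEXT NOT NULL UNIQUE, title TEXT);
    CREATE TABLE sale_b (saleid INTEGER PRIMARY KEY,
                         bookid INTEGER REFERENCES book_b(bookid));

    -- Made-up sample data: one book, two sales, in both designs.
    INSERT INTO book_a (isbn, title) VALUES ('143573338X', 'placeholder title');
    INSERT INTO sale_a (isbn) VALUES ('143573338X'), ('143573338X');
    INSERT INTO book_b (isbn, title) VALUES ('143573338X', 'placeholder title');
    INSERT INTO sale_b (bookid) VALUES (1), (1);
""")


def count_sales_natural(conn, isbn):
    """Natural key: the sale table alone answers the question."""
    sql = "SELECT count(*) FROM sale_a WHERE isbn = ?"
    return conn.execute(sql, (isbn,)).fetchone()[0]


def count_sales_surrogate(conn, isbn):
    """Surrogate key: a join is needed to translate the ISBN into a bookid."""
    sql = "SELECT count(*) FROM book_b JOIN sale_b USING (bookid) WHERE isbn = ?"
    return conn.execute(sql, (isbn,)).fetchone()[0]


print(count_sales_natural(conn, "143573338X"))
print(count_sales_surrogate(conn, "143573338X"))
```

Both functions return the same count; the difference is only that the second has to touch the book table as well. How much that costs in practice depends on the engine, the indexes, and the data volume.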

Another advantage of natural keys is that when you have to work on the database and you look at records that refer back to this table by key, it's easy to see what record they're referring to.

On the other hand, if there is no good, obvious natural key, don't try to cobble together a crazy one. I've seen people try to make a natural key by concatenating the first 6 letters of the customer's first name, his year of birth, and his zip code, and then pray that it will be unique. That sort of silliness is just making trouble for yourself. Often people end up tacking on a sequence number to ensure it's unique anyway, and at that point, why bother? Why not just use the sequence number by itself as the key?

Answer by Craig Warren

You've got the idea right there.

Auto-increment should be used as a unique key when no unique key already exists for the items you are modelling. So for elements you could use the atomic number, or for books the ISBN.

But if people are posting messages on a message board, then these need a unique ID but don't contain one naturally, so we assign the next number from a list.

It makes sense to use natural keys where possible; just remember to declare the field as the primary key and ensure that it is indexed for performance.
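
As a rough illustration of declaring a natural key as the primary key, the sketch below uses Python's sqlite3 module. The book table, its columns, the placeholder titles, and the insert_book helper are hypothetical names chosen for this example. In SQLite, a TEXT primary key on an ordinary table is backed by an automatically created unique index; other engines behave similarly but the details differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: the ISBN itself is declared as the primary key.
conn.execute("CREATE TABLE book (isbn TEXT NOT NULL PRIMARY KEY, title TEXT)")


def insert_book(conn, isbn, title):
    """Insert a book; return False if the ISBN is already present."""
    try:
        conn.execute("INSERT INTO book (isbn, title) VALUES (?, ?)", (isbn, title))
        return True
    except sqlite3.IntegrityError:
        # The primary key constraint rejected a duplicate ISBN.
        return False


first = insert_book(conn, "143573338X", "placeholder title")
duplicate = insert_book(conn, "143573338X", "another placeholder")

# PRAGMA index_list rows are (seq, name, unique, ...); keep the unique ones.
indexes = conn.execute("PRAGMA index_list('book')").fetchall()
unique_indexes = [row[1] for row in indexes if row[2] == 1]

print(first, duplicate, unique_indexes)
```

The second insert is rejected by the key constraint, and the index list shows that the key column did not need a separate CREATE INDEX statement here.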

Answer by Rob Goodwin

The main problem that I have seen with the auto-incrementing integer approach arises when you export your data to bring it into another db instance, or even during an archive and restore operation. Because the integer has no relation to the data that it references, there is no way to determine if you have duplicates when restoring or adding data to an existing database. If you want no relationship between the data contained in the row and the PK, I would just use a GUID. Not very user-friendly to look at, but it solves the above problem.
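
The export problem can be sketched as follows, again with Python's sqlite3 plus the standard uuid module. The message table, the helper names (make_db, add_message, merge), and the sample rows are all assumptions for illustration. Two independent databases each hand out id 1 to unrelated rows, so copying one into the other collides; random GUIDs generated with uuid.uuid4() make such a collision extremely unlikely.

```python
import sqlite3
import uuid


def make_db(use_guid):
    """Create an in-memory database with a hypothetical message table."""
    conn = sqlite3.connect(":memory:")
    if use_guid:
        conn.execute("CREATE TABLE message (id TEXT NOT NULL PRIMARY KEY, body TEXT)")
    else:
        conn.execute(
            "CREATE TABLE message (id INTEGER PRIMARY KEY AUTOINCREMENT, body TEXT)"
        )
    return conn


def add_message(conn, body, use_guid):
    if use_guid:
        # The key is generated by the application, independent of any database.
        conn.execute(
            "INSERT INTO message (id, body) VALUES (?, ?)", (str(uuid.uuid4()), body)
        )
    else:
        # The key is whatever number this particular database hands out next.
        conn.execute("INSERT INTO message (body) VALUES (?)", (body,))


def merge(source, target):
    """Copy rows with their keys; return how many were rejected as collisions."""
    collisions = 0
    for row in source.execute("SELECT id, body FROM message").fetchall():
        try:
            target.execute("INSERT INTO message (id, body) VALUES (?, ?)", row)
        except sqlite3.IntegrityError:
            collisions += 1
    return collisions


results = {}
for use_guid in (False, True):
    db_a, db_b = make_db(use_guid), make_db(use_guid)
    add_message(db_a, "written on instance A", use_guid)
    add_message(db_b, "written on instance B", use_guid)
    results[use_guid] = merge(db_a, db_b)

print(results)
```

With integer keys, the two unrelated messages both carry id 1 and the merge cannot tell them apart; with GUIDs, both rows land in the target untouched.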

Answer by Klaus Byskov Pedersen

With regard to using ISBN and SSN, you really have to think about how many rows in other tables are going to reference these through foreign keys, because those IDs will take up much more space than an integer and thus may lead to a waste of disk space and possibly to worse join performance.

Answer by Keith Adler

I'm trying to figure out the "best practices" for deciding whether or not to add an auto-incrementing integer as the primary key to a table.

Use it as a unique identifier for a dataset where the PKey is not part of user-managed data.

Let's say I have a table containing data about the chemical elements. The atomic number of each element is unique and will never change. So rather than adding an auto-incrementing integer column, it would probably make more sense to just use the atomic number as the key, correct?

Yes.

Would the same be true if I had a table of books? Should I use the ISBN or an auto-incrementing integer for the primary key? Or a table of employees containing each person's SSN?

ISBNs/SS#s are assigned by third parties and, because of their large storage size, would be a highly inefficient way to uniquely identify a row. Remember, PKeys are useful when you join tables. Why use a large data format like an ISBN, which would be numerous textual characters, as the unique identifier when a small and compact format like Integer is available?

Answer by Simon

Old topic, I know, but one other thing to consider: given that most RDBMSes lay out blocks on disk using the PK, an auto-incrementing PK will massively increase your contention, because every new row lands at the same end of the table. This may not be an issue for the baby database you're mucking around with, but believe me, it can cause massive performance issues at the bigger end of town.

If you must use an auto-incrementing ID, maybe consider using it as part of a PK. Tack it on the end to maintain uniqueness.

Also, it is best to exhaust all possibilities for natural PKs before jumping to a surrogate. People are generally lazy with this.
