Note: this page is a translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/963809/

Time: 2020-10-29 14:36:24  Source: igfitidea

Should I use composite primary keys or not?

java sql language-agnostic database-design

Asked by Mike

There seems to be only second-class support for composite database keys in Java's JPA (via the EmbeddedId or IdClass annotations). And when I read up on composite keys, regardless of language, people keep describing them as a bad thing. But I cannot understand why. Are composite keys still acceptable to use these days? If not, why not?

I've found one person who agrees with me: http://weblogs.sqlteam.com/jeffs/archive/2007/08/23/composite_primary_keys.aspx

But another who doesn't: http://weblogs.java.net/blog/bleonard/archive/2006/11/using_composite.html

Is it just me, or are people not able to make the distinction of where a composite key is appropriate or not? I see composite primary keys useful when the table doesn't represent an entity - i.e. when it represents a join table.

A simple example:

Actor { Id, Name, Email }
Movie { Id, Name, Year }
Character { Id, Name }
Role { Actor, Movie, Character }

Here Actor, Movie and Character obviously benefit from having an Id column as the primary key.

But Role is a Many-To-Many join table. I see no point in creating an id just to identify a row in the database. To me it seems obvious that the primary key is { Actor, Movie, Character }. It also seems like a rather limiting feature, especially if the data in the join table changes all the time, you could find yourself with primary key collisions once the primary key sequence wraps around to 0.
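
To make the idea concrete, here is a minimal sketch in plain Java (no JPA involved, Java 16+ for records) of what it means for { Actor, Movie, Character } to be the key. The names RoleTable and RoleKey are hypothetical and only illustrate the point: the three values together identify a row, so no extra id is needed to detect a duplicate.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical in-memory stand-in for the Role join table; not a real database.
class RoleTable {
    // The composite key: the three foreign-key values together identify a row.
    record RoleKey(long actorId, long movieId, long characterId) {}

    private final Set<RoleKey> rows = new HashSet<>();

    // Returns false (and stores nothing) when the same combination already
    // exists, which is the in-memory analogue of a primary-key violation.
    boolean insert(long actorId, long movieId, long characterId) {
        return rows.add(new RoleKey(actorId, movieId, characterId));
    }

    int size() {
        return rows.size();
    }
}
```

Records get value-based equals and hashCode for free, which is why the set rejects a repeated combination without any surrogate id.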

So, back to the original question, is it still acceptable practice to use composite primary keys? If not, why not?

Accepted answer by Billy

I think there's no problem using a composite key.

To me the database is a component in its own right that should be treated the same way we treat code: for instance, we want clean code that clearly communicates its intent, that does one thing and does it well, that doesn't add any unneeded level of complexity, etc.

Same thing with the db: if the PK is composite, this is the reality, so the model should be kept clean and clear. A composite PK is clearer than the mix of auto-increment + constraint. When you see an ID column that does nothing, you need to ask what the real PK is, whether there are any other hidden things you should be aware of, etc. A clear PK doesn't leave any doubts.

The db is the base of your app, and to me we need the most solid base we can have. On this base we'll build the app (web or not). So I can't see why we should bend the db model to conform to some specifics of one development tool/framework/language. The data is directing the application, not the other way around. What if the ORM changes in the future and becomes obsolete, and a better solution appears that imposes another model? We can't play with the db model to fit this or that framework; the model should stay the same, and it should not depend on what tool we're using to access the data ...

If the db model changes in the future, it should change because functionality changed. If we knew today how this functionality will change, we would be modeling it already. And any future change will be dealt with when the time comes; we can't predict, for instance, the impact on existing data, so one extra column doesn't guarantee that it will withstand any future change ...

We should design for today's functionality and keep the db model as simple as possible; this way it will be easy to change/evolve in the future.

Answer by Roee Adler

In my personal opinion you should avoid composite primary keys for several reasons:

  1. Future changes: when you design a database you sometimes miss what in the future will become important. A significant example for this is thinking a combination of two or more fields is unique (and thus can become a primary key), whereas in the future you want to allow NULLs or other non-unique values in them. Having a single primary key is a good solid solution against such changes.

  2. Uniformity: If every table has a unique numerical ID, and you also maintain some standard as to its name (e.g. "ID" or "tablename_id"), the code and SQL referring to it is clearer (in my opinion).

There are other reasons, but these are just a few.

The main question I would ask is why not use a separate primary key if you have a unique set of fields? What's the cost? An additional integer index? That's not too bad.

Hope that helps.

Answer by devio

Similar questions have been asked on SO, and there is no consensus ;)

If you develop a web application, you will love single column pk's, as they make your URLs simpler.

For a sequence to wrap you'd need 2 billion records in a single table (32bit), or 10^18 with 64 bit pk's.
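
The figures above can be checked with a quick back-of-the-envelope sketch. It assumes a signed key type whose sequence only uses positive values, and the class and method names (SequenceLimits, yearsToExhaust) are hypothetical:

```java
// Hypothetical helper for estimating when a numeric key sequence runs out.
class SequenceLimits {
    // Largest value a signed 32-bit key can hold: 2,147,483,647 (about 2 billion).
    static long max32() {
        return Integer.MAX_VALUE;
    }

    // Largest value a signed 64-bit key can hold: 9,223,372,036,854,775,807 (about 9.2 * 10^18).
    static long max64() {
        return Long.MAX_VALUE;
    }

    // Years until the key space is used up at a constant insert rate.
    static double yearsToExhaust(long maxKey, long rowsPerSecond) {
        double secondsPerYear = 365.25 * 24 * 60 * 60;
        return maxKey / (double) rowsPerSecond / secondsPerYear;
    }
}
```

At a sustained 1,000 inserts per second, a 32-bit sequence would last roughly 25 days, while a 64-bit one would last on the order of 290 million years, so wrap-around is mainly a concern for 32-bit keys on very busy tables.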

Btw, your data model does not allow for movie characters with unknown actors.

Answer by cletus

My general opinion is... no. Don't use composite primary keys.

They will typically complicate ORMs if you use them (ORMs sometimes go so far as to call composite primary keys "legacy behaviour") and generally if you're using multiple keys, one or more of them will tend to be natural rather than technical keys, which for me is the bigger problem: IMHO you should certainly favour technical primary keys.

More on this in Database Development Mistakes Made by AppDevelopers.

Answer by cletus

Religious wars have been, and still are, going on on this subject.

OO people have this zealous thing about "identity", and will tell you that the only thing that matters is the ability for you to "identify" "real-life objects" inside your programs, and that composite, "real-life" keys will only get you into trouble when trying to achieve that goal.

Data people have this thing about "uniqueness" that is perceived as "zealous" by the OO side, and will tell you that the only thing that matters is that if the business tells you that the combination of (values for) attribute X and attribute Y must be unique, then it is your job to see to it that the database enforces this business rule of uniqueness of the combined X+Y.

How you want your question answered is just a matter of which religion you prefer. My personal religion is the Data one. That religion has proven to be able to survive any hype and trend ever since 1969.

Answer by onedaywhen

It's a religious thing. I use natural keys and shun surrogates. I have no problem with composite keys either in theory or in practice.

Only the most trivial logical model would involve no composite keys. Call me lazy but I see no need to complicate the data model by introducing surrogates into the physical model on implementation. Sure, I'd consider one on a table if performance issues were found but I take the same approach as for denormalization i.e. as a last resort. Habitually using surrogates amounts to premature optimization, IMO.

Answer by Arjan

In Ruby on Rails, when not explicitly specifying otherwise, your Role table would be kind of like you described (if the columns are actually the IDs from the other tables). Still, in the database you might want to ensure unique combinations by defining a unique index on those three columns, if only to help the database optimize your queries. With that unique index in place and the framework not using any other primary key anyway, there is no need for an additional numeric primary key in your Role table. Having said that, the unique index could very well be defined as a composite primary key instead.

As for future changes: defining a strict database for your first iteration will prevent unexpected data from being persisted, which will make migrations much easier.

So: I would use composite primary keys.

Answer by Stuart Ellis

I would only ever use them in join tables. The only way to absolutely ensure that every record identifier is unique and consistent over time is to use a synthetic key.

Composite keys seem OK in theory, which is why they are tempting to use, but practice has shown that they usually indicate that there is a flaw in your data model. Worse still, in many cases they will fail to guarantee uniqueness, given a large enough data set. And data sets always grow over time, so using them may mean that you have planted a bomb in your application which will only explode when the application has been in production use for a while.

I think that people are underplaying ORMs. Every mainstream programming language has a de facto ORM, and has had one for years, because ORMs solve the fundamental incompatibility between OO and relational structures. Trying to write any complex, testable OO software against SQL databases without an ORM is very inefficient, at best.

Good ORMs also provide practices and tooling that make it much easier to create and maintain consistent high-quality database schema, so on average, a team will come out well ahead by working with an ORM. Handcrafting schema is rather like writing C++ ...people can do it, but in the real world it is so hard to maintain quality over time that the average product is not good.

Answer by Mitch Wheat

In terms of the domain model, I see nothing wrong with creating a composite primary key when the table doesn't represent an entity, i.e. when it represents a join table (as you mention in your question). The one caveat is that if the key is not monotonically increasing, you will get a certain amount of page splits during insertions.

Some ORM's don't cope well with composite primary keys, so perhaps it is safer to create a surrogate auto-integer for the primary key, and cover the columns with a non-clustered index.
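
As a rough illustration of that arrangement, the following plain-Java sketch models a surrogate auto-integer key plus a separate uniqueness check on the three natural columns. It is an in-memory analogy only, not how any particular database or ORM implements it, and the names (SurrogateRoleTable, uniqueIndex) are hypothetical:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory analogy of "surrogate key + unique index".
class SurrogateRoleTable {
    // Stand-in for an auto-increment column.
    private long nextId = 1;
    // Surrogate id -> (actor, movie, character).
    private final Map<Long, List<Long>> rowsById = new HashMap<>();
    // Stand-in for the unique index: (actor, movie, character) -> surrogate id.
    private final Map<List<Long>, Long> uniqueIndex = new HashMap<>();

    // Returns the new surrogate id, or -1 if the unique index rejects the combination.
    long insert(long actorId, long movieId, long characterId) {
        List<Long> triple = List.of(actorId, movieId, characterId);
        if (uniqueIndex.containsKey(triple)) {
            return -1;
        }
        long id = nextId++;
        rowsById.put(id, triple);
        uniqueIndex.put(triple, id);
        return id;
    }
}
```

The sketch shows that the surrogate id alone would happily accept the same combination twice; it is the second structure, the unique index, that actually enforces the business rule.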

Answer by HLGEM

I have almost never seen a case where a composite key was a good idea (the exception being a joining table consisting of only two surrogate keys). In the first place, you are wasting space in the child tables. You are harming performance in the joins, as integer joins are generally much faster. If you have the composite key as a clustered index (talking SQL Server here), then you are causing the database to be less efficient at storing records and less efficient at building other indexes, all of which use the clustered index.

When the data in the key changes (as it almost inevitably will), you need to update all related tables as well, causing massive unnecessary updates and wasting processing power on a task that is completely unneeded when the database is designed to use surrogate keys. Primary keys need not only to be unique but also to be unchanging. Composite keys often fail the second test.

So you are thinking of using a technique that harms performance, causes poor use of memory and database storage, uses way more space in child records (another waste of resources) and requires painful updating of what may be millions of child records when things change. And which might make it hard to use an ORM? Why would you do that? Because you are too lazy to put a surrogate key on and then define a unique index on the potential composite key? Is there any gain at all to using a composite index? For the lack of 5 minutes of work you are permanently harming your database?
