SQL - many-to-many table primary key
Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me): Stack Overflow.
Original question: http://stackoverflow.com/questions/2190272/
Asked by Andy White
This question comes up after reading a comment in this question:
When you create a many-to-many table, should you create a composite primary key on the two foreign key columns, or create an auto-increment surrogate "ID" primary key and just put indexes on your two FK columns (and maybe a unique constraint)? What are the performance implications for inserting new records and re-indexing in each case?
Basically, this:
PartDevice
----------
PartID (PK/FK)
DeviceID (PK/FK)
vs. this:
PartDevice
----------
ID (PK/auto-increment)
PartID (FK)
DeviceID (FK)
The commenter says:
making the two IDs the PK means the table is physically sorted on the disk in that order. So if we insert (Part1/Device1), (Part1/Device2), (Part2/Device3), then (Part 1/Device3) the database will have to break the table apart and insert the last one between entries 2 and 3. For many records, this becomes very problematic as it involves shuffling hundreds, thousands, or millions of records every time one is added. By contrast, an autoincrementing PK allows the new records to be tacked on to the end.
The reason I'm asking is because I've always been inclined to do the composite primary key with no surrogate auto-increment column, but I'm not sure if the surrogate key is actually more performant.
Accepted answer by paxdiablo
With a simple two-column many-to-many mapping, I see no real advantage to having a surrogate key. Having a primary key on (col1,col2) is guaranteed unique (assuming your col1 and col2 values in the referenced tables are unique) and a separate index on (col2,col1) will catch those cases where the opposite order would execute faster. The surrogate is a waste of space.
You won't need indexes on the individual columns since the table should only ever be used to join the two referenced tables together.
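As a rough illustration of the layout described above, here is a minimal sketch using SQLite through Python's standard sqlite3 module. The choice of SQLite, the table names (part, device, part_device) and the index name idx_device_part are assumptions made for the example; the answer itself names no particular DBMS, and other engines use different DDL details and planners.

```python
import sqlite3

# Assumption: SQLite in memory; all table, column and index names are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part   (part_id   INTEGER PRIMARY KEY);
    CREATE TABLE device (device_id INTEGER PRIMARY KEY);

    -- Link table: the composite primary key doubles as the uniqueness guarantee.
    CREATE TABLE part_device (
        part_id   INTEGER NOT NULL REFERENCES part(part_id),
        device_id INTEGER NOT NULL REFERENCES device(device_id),
        PRIMARY KEY (part_id, device_id)
    );

    -- Reverse-order index for lookups that start from the device side.
    CREATE UNIQUE INDEX idx_device_part ON part_device (device_id, part_id);
""")

conn.executemany("INSERT INTO part VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO device VALUES (?)", [(1,), (2,), (3,)])
conn.executemany("INSERT INTO part_device VALUES (?, ?)", [(1, 1), (1, 2), (2, 3)])

# A repeated pair violates the primary key, so no surrogate is needed to block duplicates.
try:
    conn.execute("INSERT INTO part_device VALUES (1, 1)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print(duplicate_rejected)  # True

# Ask the planner how it would serve a device-first lookup.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT part_id FROM part_device WHERE device_id = ?", (3,)
).fetchall()
plan_text = " ".join(row[-1] for row in plan)  # last column holds the plan description
print(plan_text)
```

In this sketch the device-first query can be answered from idx_device_part alone, while part-first queries go through the primary key; no single-column indexes are created.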
That comment you refer to in the question is not worth the electrons it uses, in my opinion. It sounds like the author thinks the table is stored in an array rather than an extremely high performance balanced multi-way tree structure.
For a start, it's never necessary to store or get at the table sorted, just the index. And the index won't be stored sequentially; it'll be stored in an efficient manner so that it can be retrieved quickly.
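To make the commenter's scenario concrete, the following sketch inserts the same four pairs in the same arrival order into a composite-key table. It assumes SQLite via Python's sqlite3 module and an illustrative part_device table; it says nothing about insert cost or on-disk layout, which are engine-specific. It only shows that an out-of-order key is accepted and that key-ordered reads need no reshuffling by the application.

```python
import sqlite3

# Assumption: SQLite in memory; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE part_device (
        part_id   INTEGER NOT NULL,
        device_id INTEGER NOT NULL,
        PRIMARY KEY (part_id, device_id)
    )
""")

# Same arrival order as in the quoted comment: the last pair sorts before (2, 3).
arrival_order = [(1, 1), (1, 2), (2, 3), (1, 3)]
conn.executemany("INSERT INTO part_device VALUES (?, ?)", arrival_order)

# Reading in key order is left entirely to the database.
key_order = conn.execute(
    "SELECT part_id, device_id FROM part_device ORDER BY part_id, device_id"
).fetchall()
print(key_order)  # [(1, 1), (1, 2), (1, 3), (2, 3)]
```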
In addition, the vast majority of database tables are read far more often than written. That makes anything you do on the select side far more relevant than anything on the insert side.
Answer by gbn
No surrogate key is needed for link tables.
One PK on (col1, col2) and another unique index on (col2, col1) is all you need.
Unless you use an ORM that can't cope and dictates your DB design for you...
Edit: I answered the same here: SQL: Do you need an auto-incremental primary key for Many-Many tables?
Answer by Jronny
An incremental primary key may be needed if other tables reference the link table. The many-to-many table might hold details that another table needs to pull up through that incremental primary key.
For example:
PartDevice
----------
ID (PK/auto-increment)
PartID (FK)
DeviceID (FK)
Other Details
It's easy to pull the 'Other Details' using PartDevice.ID as the FK in the referencing table. In that case an incremental primary key is needed.
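A minimal sketch of that situation, assuming SQLite through Python's sqlite3 module: the part_device_note child table and the other_details column are hypothetical names invented for the example. The unique constraint on (part_id, device_id) is kept so that the surrogate key does not let duplicate pairs slip in.

```python
import sqlite3

# Assumption: SQLite in memory; part_device_note is a hypothetical referencing table.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE part_device (
        id            INTEGER PRIMARY KEY AUTOINCREMENT,
        part_id       INTEGER NOT NULL,
        device_id     INTEGER NOT NULL,
        other_details TEXT,
        UNIQUE (part_id, device_id)  -- the pair must still be unique
    );

    -- The child table needs only one column to point at a link row.
    CREATE TABLE part_device_note (
        part_device_id INTEGER NOT NULL REFERENCES part_device(id),
        note           TEXT NOT NULL
    );
""")

link_id = conn.execute(
    "INSERT INTO part_device (part_id, device_id, other_details) VALUES (1, 3, 'rev B')"
).lastrowid
conn.execute("INSERT INTO part_device_note VALUES (?, ?)", (link_id, "checked"))

# Pull the link row's details through the single-column foreign key.
details = conn.execute("""
    SELECT pd.other_details
    FROM part_device_note AS n
    JOIN part_device AS pd ON pd.id = n.part_device_id
""").fetchone()[0]

# A note that points at a missing link row is rejected.
try:
    conn.execute("INSERT INTO part_device_note VALUES (?, ?)", (999, "orphan"))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The same reference could also be expressed with a two-column foreign key onto a composite primary key; the surrogate mainly keeps the referencing table narrower.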
Answer by Bernhard Hofmann
The shortest and most direct way I can answer your question is to say that there will be a performance impact if the two tables you are linking don't have sequential primary keys. As you stated/quoted, the index for the link table will either become fragmented, or the DBMS will work harder to insert records if the link table does not have its own sequential primary key. This is the reason most people put a sequentially incrementing primary key on link tables.
Answer by michael kosak
So it seems like if the ONLY job is to link the two tables, the best PK would be the dual-column PK.
But if it serves other purposes, then add a separate index column as the PK, and keep the two foreign keys together with a second unique index on them.
A unique index or a PK is the best way to make sure there are no duplicates. A PK also lets tools like Microsoft Management Studio do some of the work (creating views) for you.