Original question: http://stackoverflow.com/questions/13779138/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
MySQL: How do you create a clustered index?
Asked by Wiz
I'm reading all about how clustered indexes work, and think they would be beneficial to my app. I understand that primary keys are automatically clustered indexes, but how would you add a clustered index to a non-primary key column?
For example, take a datastore for user posts. Each post has an ID and also a user-id, but since users can post multiple times, the user-id is not a primary key. How would you add a clustered index to the user-id, and is that even a good idea?
Answered by Olaf Dietsche
According to "Clustered and Secondary Indexes" in the MySQL manual, you can have only one clustered index per table.
All indexes other than the clustered index are known as secondary indexes.
If a table has no primary key but has another unique index, that index is used as the clustered index.
If you do not define a PRIMARY KEY for your table, MySQL locates the first UNIQUE index where all the key columns are NOT NULL and InnoDB uses it as the clustered index.
So I would conclude that you don't add a clustered index yourself; MySQL chooses either the primary key or the first suitable unique index of a table as the clustered index.
If you haven't defined a primary or unique index, MySQL creates an index itself:
If the table has no PRIMARY KEY or suitable UNIQUE index, InnoDB internally generates a hidden clustered index named GEN_CLUST_INDEX on a synthetic column containing row ID values. The rows are ordered by the ID that InnoDB assigns to the rows in such a table. The row ID is a 6-byte field that increases monotonically as new rows are inserted. Thus, the rows ordered by the row ID are physically in insertion order.
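The selection rule quoted above can be checked on a live server. The following is a minimal sketch, assuming MySQL 8.0 (where the InnoDB dictionary views are named information_schema.INNODB_TABLES and information_schema.INNODB_INDEXES) and an account with the PROCESS privilege. The three table names are hypothetical and exist only for illustration.

```sql
-- Hypothetical tables, one per case described in the manual.
CREATE TABLE t_with_pk (
  id  INT NOT NULL,
  val INT,
  PRIMARY KEY (id)                -- case 1: the primary key is the clustered index
) ENGINE = InnoDB;

CREATE TABLE t_with_unique (
  code INT NOT NULL,
  val  INT,
  UNIQUE KEY code_uq (code)       -- case 2: first UNIQUE index on NOT NULL columns
) ENGINE = InnoDB;

CREATE TABLE t_with_neither (
  val INT                         -- case 3: no PRIMARY KEY, no suitable UNIQUE index
) ENGINE = InnoDB;

-- List the InnoDB indexes of the three tables.
-- INNODB_TABLES.NAME has the form 'schema/table'.
SELECT t.NAME AS table_name,
       i.NAME AS index_name,
       i.TYPE AS index_type
FROM information_schema.INNODB_INDEXES AS i
JOIN information_schema.INNODB_TABLES  AS t ON t.TABLE_ID = i.TABLE_ID
WHERE t.NAME LIKE '%/t_with_%'
ORDER BY t.NAME, i.INDEX_ID;
```

Per the MySQL manual, a TYPE of 3 marks a clustered index and a TYPE of 1 marks the automatically generated GEN_CLUST_INDEX, so t_with_neither is the table expected to show the hidden index.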
Answered by ypercube
As @Olaf describes, InnoDB chooses which column or column combination will be the clustered index (the primary key, or the first unique index if there is not a primary key, or a hidden column if there is none of the two).
If you want to have a non-unique column as the clustered index, you could define the post_id as a unique key and make the combination of user_id and post_id the primary key, which will be chosen as the clustered index:
CREATE TABLE Post
( post_id INT NOT NULL AUTO_INCREMENT
, user_id INT NOT NULL
  -- other columns
, CONSTRAINT Post_PK
    PRIMARY KEY (user_id, post_id)  -- your clustered index
, CONSTRAINT post_id_UQ
    UNIQUE (post_id)                -- you still want uniqueness for the `post_id`
) ENGINE = InnoDB ;
Whether this is a good idea or not depends on your application, the data volumes and the queries you have. In general, the best properties of a clustered key are unique, narrow, static and ever-increasing. That's why auto-incrementing columns are best. Read about it in Kimberly L. Tripp's blog articles "Ever-increasing clustering key - the Clustered Index Debate..........again!" and "The Clustered Index Debate Continues..." (don't stop because they are written for SQL Server; the same issues apply 99% to InnoDB's clustered indexing).
A clustered key like (user_id, post_id) has the first 3 properties but it is not ever-increasing. This will result in fragmentation of the clustered index (CI) and possibly slower insertions into the table.
It will, however, result in more efficient queries that have WHERE user_id = ? conditions, range conditions such as WHERE user_id BETWEEN ? AND ?, or GROUP BY user_id groupings, as the wanted data will be found in the clustered index in one place and in the required order.
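To see whether such queries actually read from the clustered index, EXPLAIN can be run against them. This is only a sketch: the table below is an assumed, trimmed-down copy of the Post table, the body column is hypothetical, and the literal values stand in for the ? placeholders.

```sql
-- Assumed, trimmed-down copy of the Post table; `body` is a hypothetical column.
CREATE TABLE IF NOT EXISTS Post (
  post_id INT NOT NULL AUTO_INCREMENT,
  user_id INT NOT NULL,
  body    TEXT,
  PRIMARY KEY (user_id, post_id),   -- clustered index
  UNIQUE KEY post_id_UQ (post_id)   -- keeps post_id unique and indexed
) ENGINE = InnoDB;

-- Equality on the leading column of the clustered key.
EXPLAIN SELECT * FROM Post WHERE user_id = 42;

-- Range on the leading column of the clustered key.
EXPLAIN SELECT * FROM Post WHERE user_id BETWEEN 10 AND 20;

-- Grouping by the leading column of the clustered key.
EXPLAIN SELECT user_id, COUNT(*) FROM Post GROUP BY user_id;
```

If the key column of the EXPLAIN output reads PRIMARY, the rows are being read from the clustered index. On an empty or very small table the optimizer may choose differently, so it is worth checking with realistic data volumes.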
I suggest you run tests to choose which is best in your case.
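One way to set up such a test is to build the conventional layout next to the (user_id, post_id) layout and compare them under the same data and queries. The sketch below is an assumption about how that comparison table might look; the table name Post_by_id, the index name user_id_idx and the body column are all hypothetical.

```sql
-- Conventional alternative: ever-increasing clustered key plus a secondary index.
CREATE TABLE Post_by_id (
  post_id INT NOT NULL AUTO_INCREMENT,
  user_id INT NOT NULL,
  body    TEXT,
  PRIMARY KEY (post_id),            -- clustered index, rows stored in insertion order
  INDEX user_id_idx (user_id)       -- secondary index used to find a user's posts
) ENGINE = InnoDB;

-- Compare this plan with the plan for the same query on the (user_id, post_id) layout.
EXPLAIN SELECT * FROM Post_by_id WHERE user_id = 42;
```

With this layout, a lookup by user_id goes through the secondary index and then fetches each row from the clustered index by post_id, which is the extra work the composite primary key avoids at the cost of non-sequential inserts.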
There is also a MySQL variant, TokuDB, that allows multiple clustered indexes in a table. Details are in their article "Introducing Multiple Clustering Indexes".
Answered by Rick James
I suggest that you are asking the wrong question.
One alternative question is "Can I get rid of my current PRIMARY KEY so I can make this other thing 'clustered'?" Often an AUTO_INCREMENT can be eliminated or turned into a simple INDEX.
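As a rough illustration of turning an AUTO_INCREMENT primary key into a simple INDEX, the sketch below swaps the primary key of an existing table in a single ALTER TABLE. The table Post_old and the index name post_id_idx are hypothetical, and the statement rebuilds the table, so it should be tried on a copy first.

```sql
-- Hypothetical starting point: the AUTO_INCREMENT column is the PRIMARY KEY.
CREATE TABLE Post_old (
  post_id INT NOT NULL AUTO_INCREMENT,
  user_id INT NOT NULL,
  PRIMARY KEY (post_id)
) ENGINE = InnoDB;

-- Done as one statement so that the AUTO_INCREMENT column is never left
-- without an index, which InnoDB requires.
ALTER TABLE Post_old
  DROP PRIMARY KEY,
  ADD PRIMARY KEY (user_id, post_id),   -- new clustered index
  ADD INDEX post_id_idx (post_id);      -- plain index keeps AUTO_INCREMENT working
```

Note that a plain INDEX no longer enforces uniqueness of post_id; ADD UNIQUE can be used instead of ADD INDEX if that guarantee is still wanted.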
The more likely question is "What is the optimal index for this SELECT ...?". Others have pointed out that a second clustered index is out of the question for basic MySQL, so what is the next choice? I can't answer that without knowing the SELECT. However, my Index Cookbook answers the question for a large set of SELECTs.
Answered by Gauravk
When you define a primary key for an InnoDB table, MySQL uses the primary key as the clustered index.
If you do not have a primary key for a table, MySQL will search for the first UNIQUE index where all the key columns are NOT NULL and use this UNIQUE index as the clustered index.
In case the InnoDB table has no primary key or suitable UNIQUE index, MySQL internally generates a hidden clustered index named GEN_CLUST_INDEX on a synthetic column which contains the row ID values.