What is cardinality in MySQL?

Notice: this page is a side-by-side Chinese/English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/2566211/

Time: 2020-08-31 15:41:54  Source: igfitidea

mysql, indexing

Asked by OM The Eternity

What is cardinality in MySQL? Please explain in simple, non-technical language.

If the index details of a table show the cardinality of a field, say group_id, as 11, what does that mean?

Answered by Alexander Torstling

Max cardinality: all values are unique.

Min cardinality: all values are the same.

Some columns are called high-cardinality columns because they have constraints in place (like unique) that prohibit you from putting the same value in every row.

Cardinality is a property that affects the ability to cluster, sort and search data. It is therefore an important measurement for the query planners in databases: a heuristic they can use to choose the best plan.
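The two extremes described above can be illustrated with a minimal Python sketch. The helper name `cardinality` and the sample columns are hypothetical, chosen only for illustration; a database works from index statistics rather than from a function like this.

```python
def cardinality(values):
    """Return the number of distinct values in a column (given as a list)."""
    return len(set(values))


# Hypothetical sample columns, four rows each.
user_ids = [101, 102, 103, 104]        # all values unique
countries = ["SE", "SE", "SE", "SE"]   # all values the same

max_case = cardinality(user_ids)    # equals the number of rows
min_case = cardinality(countries)   # a single distinct value

print(max_case, min_case)
```

With four rows, the unique column reaches the maximum possible cardinality (the row count), while the constant column has the minimum cardinality of one.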

Answered by Kami

Wikipedia summarizes cardinality in SQL as follows:

In SQL (Structured Query Language), the term cardinality refers to the uniqueness of data values contained in a particular column (attribute) of a database table. The lower the cardinality, the more duplicated elements in a column. Thus, a column with the lowest possible cardinality would have the same value for every row. SQL databases use cardinality to help determine the optimal query plan for a given query.

Answered by Rhapsody

It is an estimate of the number of unique values in the index.

For a table with a single primary key column, the cardinality should normally be equal to the number of rows in the table.

More information.

Answered by Martin Smith

It's basically associated with the degree of uniqueness of a column's values, as per the Wikipedia article linked to by Kami.

It is important to consider because it affects indexing strategy. There is little point in indexing a low-cardinality column with only 2 possible values, as the index will not be selective enough to be used.
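One common way to put a number on "selective enough" is the ratio of distinct values to total rows. The sketch below assumes that definition; the answer itself does not give a formula, and the `selectivity` helper and both sample columns are hypothetical.

```python
def selectivity(values):
    """Fraction of distinct values among all rows; 1.0 means every row is unique."""
    if not values:
        raise ValueError("selectivity is undefined for an empty column")
    return len(set(values)) / len(values)


# Hypothetical columns of a 1000-row table.
flags = ["Y", "N"] * 500                                 # only 2 possible values
emails = [f"user{n}@example.com" for n in range(1000)]   # unique per row

print(selectivity(flags))   # 0.002
print(selectivity(emails))  # 1.0
```

On the two-value column, an index lookup for one value still matches half of the table, which is why an optimizer may prefer a full scan over such an index.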

Answered by Zon

The higher the cardinality, the better the differentiation of rows. Better differentiation means fewer branches have to be navigated to get the data.

Therefore higher cardinality values mean:

  • better performance of read queries;
  • bigger database size;
  • worse performance of write queries, because hidden index data is being updated.

Answered by user3112246

In mathematical terms, cardinality is the count of values in a set of values. A set can only contain unique values. An example would be the set "A".

Let the set "A" be: A={1,2,3} - the cardinality of that set is |A| = 3.

If set "A" contains 5 values, A={10,21,33,42,57}, then the cardinality is |A| = 5.

What that means in the context of MySQL is that the cardinality of a table column is the count of that column's unique values. If you are looking at the cardinality of your primary key column (e.g. table.id), then the cardinality of that column will tell you how many rows that table contains, as there is one unique ID for each row in the table. To get an approximate row count you don't have to perform a "COUNT(*)" on that table; simply look at the cardinality, keeping in mind that MySQL reports it as an estimate.
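The relationship between a primary key's cardinality and the row count can be checked with exact counts. This is a sketch only: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the `members` table and its 110 rows are hypothetical. The `group_id` column is given 11 distinct values to mirror the figure in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE members (id INTEGER PRIMARY KEY, group_id INTEGER NOT NULL)"
)
# 110 hypothetical rows spread evenly over 11 groups (group_id 0..10).
conn.executemany(
    "INSERT INTO members (id, group_id) VALUES (?, ?)",
    [(n, n % 11) for n in range(1, 111)],
)

row_count, id_cardinality, group_cardinality = conn.execute(
    "SELECT COUNT(*), COUNT(DISTINCT id), COUNT(DISTINCT group_id) FROM members"
).fetchone()
conn.close()

print(row_count, id_cardinality, group_cardinality)
```

Here the exact cardinality of `id` matches the row count, and `group_id` has a cardinality of 11. These are exact counts computed by scanning the data, unlike the estimate that MySQL shows in its index details.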

Answered by Junjie Li

From the manual:

Cardinality

An estimate of the number of unique values in the index. This is updated by running ANALYZE TABLE or myisamchk -a. Cardinality is counted based on statistics stored as integers, so the value is not necessarily exact even for small tables. The higher the cardinality, the greater the chance that MySQL uses the index when doing joins.

And an analysis from Percona:

CREATE TABLE `antest` (
  `i` int(10) unsigned NOT NULL,
  `c` char(80) default NULL,
  KEY `i` (`i`),
  KEY `c` (`c`,`i`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1

mysql> select count(distinct c) from antest;
+-------------------+
| count(distinct c) |
+-------------------+
|               101 |
+-------------------+
1 row in set (0.36 sec)


mysql> select count(distinct i) from antest;
+-------------------+
| count(distinct i) |
+-------------------+
|               101 |
+-------------------+
1 row in set (0.20 sec)

mysql> select count(distinct i,c) from antest;
+---------------------+
| count(distinct i,c) |
+---------------------+
|               10201 |
+---------------------+
1 row in set (0.43 sec)

mysql> show index from antest;
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table  | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| antest |          1 | i        |            1 | i           | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
| antest |          1 | c        |            1 | c           | A         |        NULL |     NULL | NULL   | YES  | BTREE      |         |
| antest |          1 | c        |            2 | i           | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
3 rows in set (0.00 sec)

mysql> analyze table antest;
+--------------------------------+---------+----------+----------+
| Table                          | Op      | Msg_type | Msg_text |
+--------------------------------+---------+----------+----------+
| antest                         | analyze | status   | OK       |
+--------------------------------+---------+----------+----------+
1 row in set (0.01 sec)


mysql> show index from antest;
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table  | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| antest |          1 | i        |            1 | i           | A         |         101 |     NULL | NULL   |      | BTREE      |         |
| antest |          1 | c        |            1 | c           | A         |         101 |     NULL | NULL   | YES  | BTREE      |         |
| antest |          1 | c        |            2 | i           | A         |       10240 |     NULL | NULL   |      | BTREE      |         |
+--------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
3 rows in set (0.01 sec)
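The exact distinct counts in the session above can be reproduced in a small sketch. It uses Python's sqlite3 module as a stand-in for MySQL and assumes the table holds one row for every combination of 101 values of i and 101 values of c; the original analysis does not say how the data was populated, so that data set is an assumption.

```python
import sqlite3
from itertools import product

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE antest (i INTEGER NOT NULL, c TEXT)")
# Assumed data: one row per (i, c) combination of 101 x 101 values.
rows = [(i, f"value-{c}") for i, c in product(range(101), range(101))]
conn.executemany("INSERT INTO antest (i, c) VALUES (?, ?)", rows)

distinct_i = conn.execute("SELECT COUNT(DISTINCT i) FROM antest").fetchone()[0]
distinct_c = conn.execute("SELECT COUNT(DISTINCT c) FROM antest").fetchone()[0]
# SQLite has no multi-column COUNT(DISTINCT i, c), so count a DISTINCT subquery.
distinct_pairs = conn.execute(
    "SELECT COUNT(*) FROM (SELECT DISTINCT i, c FROM antest)"
).fetchone()[0]
conn.close()

print(distinct_i, distinct_c, distinct_pairs)
```

The exact pair count is 101 × 101 = 10201, whereas SHOW INDEX reports 10240 for the second part of the (c, i) key after ANALYZE TABLE. The difference illustrates the manual's point that the reported cardinality is an estimate.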

Answered by Aayush

Put simply, cardinality is the number of rows (tuples) in the table. The number of columns is called the "degree".