MySQL: Are varchar(20) and varchar(50) the same?

Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must keep it under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/11132357/

Time: 2020-08-31 13:54:39  Source: igfitidea

varchar(20) and varchar(50) are same?

Tags: mysql, varchar

Asked by Habeeb Perwad

I saw this comment: "If you have 50 million values between 10 and 15 characters in a varchar(20) column, and the same 50 million values in a varchar(50) column, they will take up exactly the same space. That's the whole point of varchar, as opposed to char." Can anybody tell me the reason? See "What is a reasonable length limit on person 'Name' fields?"

Answered by Jirka Hanika

MySQL offers a choice of storage engines. The physical storage of data depends on the storage engine.

MyISAM Storage of VARCHAR

In MyISAM, a VARCHAR typically occupies just the actual length of the string plus one or two bytes that record the length. This is made practical by MyISAM's design limitation to table locking, as opposed to row locking. Performance consequences include a more compact cache profile, but also more complicated (slower) computation of record offsets.

(In fact, MyISAM gives you a degree of choice between fixed and variable physical row size table formats, depending on the column types occurring in the whole table. The occurrence of a VARCHAR changes only the default method, but the presence of a TEXT blob forces VARCHARs in the same table to use the variable-length method as well.)

The physical storage method is particularly important for indexes, which are a different story from tables. MyISAM uses space compression for both CHAR and VARCHAR columns, meaning that shorter data takes up less space in the index in both cases.

InnoDB Storage of VARCHAR

InnoDB, like most other current relational databases, uses a more sophisticated mechanism. VARCHAR columns whose maximum width is less than 768 bytes will be stored inline, with room reserved matching that maximum width. More accurately:

For each non-NULL variable-length field, the record header contains the length of the column in one or two bytes. Two bytes will only be needed if part of the column is stored externally in overflow pages or the maximum length exceeds 255 bytes and the actual length exceeds 127 bytes. For an externally stored column, the two-byte length indicates the length of the internally stored part plus the 20-byte pointer to the externally stored part. The internal part is 768 bytes, so the length is 768+20. The 20-byte pointer stores the true length of the column.
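
The rule in the quoted passage can be written out as a small Python sketch. This only restates the quoted conditions; the function name `innodb_length_header_bytes` and its parameters are hypothetical names chosen for illustration, not part of any MySQL API.

```python
def innodb_length_header_bytes(max_len_bytes: int,
                               actual_len_bytes: int,
                               stored_externally: bool = False) -> int:
    """Bytes used in the record header for one non-NULL variable-length field.

    Hypothetical helper that mirrors the quoted rule: two bytes are needed
    only if part of the column is stored externally, or if the maximum
    length exceeds 255 bytes AND the actual length exceeds 127 bytes.
    """
    if stored_externally:
        return 2
    if max_len_bytes > 255 and actual_len_bytes > 127:
        return 2
    return 1


# A 15-byte value costs the same header space in both declarations.
print(innodb_length_header_bytes(20, 15))    # 1
print(innodb_length_header_bytes(50, 15))    # 1
# Only a wide declaration combined with a long value needs two bytes.
print(innodb_length_header_bytes(300, 100))  # 1
print(innodb_length_header_bytes(300, 200))  # 2
```

Under this rule, the declared maximum affects the header only once it passes 255 bytes, which is why VARCHAR(20) and VARCHAR(50) behave identically here.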

InnoDB currently does not do space compression in its indexes, the opposite of MyISAM as described above.

Back to the question

All of the above is, however, just an implementation detail that may even change between versions. The true difference between CHAR and VARCHAR is semantic, and so is the one between VARCHAR(20) and VARCHAR(50). By ensuring that there is no way to store a 30-character string in a VARCHAR(20), the database makes life easier and better defined for the various processors and applications that it is supposed to integrate into a predictably behaving solution. This is the big deal.

Regarding personal names specifically, this question may give you some practical guidance. People with full names over 70 UTF-8 characters are in trouble anyway.

Answered by Thilo

Yes, that is indeed the whole point of VARCHAR. It only takes up as much space as the text is long.

If you had CHAR(50), it would take up 50 bytes (or characters) no matter how short the data really is (it would be padded, usually by spaces).
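
As a rough illustration of that padding, here is a minimal Python sketch. It assumes a single-byte character set (latin-1) so that characters and bytes coincide; `char_stored` is a hypothetical helper, not how any storage engine is actually implemented.

```python
def char_stored(value: str, width: int) -> bytes:
    """Simulate fixed-width CHAR(width) storage by right-padding with spaces.

    Assumes a single-byte character set (latin-1), so one character
    occupies exactly one byte.
    """
    if len(value) > width:
        raise ValueError("value is longer than the declared width")
    return value.ljust(width).encode("latin-1")


name = "Thilo"
padded = char_stored(name, 50)
# The data itself is 5 bytes, but the fixed-width slot is always 50.
print(len(name.encode("latin-1")), len(padded))  # 5 50
```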

Can Anybody tell me the reason?

Because people thought it was wasteful to store a lot of useless padding, they invented VARCHAR.

Answered by RandomSeed

The manual states:

The CHAR and VARCHAR types are declared with a length that indicates the maximum number of characters you want to store. (...)

In contrast to CHAR, VARCHAR values are stored as a one-byte or two-byte length prefix plus data. The length prefix indicates the number of bytes in the value. A column uses one length byte if values require no more than 255 bytes, two length bytes if values may require more than 255 bytes.

Notice that VARCHAR(255) is not the same as VARCHAR(256).
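
The manual's length-prefix rule can be worked through in a short Python sketch. It assumes a single-byte character set (latin-1) unless told otherwise; `length_prefix_bytes` and `varchar_bytes` are hypothetical helper names, and the result is the theoretical value size only, ignoring row and page overhead.

```python
def length_prefix_bytes(max_chars: int, max_bytes_per_char: int = 1) -> int:
    """One prefix byte if values need no more than 255 bytes, otherwise two."""
    return 1 if max_chars * max_bytes_per_char <= 255 else 2


def varchar_bytes(value: str, max_chars: int,
                  encoding: str = "latin-1",
                  max_bytes_per_char: int = 1) -> int:
    """Theoretical size of one VARCHAR(max_chars) value: prefix plus data."""
    if len(value) > max_chars:
        raise ValueError("value is longer than the declared maximum")
    data = value.encode(encoding)
    return length_prefix_bytes(max_chars, max_bytes_per_char) + len(data)


name = "Habeeb Perwad"  # 13 characters
print(varchar_bytes(name, 20))   # 14
print(varchar_bytes(name, 50))   # 14, same as VARCHAR(20)
print(varchar_bytes(name, 255))  # 14
print(varchar_bytes(name, 256))  # 15, one extra prefix byte
```

Because the rule is stated in bytes, a multi-byte character set moves the threshold: with up to 4 bytes per character, `length_prefix_bytes(64, 4)` already returns 2.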

This is theory. As habeebperwad suggests, the actual footprint of one row depends on the (engine) page size and the (hard disk) block size.
