Original source: http://stackoverflow.com/questions/15477005/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Primary key as text
Asked by artaxerxe
In the databases course that I took during my education (approx. 4 years ago), I understood that it is recommended to avoid using character strings as a primary key's data type.
Can someone tell me the pros and cons of choosing a character varying data type for a primary key in SQL, and how far the above premise holds?
N.B.: I'm using a PostgreSQL database. I'm also dealing with a situation where such a table needs to be referenced from another one, which puts a foreign key on a character varying column. Please take that into account as well.
Accepted answer by DF_
The advantage of choosing a character datatype as a primary key field is that you choose what data it shows. As an example, you could have the email address as the key field for a users table. This eliminates the need for an additional column. Another advantage is that if you have a common data table that holds indexes of multiple other tables (think a NOTES table with an external reference to the FINANCE, CONTACT, and ADMIN tables), you can easily tell which table an entry came from (e.g. your FINANCE table has an index of F00001, the CONTACT table has an index of C00001, etc.). I'm afraid the list of disadvantages is going to be longer in this reply, as I'm against such an approach.
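A minimal sketch of the two advantages above, using Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs without a server. The column names, the sample rows, and the prefix mapping are assumptions made for illustration; the answer itself only names the F and C prefixes.

```python
import sqlite3

# Hypothetical mapping from key prefix to table name (not from the answer).
PREFIX_TO_TABLE = {"F": "finance", "C": "contact"}

def source_table(ref: str) -> str:
    """Return the table that a prefixed key such as 'F00001' belongs to."""
    return PREFIX_TO_TABLE[ref[0]]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- The email address itself is the key: no extra id column is needed.
    CREATE TABLE users   (email TEXT PRIMARY KEY, display_name TEXT);
    CREATE TABLE finance (id TEXT PRIMARY KEY, amount INTEGER);
    CREATE TABLE contact (id TEXT PRIMARY KEY, name TEXT);
    -- Shared table whose ref column can point at either table above.
    CREATE TABLE notes   (note_id INTEGER PRIMARY KEY, ref TEXT NOT NULL, body TEXT);
""")
conn.execute("INSERT INTO users VALUES ('alice@example.com', 'Alice')")
conn.execute("INSERT INTO finance VALUES ('F00001', 1200)")
conn.execute("INSERT INTO contact VALUES ('C00001', 'Bob')")
conn.executemany(
    "INSERT INTO notes (ref, body) VALUES (?, ?)",
    [("F00001", "invoice sent"), ("C00001", "called back")],
)

# The primary key doubles as a uniqueness rule on the email address.
try:
    conn.execute("INSERT INTO users VALUES ('alice@example.com', 'Alice again')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The prefix alone tells us where each note's reference lives.
note_sources = [
    (ref, source_table(ref))
    for (ref,) in conn.execute("SELECT ref FROM notes ORDER BY note_id")
]
print(note_sources, duplicate_rejected)
```

Note that nothing in this sketch enforces that notes.ref actually exists in the table its prefix names; a single column cannot carry a foreign key to several tables at once.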
The disadvantages are as follows:
- The serial datatype exists in PostgreSQL for exactly this reason.
- Numeric indexes are entered in order, so minimal reindexing is needed (i.e. if you have a table with keys Apple and Carrot and want to insert Banana, the index has to be rearranged so that Banana lands in the middle. You will rarely insert data in the middle of an index if the index is numeric).
- Numeric indexes unlinked from the data are not going to change.
- Numeric indexes are shorter and their length is fixed (4 bytes vs. whatever you pick as your varchar length).
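The points in this list can be illustrated with a small sketch. It uses Python's built-in sqlite3 module, where an INTEGER PRIMARY KEY column plays roughly the role that serial plays in PostgreSQL; that substitution, the produce table, and the rename to Plantain are assumptions for illustration only.

```python
import sqlite3
import struct

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE produce (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
)
# Rows arrive in an order that is not alphabetical.
for name in ("Apple", "Carrot", "Banana"):
    conn.execute("INSERT INTO produce (name) VALUES (?)", (name,))

# Numeric keys are handed out in arrival order, so Banana is appended as id 3.
by_id = conn.execute("SELECT id, name FROM produce ORDER BY id").fetchall()

# In name order, Banana sits between Apple and Carrot, which is where a
# text-keyed index would have had to place it.
by_name = [n for (n,) in conn.execute("SELECT name FROM produce ORDER BY name")]

# Renaming the data leaves the numeric key untouched.
conn.execute("UPDATE produce SET name = 'Plantain' WHERE name = 'Banana'")
id_after_rename = conn.execute(
    "SELECT id FROM produce WHERE name = 'Plantain'"
).fetchone()[0]

# A 32-bit integer always occupies 4 bytes; a text key grows with its content.
int_key_bytes = len(struct.pack("<i", id_after_rename))
text_key_bytes = len("Plantain".encode("utf-8"))

print(by_id, by_name, id_after_rename, int_key_bytes, text_key_bytes)
```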
In your case you can still put a foreign key on a numeric index, so I'm not sure why you would want to force it to be a varchar type. Searching and filtering on a numeric field is theoretically faster than on a text field, as comparing strings takes more work than comparing fixed-width integers. Generally speaking, you would have a numeric primary key that is non-clustered, and then create a clustered key on the data column that you are going to filter on a lot.
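As a rough sketch of that layout: the child table references a numeric key, while the text column stays unique so it can still be used for lookups. The example uses Python's built-in sqlite3 module in place of PostgreSQL, and the users and orders tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked to
conn.executescript("""
    CREATE TABLE users (
        id    INTEGER PRIMARY KEY,      -- numeric surrogate key
        email TEXT NOT NULL UNIQUE      -- the text value stays unique and searchable
    );
    CREATE TABLE orders (
        id      INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users(id),
        item    TEXT NOT NULL
    );
""")
user_id = conn.execute(
    "INSERT INTO users (email) VALUES (?)", ("alice@example.com",)
).lastrowid
conn.execute("INSERT INTO orders (user_id, item) VALUES (?, ?)", (user_id, "book"))

# Changing the email touches one row in users; orders is not rewritten.
conn.execute(
    "UPDATE users SET email = ? WHERE id = ?", ("alice@example.org", user_id)
)
joined = conn.execute(
    "SELECT u.email, o.item FROM orders o JOIN users u ON u.id = o.user_id"
).fetchall()

# The foreign key still rejects orders that point at a user that does not exist.
try:
    conn.execute("INSERT INTO orders (user_id, item) VALUES (?, ?)", (999, "pen"))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(joined, orphan_rejected)
```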
Those are general standards when writing SQL, but when it comes to benchmarking, you will find that varchar columns are only a little slower at joining and filtering than integer columns. As long as your primary keys are not changing EVER, you're all right.
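To see why a changing text key is the painful case, here is a hedged sketch of a foreign key declared directly on a text column. It again uses Python's built-in sqlite3 module rather than PostgreSQL, with hypothetical users and orders tables. ON UPDATE CASCADE is standard SQL and is also accepted by PostgreSQL; the assumption here is that you would choose to declare it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when asked to
conn.executescript("""
    CREATE TABLE users (email TEXT PRIMARY KEY);
    CREATE TABLE orders (
        id         INTEGER PRIMARY KEY,
        -- The text key is copied into every child row.
        user_email TEXT NOT NULL REFERENCES users(email) ON UPDATE CASCADE,
        item       TEXT NOT NULL
    );
""")
conn.execute("INSERT INTO users VALUES ('alice@example.com')")
conn.executemany(
    "INSERT INTO orders (user_email, item) VALUES (?, ?)",
    [("alice@example.com", "book"), ("alice@example.com", "pen")],
)

# A single key change in the parent is propagated to every referencing row.
conn.execute(
    "UPDATE users SET email = 'alice@example.org' "
    "WHERE email = 'alice@example.com'"
)
child_keys = [
    row[0] for row in conn.execute("SELECT user_email FROM orders ORDER BY id")
]
print(child_keys)
```

With the cascade in place the change goes through, but every child row is rewritten; without it, the update would be refused while child rows still reference the old value.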