SQL 使用电子邮件地址作为主键?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/3804108/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Use email address as primary key?
提问by robert
Is email address a bad candidate for primary when compared to auto incrementing numbers?
与自动递增数字相比,电子邮件地址是主要的不良候选者吗?
Our web application needs the email address to be unique in the system. So, I thought of using email address as primary key. However my colleague suggests that string comparison will be slower than integer comparison.
我们的 Web 应用程序需要电子邮件地址在系统中是唯一的。所以,我想到使用电子邮件地址作为主键。但是我的同事建议字符串比较会比整数比较慢。
Is it a valid reason to not use email as primary key?
不使用电子邮件作为主键是一个正当理由吗?
We are using PostgreSQL
.
我们正在使用PostgreSQL
.
回答by Sjoerd
String comparison is slower than int comparison. However, this does not matter if you simply retrieve a user from the database using the e-mail address. It does matter if you have complex queries with multiple joins.
字符串比较比 int 比较慢。但是,如果您只是使用电子邮件地址从数据库中检索用户,这并不重要。如果您有多个连接的复杂查询,这很重要。
If you store information about users in multiple tables, the foreign keys to the users table will be the e-mail address. That means that you store the e-mail address multiple times.
如果您在多个表中存储有关用户的信息,则用户表的外键将是电子邮件地址。这意味着您多次存储电子邮件地址。
回答by HLGEM
I will also point out that email is a bad choice to make a unique field, there are people and even small businesses that share an email address. And like phone numbers, emails can get re-used.[email protected] can easily belong to John Smith one year and Julia Smith two years later.
我还要指出,电子邮件对于创建一个独特的领域来说是一个糟糕的选择,有些人甚至小型企业会共享一个电子邮件地址。与电话号码一样,电子邮件也可以重复使用。[email protected] 很容易在一年后属于 John Smith,两年后属于 Julia Smith。
Another problem with emails is that they change frequently. If you are joining to other tables with that as the key, then you will have to update the other tables as well which can be quite a performance hit when an entire client company changes their emails (which I have seen happen.)
电子邮件的另一个问题是它们经常更改。如果您以它为键加入其他表,那么您还必须更新其他表,当整个客户公司更改他们的电子邮件(我已经看到这种情况发生)时,这可能会对性能造成很大影响。
回答by Steven A. Lowe
the primary key should be uniqueand constant
主键应该是唯一且恒定的
email addresses change like the seasons. Useful as a secondary key for lookup, but a poor choice for the primary key.
电子邮件地址会随着季节而变化。作为查找的辅助键很有用,但对于主键来说是一个糟糕的选择。
回答by Jay
Disadvantages of using an email address as a primary key:
使用电子邮件地址作为主键的缺点:
Slower when doing joins.
Any other record with a posted foreign key now has a larger value, taking up more disk space. (Given the cost of disk space today, this is probably a trivial issue, except to the extent that the record now takes longer to read. See #1.)
An email address could change, which forces all records using this as a foreign key to be updated. As email address don't change all that often, the performance problem is probably minor. The bigger problem is that you have to make sure to provide for it. If you have to write the code, this is more work and introduces the possibility of bugs. If your database engine supports "on update cascade", it's a minor issue.
进行连接时速度较慢。
任何其他带有已发布外键的记录现在具有更大的值,占用更多的磁盘空间。(考虑到今天的磁盘空间成本,这可能是一个微不足道的问题,除了现在读取记录需要更长的时间。参见 #1。)
电子邮件地址可能会更改,这会强制更新所有使用此地址作为外键的记录。由于电子邮件地址不会经常更改,因此性能问题可能很小。更大的问题是你必须确保提供它。如果您必须编写代码,这是更多的工作并引入了错误的可能性。如果您的数据库引擎支持“在更新级联”,这是一个小问题。
Advantages of using email address as a primary key:
使用电子邮件地址作为主键的优点:
You may be able to completely eliminate some joins. If all you need from the "master record" is the email address, then with an abstract integer key you would have to do a join to retrieve it. If the key is the email address, then you already have it and the join is unnecessary. Whether this helps you any depends on how often this situation comes up.
When you are doing ad hoc queries, it's easy for a human being to see what master record is being referenced. This can be a big help when trying to track down data problems.
You almost certainly will need an index on the email address anyway, so making it the primary key eliminates one index, thus improving the performance of inserts as they now have only one index to update instead of two.
您也许能够完全消除某些连接。如果您只需要“主记录”中的电子邮件地址,那么使用抽象整数键,您就必须进行连接以检索它。如果密钥是电子邮件地址,那么您已经拥有它并且不需要加入。这是否对您有帮助取决于这种情况出现的频率。
当您进行即席查询时,人们很容易看到正在引用的主记录。在尝试追踪数据问题时,这会很有帮助。
无论如何,您几乎肯定需要在电子邮件地址上建立索引,因此将其设为主键会消除一个索引,从而提高插入的性能,因为它们现在只有一个索引要更新,而不是两个。
In my humble opinion, it's not a slam-dunk either way. I tend to prefer to use natural keys when a practical one is available because they're just easier to work with, and the disadvantages tend to not really matter much in most cases.
在我看来,无论哪种方式,这都不是灌篮。当有实用的键时,我倾向于使用自然键,因为它们更容易使用,而且在大多数情况下,缺点往往并不重要。
回答by meriton
It is pretty bad. Assume some e-mail provider goes out of business. Users will then want to change their e-mail. If you have used e-mail as primary key, all foreign keys for users will duplicate that e-mail, making it pretty damn hard to change ...
这很糟糕。假设某个电子邮件提供商倒闭了。用户随后会想要更改他们的电子邮件。如果您使用电子邮件作为主键,则用户的所有外键都将复制该电子邮件,这使得更改非常困难......
... and I haven't even started talking about performance considerations.
...我什至还没有开始谈论性能方面的考虑。
回答by xlttj
I don't know if that might be an issue in your setup, but depending on your RDBMS the values of a columns might be case sensitive. PostgreSQL docs say: ?If you declare a column as UNIQUE or PRIMARY KEY, the implicitly generated index is case-sensitive“. In other words, if you accept user input for a search in a table with email as primary key, and the user provides "[email protected]", you won't find “[email protected]".
我不知道这是否可能是您设置中的问题,但根据您的 RDBMS,列的值可能区分大小写。PostgreSQL 文档说: ?如果您将列声明为 UNIQUE 或 PRIMARY KEY,则隐式生成的索引区分大小写“。换句话说,如果您接受用户输入以在以电子邮件为主键的表中进行搜索,并且用户提供“[email protected]”,则您将找不到“[email protected]”。
回答by Simen Echholt
No one seems to have mentioned a possible problem that email addresses could be considered private. If the email address is the primary key, a profile page URL most likely will look something like ..../Users/[email protected]
. What if you don't want to expose the user's email address? You'd have to find some other way of identifying the user, possibly by a unique integer value to make URLs like ..../Users/1
. Then you'd end up with a unique integer value after all.
似乎没有人提到电子邮件地址可能被视为私人的可能问题。如果电子邮件地址是主键,则个人资料页面 URL 很可能类似于..../Users/[email protected]
. 如果您不想公开用户的电子邮件地址怎么办?您必须找到一些其他方式来识别用户,可能通过唯一的整数值来制作像..../Users/1
. 那么你最终会得到一个唯一的整数值。
回答by Rafa
At the logicallevel, the email is the natural key. At the physicallevel, given you are using a relational database, the natural key doesn't fit well as the primary key. The reason is mainly the performance issues mentioned by others.
在逻辑层面,电子邮件是自然键。在物理级别,假设您使用的是关系数据库,自然键不适合作为主键。原因主要是别人提到的性能问题。
For that reason, the design can be adapted. The natural key becomes the alternate key(UNIQUE, NOT NULL), and you use a surrogate/artificial/technical keyas the primary key, which can be an auto-increment in your case.
因此,可以调整设计。自然键成为备用键(UNIQUE,NOT NULL),您使用代理/人工/技术键作为主键,在您的情况下可以是自动增量。
systempuntoout asked,
systempuntoout 问,
What if someone wants to change his email address? Are you going to change all the foreign keys too?
如果有人想更改他的电子邮件地址怎么办?你也要改变所有的外键吗?
That's what cascadingis for.
这就是级联的用途。
Another reason to use a numeric surrogate key as the primary key is related to how the indexing works in your platform. In MySQL's InnoDB, for example, all indexes in a table have the primary key pre-pended to them, so you want the PK to be as small as possible (for speed's and size's sakes). Also related to this, InnoDB is faster when the primary key is stored in sequence, and a string would not help there.
使用数字代理键作为主键的另一个原因与索引在您的平台中的工作方式有关。例如,在 MySQL 的 InnoDB 中,表中的所有索引都预先设置了主键,因此您希望 PK 尽可能小(为了速度和大小)。同样与此相关的是,当主键按顺序存储时,InnoDB 速度更快,而字符串在那里无济于事。
Another thing to take into consideration when using a string as an alternate key, is that using a hash of the actual string that you want might be faster, skipping things like upper and lower cases of some letters. (I actually landed here while looking for a reference to confirm what I just said; still looking...)
使用字符串作为备用键时要考虑的另一件事是,使用您想要的实际字符串的散列可能会更快,跳过某些字母的大小写等内容。(我实际上是在寻找参考资料以确认我刚才所说的话时降落在这里的;仍在寻找......)
回答by Bryan Legend
Yes, it is a bad primary key because your users will want to update their email addresses.
是的,这是一个糟糕的主键,因为您的用户会想要更新他们的电子邮件地址。
回答by ibram
yes, it is better if you use an integer instead. you can also set your email column as unique constraint.
是的,最好使用整数代替。您还可以将您的电子邮件列设置为唯一约束。
like this:
像这样:
CREATE TABLE myTable(
id integer primary key,
email text UNIQUE
);