Original source: http://stackoverflow.com/questions/3906811/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
NULL permitted in Primary Key - why and in which DBMS?
Asked by Gennady Vanin Геннадий Ванин
Further to my question "Why to use ′not null primary key′ in TSQL?"...
As I understood from other discussions, some RDBMS (for example SQLite, MySQL) permit "unique" NULL in the primary key.
Why is this allowed and how might it be useful?
Background: I believe it is beneficial for communication with colleagues and database professionals to know the differences in fundamental concepts, approaches and their implementations in different DBMS.
Notes
- MySQL is rehabilitated and returned to the "NOT NULL PK" list.
- SQLite has been added (thanks to Paul Hadfield) to "NULL PK" list:
For the purposes of determining the uniqueness of primary key values, NULL values are considered distinct from all other values, including other NULLs.
If an INSERT or UPDATE statement attempts to modify the table content so that two or more rows feature identical primary key values, it is a constraint violation. According to the SQL standard, PRIMARY KEY should always imply NOT NULL. Unfortunately, due to a long-standing coding oversight, this is not the case in SQLite.
Unless the column is an INTEGER PRIMARY KEY SQLite allows NULL values in a PRIMARY KEY column. We could change SQLite to conform to the standard (and we might do so in the future), but by the time the oversight was discovered, SQLite was in such wide use that we feared breaking legacy code if we fixed the problem.
So for now we have chosen to continue allowing NULLs in PRIMARY KEY columns. Developers should be aware, however, that we may change SQLite to conform to the SQL standard in future and should design new programs accordingly.
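The behaviour quoted above can be checked with a minimal sketch using Python's built-in sqlite3 module. The table names t_text and t_int are made up for illustration, and the result assumes ordinary rowid tables (not WITHOUT ROWID or STRICT tables).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables: one with a TEXT primary key, one with INTEGER PRIMARY KEY.
conn.execute("CREATE TABLE t_text (k TEXT PRIMARY KEY, v TEXT)")
conn.execute("CREATE TABLE t_int (k INTEGER PRIMARY KEY, v TEXT)")

# Two NULL keys are both accepted, because NULLs count as distinct values.
conn.execute("INSERT INTO t_text VALUES (NULL, 'first')")
conn.execute("INSERT INTO t_text VALUES (NULL, 'second')")
null_keys = conn.execute(
    "SELECT COUNT(*) FROM t_text WHERE k IS NULL"
).fetchone()[0]

# With INTEGER PRIMARY KEY, a NULL is replaced by an automatically chosen integer.
conn.execute("INSERT INTO t_int VALUES (NULL, 'first')")
int_key = conn.execute("SELECT k FROM t_int").fetchone()[0]

print(null_keys, int_key)
```

The TEXT-keyed table ends up holding two rows whose key is NULL, while the INTEGER PRIMARY KEY column never stores NULL at all.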
Answered by Erwin Smout
Suppose you have a primary key containing a nullable column Kn.
If you want to have a second row rejected on the ground that in that second row, Kn is null and the table already contains a row with Kn null, then you are actually requiring that the system would treat the comparison "row1.Kn = row2.Kn" as giving TRUE (because you somehow want the system to detect that the key values in those rows are indeed equal). However, this comparison boils down to the comparison "null = null", and the standard already explicitly specifies that null doesn't compare equal to anything, including itself.
Allowing what you want would thus amount to SQL deviating from its own principles regarding the treatment of null. There are innumerable inconsistencies in SQL, but this particular one never got past the committee.
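The "null = null" comparison described in this answer can be observed directly. The following is a small sketch using Python's sqlite3 module; SQLite stands in here for any SQL engine, and the variable names are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# NULL = NULL evaluates to NULL (unknown), not TRUE; IS NULL is the test that succeeds.
eq, is_null, one_eq = conn.execute(
    "SELECT NULL = NULL, NULL IS NULL, 1 = 1"
).fetchone()

print(eq, is_null, one_eq)
```

Python receives the SQL NULL as None, so a uniqueness check built on "=" has no way to conclude that two null keys are equal.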
Answered by Philip Kelley
As far as relational database theory is concerned:
- The primary key of a table is used to uniquely identify each and every row in the table
- A NULL value in a column indicates that you don't know what the value is
- Therefore, you should never use the value of "I don't know" to uniquely identify a row in a table.
Depending upon the data you are modelling, a "made up" value can be used instead of NULL. I've used 0, "N/A", 'Jan 1, 1980', and similar values to represent dummy "known to be missing" data.
Most, if not all, DB engines do allow for a UNIQUE constraint or index, which does allow for NULL column values, though (ideally) only one row may be assigned the value null (otherwise it wouldn't be a unique value). This can be used to support the irritatingly pragmatic (but occasionally necessary) situations that don't fit neatly into relational theory.
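How a UNIQUE constraint treats NULL differs between engines. As a rough sketch, here is what SQLite does, using Python's sqlite3 module and a hypothetical emails table: it accepts several NULLs under UNIQUE, whereas some engines (SQL Server, for example) accept only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: surrogate key plus a nullable UNIQUE column.
conn.execute("CREATE TABLE emails (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")

# SQLite treats NULLs as distinct, so both inserts succeed.
conn.execute("INSERT INTO emails (email) VALUES (NULL)")
conn.execute("INSERT INTO emails (email) VALUES (NULL)")
null_rows = conn.execute(
    "SELECT COUNT(*) FROM emails WHERE email IS NULL"
).fetchone()[0]

# A repeated non-NULL value still violates the constraint.
conn.execute("INSERT INTO emails (email) VALUES ('a@example.com')")
try:
    conn.execute("INSERT INTO emails (email) VALUES ('a@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(null_rows, duplicate_rejected)
```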
Answered by Hammerite
I don't know whether older versions of MySQL differ on this, but as of modern versions a primary key must be on columns that are not null. See the manual page on CREATE TABLE: "A PRIMARY KEY is a unique index where all key columns must be defined as NOT NULL. If they are not explicitly declared as NOT NULL, MySQL declares them so implicitly (and silently)."
Answered by Andrzej Doyle
Well, it could allow you to implement the Null Object Pattern natively within the database. So if you were using something similar in code, which interacted very closely with the DB, you could just look up the object corresponding to the key without having to special-case a null check.
Now whether this is worthwhile functionality I'm not sure, but it's really a question of whether the pros of disallowing null pkeys in absolutely all cases outweigh the cons of obstructing someone who (for better or worse) actually wants to use null keys. This would only be worth it if you could demonstrate some non-trivial improvement (such as faster key lookup) from being able to guarantee that keys are non-null. Some DB engines would show this, others might not. And if there aren't any real pros from forcing this, why artificially restrict your clients?
Answered by Periata Breatta
As discussed in other answers, NULL was intended to mean "the information that should go in this column is unknown". However, it is also frequently used to indicate an alternative meaning of "this attribute does not exist". This is a particularly useful interpretation when looking at timestamp fields that are interpreted as the time some particular event occurred, in which case NULL is often used to indicate that the event has not yet occurred.
It is a problem that SQL doesn't support this interpretation very well: for this to work properly, it really needs to have a separate value (something like "never") that doesn't behave as null does ("never" should be equal to "never" and should compare as higher than all other values). But as SQL lacks this notion, and there is no convenient way to add it, using null for this purpose is often the best choice.
This leaves the problem that when a timestamp of an event that may have not occurred should be part of the primary key of a table (a common requirement perhaps being the use of a natural key along with a deletion timestamp when using soft deletion with a requirement for the ability to recreate the item after deletion) you really want the primary key to have a nullable column. Alas, this is not allowed in most databases, and instead you have to resort to an artificial primary key (e.g. a row sequence number) and a UNIQUE constraint for what should otherwise have been your actual primary key.
An example scenario, in order to clarify this: I have a users table. As I require each user to have a distinct username, I decide to use username as the primary key. I want to support user deletion, but as I need to track the existence of users historically for auditing purposes I use soft deletion (in the first version of the schema, I add a 'deleted' flag to the user, and ensure that the deleted flag is checked in all queries where only active users are expected).
An additional requirement, however, is that if a username is deleted, it should be available for new users to register. An attractive way to achieve this would be to have the deleted flag change to a nullable timestamp (where nulls indicate that the user has not been deleted) and put this in the primary key. Were primary keys to allow nullable columns, this would have the following effect:
- Creating a new user with an existing username when that user's deleted column is null would be denied as a duplicate key entry
- Deleting a user changes its key (which requires changes to cascade to foreign keys that reference the user, which is suboptimal but if deletions are rare is acceptable) so that the deleted column is a timestamp for when the deletion occurred
- Now a new user (which would have a null deleted timestamp) can be successfully created.
However, this cannot actually be achieved with standard SQL, so instead one must use a different primary key (probably a generated numeric user id in this case) and use a UNIQUE constraint to enforce the uniqueness of (username, deleted).
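The workaround can be sketched with Python's sqlite3 module, under stated assumptions. The schema, the helper functions register and soft_delete, and the index name are hypothetical. One caveat applies to engines such as SQLite that treat NULLs as distinct under UNIQUE: the (username, deleted) constraint alone does not stop two active rows with the same username. This sketch therefore also adds a partial unique index, which is a swapped-in technique and not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (
        id       INTEGER PRIMARY KEY,   -- generated surrogate key
        username TEXT NOT NULL,
        deleted  TEXT,                  -- NULL means "not deleted"
        UNIQUE (username, deleted)
    );
    -- Assumption: needed because NULLs are distinct under UNIQUE in SQLite.
    CREATE UNIQUE INDEX users_active_username
        ON users (username) WHERE deleted IS NULL;
""")

def register(username):
    """Create an active (not deleted) user."""
    conn.execute("INSERT INTO users (username) VALUES (?)", (username,))

def soft_delete(username, when):
    """Mark the active user with this name as deleted at the given time."""
    conn.execute(
        "UPDATE users SET deleted = ? WHERE username = ? AND deleted IS NULL",
        (when, username),
    )

register("alice")
try:
    register("alice")             # a second active "alice" is refused
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

soft_delete("alice", "2020-01-01T00:00:00")
register("alice")                 # the name is free again after soft deletion

rows = conn.execute("SELECT username, deleted FROM users ORDER BY id").fetchall()
print(duplicate_rejected, rows)
```

On an engine that allows only one NULL per UNIQUE key, the plain constraint may be enough and the partial index would be unnecessary.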
Answered by Apollo
Having a null primary key can be beneficial in some scenarios. In one of my projects I used this feature during synchronisation of databases: one on the server and many on different users' devices. Considering the fact that not all users have access to the Internet all the time, I decided that only the main database would be able to give ids to my entities. SQLite has its own mechanism for numbering rows. Had I used an additional id field I would have used more bandwidth. Having null as the id not only notifies me that an entity has been created on a client's device while it had no access to the Internet, but also decreases code complexity. The only drawback is that on a client's device I can't get an entity by its id unless it was previously synchronised with the main database. However, that's not an issue since my user cares about entities for their parameters, not their unique id.
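A rough sketch of how such a scheme might look in SQLite follows, using Python's sqlite3 module. The answer does not show its schema, so the notes table, its columns and the "srv-" ids are all invented for illustration. The key column is declared TEXT because an INTEGER PRIMARY KEY would be filled in automatically and never store NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical client-side table: id is assigned by the server, so it starts as NULL.
conn.execute("CREATE TABLE notes (id TEXT PRIMARY KEY, body TEXT NOT NULL)")

# Two entities created while offline: no server id yet.
conn.execute("INSERT INTO notes (id, body) VALUES (NULL, 'draft one')")
conn.execute("INSERT INTO notes (id, body) VALUES (NULL, 'draft two')")

# A NULL id marks rows awaiting synchronisation; SQLite's rowid addresses them locally.
pending = conn.execute(
    "SELECT rowid, body FROM notes WHERE id IS NULL ORDER BY rowid"
).fetchall()

# Hypothetical sync step: pretend the server handed back these ids.
for (local_rowid, _body), server_id in zip(pending, ["srv-101", "srv-102"]):
    conn.execute("UPDATE notes SET id = ? WHERE rowid = ?", (server_id, local_rowid))

still_pending = conn.execute(
    "SELECT COUNT(*) FROM notes WHERE id IS NULL"
).fetchone()[0]
synced = conn.execute("SELECT id, body FROM notes ORDER BY rowid").fetchall()
print(len(pending), still_pending, synced)
```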