Original question: http://stackoverflow.com/questions/2116719/
Note: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors on StackOverFlow.
SQL: Using NULL values vs. default values
Asked by Registered User
What are the pros and cons of using NULL values in SQL as opposed to default values?
PS. Many similar questions have been asked on here, but none of them answers my question.
Answer by R van Rijn
A NULL in a database is a special system marker indicating that a value is not present, as opposed to a space, a zero, or any other default value. How much storage it takes depends on the DBMS. A field containing NULL means that the content of that cell is unknown at the time of looking at it. A column that allows NULLs also allows rows to be inserted with no value at all in that column. There are several pros and cons of using NULL values as opposed to default values:
Pros
NULL is not tied to a particular data type, so it can be stored in any nullable column regardless of that column's type. Default values, on the other hand, must have their data type specified, and a default value in one column might look the same as one in another column yet be of a different type.
NULL is often used in schemas where a value is optional. It is a convenient method for omitting data entry for unknown fields without having to implement additional rules, like storing negative values in an integer field to represent omitted data.
In many DBMSs a NULL costs very little storage (often a single bit in a per-row null bitmap), so NULLs may be useful when optimising the database. This can be more efficient than storing a default value, e.g. 8 bits for a character or 16 bits for a small integer, although the actual saving depends on the storage engine.
While your system requirements may change over time, and the default value types with them, a NULL is always NULL, so there is no need to update the type of data.
Declaring a column NOT NULL can also help with table validation, in the sense that such a column will require a value to be inserted. Default values alone do not provide this guarantee.
Cons
NULL values are easily confused with empty character strings, which return a blank value to the user when selected. In this sense, default values are less confusing and are the safer option, unless the default value is set to the empty string.
If NULL values are allowed in the database, they may cost the designer extra time and work, because they can make the database logic more complicated, especially when there are a lot of comparisons against NULL.
Source: Pro and cons
Answer by Crozin
I don't know why you're even trying to compare these two cases. null means that a column is empty/has no value, while a default value gives a column some value when we don't set it directly in the query.
Maybe an example will explain it better. Let's say we have a member table. Each member has an ID and a username. Optionally, a member might have an e-mail address (but doesn't have to). Each member also has a postCount column (which is incremented every time the user writes a post). So the e-mail column can hold a null value (because e-mail is optional), while the postCount column is NOT NULL but has a default value of 0 (because a newly created member doesn't have any posts).
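The member table described here can be sketched as follows. This is only an illustration using Python's built-in sqlite3 module; the exact column names and types are assumptions, since the answer does not give a schema.

```python
import sqlite3

# Hypothetical schema for the "member" table described above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE member (
        id        INTEGER PRIMARY KEY,
        username  TEXT NOT NULL,
        email     TEXT,                        -- optional, so NULL is allowed
        postCount INTEGER NOT NULL DEFAULT 0   -- always known, starts at zero
    )
    """
)

# Only the username is supplied; the other columns are left out on purpose.
conn.execute("INSERT INTO member (username) VALUES (?)", ("alice",))

new_member = conn.execute(
    "SELECT username, email, postCount FROM member"
).fetchone()
print(new_member)  # email comes back as None (NULL), postCount as 0
```

The e-mail stays NULL because nothing is known about it, while postCount gets a real, meaningful value from its default.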
Answer by Philippe Grondier
Null values are not ... values!
Null means 'has no value'. Besides the database aspect, one important property of non-valued variables or fields is that it is not possible to use '=' (or '>', '<') when comparing them.
Writing something like (VB):
if myFirstValue = mySecondValue
will not return either True or False if one or both of the variables are non-valued. You will have to use a workaround such as:
if (isnull(myFirstValue) and isNull(mySecondValue)) or myFirstValue = mySecondValue
The 'usual' code used in such circumstances is:
if Nz(myFirstValue) = Nz(mySecondValue, defaultValue)
This is not strictly correct, as non-valued variables will be considered 'equal' to the 'defaultValue' value (usually a zero-length string).
In spite of this unpleasant behaviour, never, never, never set your default values to a zero-length string (or '0's) without a valid reason, and making value comparison easier in code is not a valid reason.
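The same comparison problem shows up on the SQL side. The sketch below uses Python's sqlite3 module rather than VB, and the helper name compare is made up for illustration. It contrasts plain '=', SQLite's NULL-safe IS operator, and an Nz-style substitution done with COALESCE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def compare(sql, a, b):
    """Evaluate a two-parameter SQL expression and return its raw result."""
    return conn.execute(f"SELECT {sql}", (a, b)).fetchone()[0]

# Plain '=' yields NULL (None in Python) as soon as one side is NULL.
plain = compare("? = ?", None, None)

# SQLite's IS operator acts as a NULL-safe equality test.
null_safe = compare("? IS ?", None, None)

# Nz-style substitution: NULL is silently treated as equal to the default ''.
nz_style = compare("COALESCE(?, '') = COALESCE(?, '')", None, "")

print(plain, null_safe, nz_style)
```

The last result is the trap described above: a missing value and an empty string compare as equal once a default is substituted. Other DBMSs spell the NULL-safe test differently, so treat the IS form as specific to this sketch.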
Answer by paxdiablo
NULL values are meant to indicate that the attribute is either not applicable or unknown. There are religious wars fought over whether they're a good thing or a bad thing but I fall in the "good thing" camp.
They are often necessary to distinguish known values from unknown values, and they make a sentinel value unnecessary for those attributes that don't have a suitable default value.
For example, whilst the default value for a bank balance may be zero, what is the default value for a mobile phone number? You may need to distinguish between "customer has no mobile phone" and "customer's mobile number is not (yet) known", in which case a blank column won't do (and having an extra column to decide whether that column is one or the other is not a good idea).
Default values are simply what the DBMS will put in a column if you don't explicitly specify it.
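A small sketch of that last point, using Python's sqlite3 module: the default is applied only when the column is left out of the INSERT, not when NULL is written explicitly. The customer table and its columns are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: balance has a sensible default, mobile does not.
conn.execute(
    "CREATE TABLE customer (name TEXT NOT NULL, balance NUMERIC DEFAULT 0, mobile TEXT)"
)

# Column omitted: the DBMS fills in the declared default.
conn.execute("INSERT INTO customer (name) VALUES ('omitted')")
# Column given explicitly as NULL: the default is NOT applied.
conn.execute("INSERT INTO customer (name, balance) VALUES ('explicit', NULL)")

rows = conn.execute(
    "SELECT name, balance, mobile FROM customer ORDER BY rowid"
).fetchall()
print(rows)
```

The mobile column has no declared default, so it is NULL in both rows.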
Answer by Anton Gogolev
To me, they are somewhat orthogonal.
Default values allow you to gracefully evolve your database schema (think adding columns) without having to modify client code. Plus, they save some typing, but relying on default values for this is IMO bad.
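As a rough illustration of schema evolution with defaults, the sketch below adds a column to a table while the "client" code stays untouched. It uses Python's sqlite3 module; the article table, the status column, and the add_article helper are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")

def add_article(title):
    """'Old' client code: it only knows about the title column."""
    conn.execute("INSERT INTO article (title) VALUES (?)", (title,))

add_article("written before the schema change")

# Schema evolves: a new NOT NULL column is added with a default value.
conn.execute("ALTER TABLE article ADD COLUMN status TEXT NOT NULL DEFAULT 'draft'")

# The unchanged client code keeps working after the change.
add_article("written after the schema change")

statuses = [row[0] for row in conn.execute("SELECT status FROM article ORDER BY id")]
print(statuses)
```

Both the pre-existing row and the new one report the default, without any change to add_article.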
Nulls are just that: nulls. Missing values, and a huge PITA when dealing with three-valued logic.
Answer by Bob Jarvis - Reinstate Monica
As with many things, there are good and bad points to each.
Good points about default values: they give you the ability to set a column to a known value if no other value is given. For example, when creating BOOLEAN columns I commonly give the column a default value (TRUE or FALSE, whatever is appropriate) and make the column NOT NULL. In this way I can be confident that the column will have a value, and that it'll be set appropriately.
Bad points about default values: not everything has a default value.
Good things about NULLs: not everything has a known value at all times. For example, when creating a new row representing a person I may not have values for all the columns - let's say I know their name but not their birth date. It's not appropriate to put in a default value for the birth date - people don't like getting birthday cards on January 1st (if that's the default) if their birthday is actually July 22nd.
Bad things about NULLs: NULLs require careful handling. In most databases built on the relational model as commonly implemented, NULLs are poison - the presence of a NULL in a calculation causes the result of the calculation to be NULL. NULLs used in comparisons can also cause unexpected results, because any comparison with NULL returns UNKNOWN (which is neither TRUE nor FALSE). For example, consider the following PL/SQL script:
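declare
  nValue NUMBER;  -- declared but never assigned
begin
  -- First test: is nValue greater than zero?
  IF nValue > 0 THEN
    dbms_output.put_line('nValue > 0');
  ELSE
    dbms_output.put_line('nValue <= 0');
  END IF;

  -- Second test: the logical opposite of the first one.
  IF nValue <= 0 THEN
    dbms_output.put_line('nValue <= 0');
  ELSE
    dbms_output.put_line('nValue > 0');
  END IF;
end;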
The output of the above is:
nValue <= 0
nValue > 0
This may be a little surprising. You have a NUMBER (nValue) which is both less than or equal to zero and greater than zero, at least according to this code. The reason this happens is that nValue is actually NULL, and all comparisons with NULL result in UNKNOWN instead of TRUE or FALSE. This can result in subtle bugs which are hard to figure out.
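The same behaviour can be reproduced outside Oracle. The following sketch uses Python's sqlite3 module and a CASE expression as a stand-in for the PL/SQL IF statement; the helper name branch is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def branch(condition_sql, n_value):
    """Mimic IF <condition> THEN 'then' ELSE 'else' for a given nValue."""
    sql = f"SELECT CASE WHEN {condition_sql} THEN 'then' ELSE 'else' END"
    return conn.execute(sql, (n_value,)).fetchone()[0]

# With NULL, both a test and its logical opposite fall through to ELSE.
print(branch("? > 0", None), branch("? <= 0", None))

# The raw comparison result is NULL (unknown), and NULL propagates through arithmetic.
unknown = conn.execute("SELECT NULL > 0, NULL + 1").fetchone()
print(unknown)

# Testing for NULL explicitly restores predictable branching.
print(branch("? IS NULL", None))
```

An explicit IS NULL check is the usual way to keep a NULL from silently falling into the ELSE branch.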
Share and enjoy.
Answer by Walter Mitty
It depends on the situation, but it's really ultimately simple. Which one is closer to the truth?
A lot of people deal with data as though it's just data, and truth doesn't matter. However, whenever you talk to the stakeholders in the data, you find that truth always matters. Sometimes more, sometimes less, but it always matters.
A default value is useful when you may presume that if the user (or other data source) had provided a value, the value would have been the default. If this presumption does more harm than good, then NULL is better, even though dealing with NULL is a pain in SQL.
Note that there are three different ways default values can be implemented. First, in the application, before inserting new data. The database never sees the difference between a default value provided by the user or one provided by the app!
Second, by declaring a default value for the column, and leaving the data missing in an insert.
Third, by substituting the default value at retrieval time, whenever a NULL is detected. Only a few DBMS products permit this third mode to be declared in the database.
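The three approaches can be put side by side in a short sketch. It uses Python's sqlite3 module; the orders table, the 'EUR' default, and the helper insert_with_app_default are assumptions made purely for illustration. The third approach is shown here with COALESCE in the query, which is a per-query substitution rather than a declaration in the database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; 'EUR' is an arbitrary illustrative default.
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, currency TEXT DEFAULT 'EUR', note TEXT)"
)

def insert_with_app_default(currency=None):
    """Way 1: the application substitutes the default before inserting."""
    conn.execute("INSERT INTO orders (currency) VALUES (?)", (currency or "EUR",))

insert_with_app_default()
# Way 2: the column default applies because the column is left out of the INSERT.
conn.execute("INSERT INTO orders (note) VALUES ('no currency given')")
# Way 3: NULL is stored and a default is substituted only at retrieval time.
conn.execute("INSERT INTO orders (currency) VALUES (NULL)")

stored = [r[0] for r in conn.execute("SELECT currency FROM orders ORDER BY id")]
shown = [
    r[0]
    for r in conn.execute("SELECT COALESCE(currency, 'EUR') FROM orders ORDER BY id")
]
print(stored, shown)
```

Only the third row still records that the value was never supplied; the first two are indistinguishable from a user who actually chose the default.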
In an ideal world, data is never missing. If you are developing for the real world, required data will eventually be missing. Your applications can either do something that makes sense or something that doesn't make sense when that happens.
Answer by HLGEM
Nulls and default values are different things used for different purposes. If you are trying to avoid using nulls by giving everything a default value, that is a poor practice, as I will explain.
Null means we do not know what the value is or will be. For instance, suppose you have an enddate field. You don't know when the process being recorded will end, so null is the only appropriate value; using a default value of some fake date way out in the future will cause as much trouble to program around as handling the nulls, and in my experience is more likely to create a problem with incorrect results being returned.
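One way such incorrect results can arise is sketched below with Python's sqlite3 module. The process table, the sample dates, and the '9999-12-31' sentinel are all made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE process (name TEXT, enddate TEXT)")  # ISO dates as text
conn.executemany(
    "INSERT INTO process VALUES (?, ?)",
    [("finished", "2020-03-01"), ("still running", None)],
)

# With NULL, aggregates skip the unknown end date and open rows are easy to find.
latest_end = conn.execute("SELECT MAX(enddate) FROM process").fetchone()[0]
still_open = conn.execute(
    "SELECT COUNT(*) FROM process WHERE enddate IS NULL"
).fetchone()[0]

# With a fake far-future date, the same aggregate quietly reports the sentinel.
conn.execute("UPDATE process SET enddate = '9999-12-31' WHERE enddate IS NULL")
latest_end_fake = conn.execute("SELECT MAX(enddate) FROM process").fetchone()[0]

print(latest_end, still_open, latest_end_fake)
```

Every query touching enddate would need to know about the sentinel to avoid this, which is the programming overhead the answer warns about.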
Now there are times when we might know what the value should be even if the person inserting the record does not. For instance, if you have a date inserted field, it is appropriate to have a default value of the current date and not expect the user to fill this in. You are likely to actually have better information that way for this field.
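A minimal sketch of such a field, again with Python's sqlite3 module. The record table and the date_inserted column name are assumptions; SQLite's CURRENT_DATE keyword supplies the date in UTC.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; date_inserted is filled in by the database, not the user.
conn.execute(
    "CREATE TABLE record (payload TEXT, date_inserted TEXT NOT NULL DEFAULT CURRENT_DATE)"
)
conn.execute("INSERT INTO record (payload) VALUES ('hello')")

date_inserted = conn.execute("SELECT date_inserted FROM record").fetchone()[0]
print(date_inserted)  # today's date (UTC) in YYYY-MM-DD form
```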
Sometimes it's a judgement call and depends on the business rules you have to apply. Suppose you have a speaker honoraria field (the amount a speaker would get paid). A default value of 0 could be dangerous, as it might mean that speakers are hired and we intend to pay them nothing. It is also possible that there may occasionally be speakers who are donating their time for a particular project (or who are employees of the company and thus not paid extra to speak), where zero is a correct value, so you can't use zero as the value that signals you don't know how much this speaker is to be paid. In this case null is the only appropriate value, and the code should trigger an issue if someone tries to add the speaker to a conference. In a different situation, you may already know that the minimum any speaker will be paid is 3000 and that only speakers who have negotiated a different rate will have data entered in the honoraria field. In this case, it is appropriate to put in a default value of 3000. In other cases, different clients may have different minimums, so the default should be handled differently (usually through a lookup table that automatically populates the minimum honoraria value for that client on the data entry form).
So I feel the best rule is to leave the value as null if you truly cannot know, at the time the data is entered, what the value of the field should be. Use a default value only if it has meaning all the time for that particular situation, and use some other technique to fill in the value if it could be different under different circumstances.
Answer by adolf garlic
In a Data Warehouse, you would always want to have default values rather than NULLs.
Instead, you would have values such as "unknown", "not ready", or "missing".
This allows INNER JOINs to be performed efficiently on the Fact and Dimension tables as 'everything always has a value'
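To make the join argument concrete, here is a small sketch using Python's sqlite3 module. The fact_sales and dim_region tables, the -1 key for the 'unknown' row, and the amounts are all invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_region (region_key INTEGER PRIMARY KEY, region_name TEXT NOT NULL);
    CREATE TABLE fact_sales (amount INTEGER NOT NULL, region_key INTEGER);

    INSERT INTO dim_region VALUES (-1, 'unknown'), (1, 'north');
    -- One sale has a NULL key, one points at the 'unknown' placeholder row.
    INSERT INTO fact_sales VALUES (100, 1), (50, NULL), (70, -1);
    """
)

joined = conn.execute(
    """
    SELECT d.region_name, SUM(f.amount)
    FROM fact_sales AS f
    INNER JOIN dim_region AS d ON d.region_key = f.region_key
    GROUP BY d.region_name
    ORDER BY d.region_name
    """
).fetchall()
print(joined)  # the NULL-keyed sale of 50 is missing from the join result
```

The sale keyed to the placeholder row survives the INNER JOIN and is reported under 'unknown', whereas the NULL-keyed sale silently drops out of the totals.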
Answer by Diana Bodell
I so appreciate all of this discussion. I am in the midst of building a data warehouse and am using the Kimball model rather strictly. There is one very vocal user, however, who hates surrogate keys and wants NULLs all over the place. I told him that it is OK to have NULLable columns for attributes of dimensions and for any dates or numbers that are used in calculations, because default values there imply incorrect data. There are, I agree, advantages to allowing NULL in certain columns, but cubing is a lot better and more reliable if there is a surrogate key for every foreign key to a dimension, even if that surrogate is -1 or 0 for a dummy record. SQL likes integers for joins, and if a dimension value is missing and a dummy is provided as a surrogate key, then you will get the same number of records using one dimension as you would cubing on another dimension. However, calculations have to be done correctly, and you have to account for NULL values in those. Birthday should be NULL so that age is not calculated, for example. I believe in good data governance, and making these decisions with the users forces them to think about their data in more ways than ever.