SQL 在数据库中存储性(性别)
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/4175878/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Storing sex (gender) in database
提问by Marko
I want to store a user's gender in a database with as little (size/performance) cost as possible.
我想以尽可能少的(大小/性能)成本将用户的性别存储在数据库中。
So far, 3 scenarios come to mind
到目前为止,想到了 3 个场景
- Int- aligned with Enum in code (1 = Male, 2 = Female, 3 = ...)
- char(1)- Store m, for another single character identifier
- Bit(boolean)- is there an appropriate field name for this option?
- Int-与代码中的 Enum 对齐(1 = 男性,2 = 女性,3 = ...)
- char(1)-存储m、f或另一个单字符标识符
- 位(布尔值)-此选项是否有合适的字段名称?
The reason I ask is because of this answerwhich mentions that charsare smallerthan booleans.
我想问的原因是因为这个的答案,其提到字符是小比布尔值。
I should clarify that I'm using MS SQL 2008, which DOESin fact have the bit datatype.
我要澄清,我使用MS SQL 2008,它DOES其实有位数据类型。
采纳答案by OMG Ponies
I'd call the column "gender".
我将列称为“性别”。
Data Type Bytes Taken Number/Range of Values
------------------------------------------------
TinyINT 1 255 (zero to 255)
INT 4 - 2,147,483,648 to 2,147,483,647
BIT 1 (2 if 9+ columns) 2 (0 and 1)
CHAR(1) 1 26 if case insensitive, 52 otherwise
The BITdata type can be ruled out because it only supports two possible genders which is inadequate. While INTsupports more than two options, it takes 4 bytes -- performance will be better with a smaller/more narrow data type.
该BIT数据类型可以排除,因为它仅支持两种可能的性别是不够的。虽然INT支持两个以上的选项,但它需要 4 个字节——使用更小/更窄的数据类型性能会更好。
CHAR(1)
has the edge over TinyINT- both take the same number of bytes, but CHAR provides a more narrow number of values. Using CHAR(1)
would make using "m", "f",etc natural keys, vs the use of numeric data which are referred to as surrogate/artificial keys. CHAR(1)
is also supported on any database, should there be a need to port.
CHAR(1)
与TinyINT 相比具有优势- 两者都采用相同数量的字节,但 CHAR 提供的值数量更窄。使用CHAR(1)
将使用“m”、“f”等自然键,而不是使用被称为代理/人工键的数字数据。 CHAR(1)
如果需要移植,也支持任何数据库。
Conclusion
结论
I would use Option 2: CHAR(1).
我会使用选项 2:CHAR(1)。
Addendum
附录
An index on the gender column likely would nothelp because there's no value in an index on a low cardinality column. Meaning, there's not enough variety in the values for the index to provide any value.
在性别列的索引很可能将不会帮助,因为有一个在一个低基数列的索引没有价值。这意味着,索引值的多样性不足以提供任何值。
回答by Pondlife
There is already an ISO standard for this; no need to invent your own scheme:
已经有一个 ISO 标准;无需发明自己的方案:
http://en.wikipedia.org/wiki/ISO_5218
http://en.wikipedia.org/wiki/ISO_5218
Per the standard, the column should be called "Sex" and the 'closest' data type would be tinyint with a CHECK constraint or lookup table as appropriate.
根据标准,该列应称为“Sex”,“最接近”的数据类型将是 tinyint,并带有 CHECK 约束或适当的查找表。
回答by user207421
In medicine there are four genders: male, female, indeterminate, and unknown. You mightn't need all four but you certainly need 1, 2, and 4. It's not appropriate to have a default value for this datatype. Even less to treat it as a Boolean with 'is' and 'isn't' states.
在医学上有四种性别:男性、女性、不确定和未知。您可能不需要所有四个,但您肯定需要 1、2 和 4。为这种数据类型设置默认值是不合适的。更不用说将其视为具有“是”和“不是”状态的布尔值。
回答by Hugo
An Int
(or TinyInt
) aligned to an Enum
field would be my methodology.
与某个领域对齐的Int
(或TinyInt
)Enum
将是我的方法论。
First, if you have a single bit
field in a database, the row will still use a full byte, so as far as space savings, it only pays off if you have multiple bit
fields.
首先,如果bit
数据库中只有一个字段,该行仍将使用完整字节,因此就节省空间而言,只有当您有多个bit
字段时才值得。
Second, strings/chars have a "magic value" feel to them, regardless of how obvious they may seem at design time. Not to mention, it lets people store just about any value they would not necessarily map to anything obvious.
其次,字符串/字符对它们有一种“神奇价值”的感觉,不管它们在设计时看起来有多明显。更不用说,它可以让人们存储几乎任何他们不一定映射到任何明显的价值。
Third, a numeric value is much easier (and better practice) to create a lookup table for, in order to enforce referential integrity, and can correlate 1-to-1 with an enum, so there is parity in storing the value in memory within the application or in the database.
第三,数值更容易(和更好的实践)创建查找表,以强制执行参照完整性,并且可以将枚举与 1 对 1 相关联,因此在将值存储在内存中时存在奇偶校验应用程序或数据库中。
回答by zarac
I use char 'f', 'm' and 'u' because I surmise the gender from name, voice and conversation, and sometimes don't know the gender. The final determination is their opinion.
我使用字符 'f'、'm' 和 'u' 是因为我从姓名、声音和对话中推测性别,有时不知道性别。最后的决定是他们的意见。
It really depends how well you know the person and whether your criteria is physical form or personal identity. A psychologist might need additional options - cross to female, cross to male, trans to female, trans to male, hermaphrodite and undecided. With 9 options, not clearly defined by a single character, I might go with Hugo's advice of tiny integer.
这实际上取决于您对这个人的了解程度以及您的标准是身体形态还是个人身份。心理学家可能需要额外的选择——跨女性、跨男性、跨女性、跨男性、雌雄同体和未定。有 9 个选项,没有由单个字符明确定义,我可能会同意 Hugo 的小整数建议。
回答by ajacian81
Option 3 is your best bet, but not all DB engines have a "bit" type. If you don't have a bit, then TinyINT would be your best bet.
选项 3 是您最好的选择,但并非所有数据库引擎都具有“位”类型。如果你没有一点,那么 TinyINT 将是你最好的选择。
回答by HansLindgren
I would go with Option 3 but multiple NON NULLABLE bit columns instead of one. IsMale (1=Yes / 0=No) IsFemale (1=Yes / 0=No)
我会选择选项 3,但使用多个 NON NULLABLE 位列而不是一个。IsMale (1=是/0=否) IsFemale (1=是/0=否)
if requried: IsUnknownGender (1=Yes / 0=No) and so on...
如果需要:IsUnknownGender (1=Yes / 0=No) 等等...
This makes for easy reading of the definitions, easy extensibility, easy programability, no possibility of using values outside the domain and no requirement of a second lookup table+FK or CHECK constraints to lock down the values.
这使得易于阅读定义、易于扩展、易于编程、不可能使用域外的值,并且不需要第二个查找表+FK 或 CHECK 约束来锁定值。
回答by Mohammad Asif
CREATE TABLE Admission (
Rno INT PRIMARY KEY AUTO_INCREMENT,
Name VARCHAR(25) NOT NULL,
Gender ENUM('M','F'),
Boolean_Valu boolean,
Dob Date,
Fees numeric(7,2) NOT NULL
);
insert into Admission (Name,Gender,Boolean_Valu,Dob,Fees)values('Raj','M',true,'1990-07-12',50000);
insert into Admission (Name,Gender,Boolean_Valu,Dob,Fees)values('Rani','F',false,'1994-05-10',15000);
select * from admission;