用于 sql 表中“状态”列的类型
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/7205391/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
Type to use for "Status" columns in a sql table
提问by XwipeoutX
I have a (dummy) table structure as follows:
我有一个(虚拟)表结构如下:
ticket id: int(11) PK name: varchar(255) status: ?????????
The question is, what data type should I use for status? Here are my options, as I see them:
问题是,我应该为状态使用什么数据类型?这是我的选择,正如我所见:
- varchar representing the status - BAD because there's no integrity
- enum representing the status - BAD because to change the value, I'd have to alter the table, and then any code with dropdowns for the values, etc etc etc
- int FK to a status table - GOOD because it's dynamic, BAD because it's harder to inspect by sight (which may be useful)
- varchar FK to a status table - GOOD because it's dynamic, and visible on inspection. BAD because the keys are meaningful, which is generally frowned upon. Interestingly, in this case it's entirely possible for the status table to have just 1 column, making it a glorified enum
- varchar 表示状态 - 不好,因为没有完整性
- 代表状态的枚举 - 不好,因为要更改值,我必须更改表,然后任何带有值下拉列表的代码等等
- int FK 到状态表 - 好的因为它是动态的,坏的因为它更难通过视觉检查(这可能有用)
- varchar FK 到状态表 - 很好,因为它是动态的,并且在检查时可见。不好,因为键是有意义的,这通常是不受欢迎的。有趣的是,在这种情况下,状态表完全有可能只有 1 列,使其成为美化的枚举
Have I got an accurate read of the situation? Is having a meaningful key really that bad? Because while it does give me goosebumps, I don't have any reason for it doing so...
我对情况有准确的了解吗?拥有一个有意义的钥匙真的那么糟糕吗?因为虽然它确实让我起鸡皮疙瘩,但我没有任何理由这样做......
Update:For option 4, the proposed structure would be status: char(4) FK, to a status table. So,
更新:对于选项 4,建议的结构是status: char(4) FK,到状态表。所以,
OPEN => "Open"
打开 => "打开"
CLOS => "Closed"
CLOS =>“关闭”
"PEND" => "Pending Authorization"
"PEND" => "待授权"
"PROG" => "In Progress
"PROG" => "进行中
What's the disadvantage in this case ? The only benefit I can see of using int over char in this case is slight performance.
在这种情况下有什么缺点?在这种情况下,我可以看到在 char 上使用 int 的唯一好处是性能较低。
采纳答案by Ian Jacobs
Go with number 3. Create a view that join's in the status value if you want something inspectable.
使用数字 3。如果您想要一些可检查的东西,请创建一个加入状态值的视图。
回答by Philip Kelley
I would go with number 4, but I'd use a char(x)
column. If you're worried about performance, a char(4) takes up as much space (and, or so one would think, disk i/o, bandwidth, and processing time) as an int, which also takes 4 bytes to store. If you're really worried about performance, make it a char(2) or even char(1).
我会选择 4,但我会使用一char(x)
列。如果您担心性能,char(4) 会占用与 int 一样多的空间(以及,或者说,磁盘 i/o、带宽和处理时间),而 int 也需要 4 个字节来存储。如果您真的很担心性能,请将其设为 char(2) 甚至 char(1)。
Don't think of it as "meaningful data", think of it as an abbreviation of the natural key. Yes, the data has meaning, but as you've noticed that can be a good thing when working with the data--it means you don't always have to join (even if to a trivially small table) to extract meaning from the database. And of course the foreign key constraint ensures that the data is valid, since it must be in the lookup table. (This can be done with CHECK constraints as well, but Lookup tables are generally easier to manage and maintain over time.)
不要将其视为“有意义的数据”,而应将其视为自然键的缩写。是的,数据是有意义的,但正如您已经注意到的那样,在处理数据时这可能是一件好事——这意味着您不必总是加入(即使是一个微不足道的小表)来从数据中提取意义数据库。当然,外键约束确保数据有效,因为它必须在查找表中。(这也可以通过 CHECK 约束来完成,但随着时间的推移,查找表通常更容易管理和维护。)
The downside is that you can get caught up with trying to find meaning. char(1) has a strong appeal, but if you get to ten or more values, it can get hard to come up with goodmeaningful values. Less of a problem with char(4), but still a possible issue. Another downside: if the data is likely to change, then yes, your meaningful data ("PEND" = "Pending Authorization") can lose its meaning ("PEND" = "Forward to home office for initial approval"). That's a poor example; if codes like that do change, you're probably much better off refactoring your system to reflect the change in business rules. I guess my point should be, if it's a user-entered lookup value, surrogate keys (integers) will be your friend, but if they're internally defined and maintained you should definitely consider more human-friendly values. That, or you'll need post-em notes on your monitor to remind you what the heck Status = 31 is supposed to mean. (I've got three on mine, and the stickum wears out every few months. Talk about cost to maintain...)
不利的一面是,您可能会忙于寻找意义。char(1) 有很强的吸引力,但如果你得到十个或更多的值,就很难想出好的有意义的价值观。char(4) 的问题不大,但仍然是一个可能的问题。另一个缺点:如果数据可能会发生变化,那么是的,您的有意义的数据(“PEND”=“Pending Authorization”)可能会失去意义(“PEND”=“转发到家庭办公室进行初始批准”)。这是一个糟糕的例子;如果这样的代码确实发生了变化,那么重构系统以反映业务规则的变化可能会好得多。我想我的观点应该是,如果它是用户输入的查找值,代理键(整数)将是您的朋友,但如果它们是内部定义和维护的,您绝对应该考虑更人性化的值。否则,您将需要在显示器上添加即时笔记来提醒您 Status = 31 到底是什么意思。(我有三个,而且粘性每隔几个月就会磨损一次。谈谈维护成本...)
回答by James Johnson
I would use an INT, and create a foreign key relationship to the status table. An INT should definitely be safe for an enumerated status column.
我会使用 INT,并创建到状态表的外键关系。对于枚举状态列,INT 绝对应该是安全的。
回答by Jon Martin
May I recommend you go with a statusID field instead, and have a separate table mapping the ID to a varchar?
我可以建议您改用 statusID 字段,并使用一个单独的表将 ID 映射到 varchar 吗?
EDIT: I guess that's exactly what you outlined in point 3. I think that is the best option.
编辑:我想这正是您在第 3 点中概述的内容。我认为这是最好的选择。
回答by Neville Kuyt
I'm assuming that your database has a front end of some description, and that regular users are not exposed to the status code.
我假设您的数据库有一些描述的前端,并且普通用户不会看到状态代码。
So, your convenience is only for programmers and DBAs - important people, but I wouldn't optimize my design for them.
因此,您的方便仅适用于程序员和 DBA——重要的人,但我不会为他们优化我的设计。
Stronger - I would be very careful of using "meaningful" abbreviations - the most egregious data foul-up I've ever seen happened when a developer was cleansing some data, and interpreted the "meaningful" key incorrectly; turns out that "PROG" does not mean "programmed", but "in progress".
更强——我会非常小心地使用“有意义”的缩写——我见过的最严重的数据混乱发生在开发人员清理一些数据时,并且错误地解释了“有意义”的键;事实证明,“PROG”并不意味着“已编程”,而是“进行中”。
Go with option 3.
选择选项 3。
回答by kta
Creating a separate table with status is a good idea when you want to show the list of the status in the HTML form. You can show the verbose description from the lookup table and it will help the user to choose status if the requirements are like that.
当您想在 HTML 表单中显示状态列表时,创建一个带有状态的单独表格是一个好主意。您可以从查找表中显示详细描述,如果要求是这样,它将帮助用户选择状态。
From the development perspective, I would like to go integer as a primary key. You can optimize it by using small/tiny integer if you know it will not exceed the limit.
从开发的角度来看,我想将整数作为主键。如果您知道它不会超过限制,您可以使用小/小整数对其进行优化。
If you use abbreviation as a foreign key then you have to think every time to make it unique all the time as @Philip Kelley had mentioned it as a downside of it.
如果您使用缩写作为外键,那么您必须每次都考虑使它始终独一无二,因为@Philip Kelley 曾提到它是它的一个缺点。
Lastly, you can declare the table type MYISAM if you like.
最后,您可以根据需要声明表类型 MYISAM。
Update: Reflecting @Philip Kelley opinion, if there are too many status, then it's better to use integer as foreign key. If there are only couple of status, then may be use abbr as a foreign key.
更新:反映@Philip Kelley 的观点,如果状态太多,那么最好使用整数作为外键。如果只有几个状态,那么可以使用 abbr 作为外键。
回答by DevBodin
I've been working with a lot of databases recently that require a lotof statuses AND I've got a few notes that might be worth adding to the conversation.
我最近一直在处理很多需要很多状态的数据库,我有一些笔记可能值得添加到对话中。
INT: One thing I found is that if an application has a lot of tracking going on, the number of reference tables can quickly get unwieldy and, as you've mentioned, make inspecting the database at a glance impractical. (Which, for some of my clients, has mattered much more than the scant milliseconds it's saved in processing time.)
INT:我发现的一件事是,如果一个应用程序有很多跟踪正在进行,那么引用表的数量很快就会变得难以处理,正如您所提到的,一目了然地检查数据库是不切实际的。(对于我的一些客户来说,这比在处理时间中节省的几毫秒更重要。)
VARCHAR: Terrible idea for programming, but it's important to consider if a given status is actually going to be used by the code, or just human eyes. For the latter, you get unlimited range and don't have to maintain any relationships.
VARCHAR:编程的可怕想法,但重要的是要考虑给定的状态是否真的会被代码使用,或者只是人眼。对于后者,您可以获得无限的范围并且不必维持任何关系。
CHAR(4): Using a descriptive char column can actually be a very good approach. I'd typically only consider it if the value range were going to be low and obvious, but only because I consider this a nonstandard approach (risking confusion to new devs). Realistically, you could use a CHAR value as a foreign key just the same as an INT, gain legibility and maintain performance parity.
CHAR(4):使用描述性字符列实际上是一个很好的方法。我通常只在值范围较低且明显时才考虑它,但这只是因为我认为这是一种非标准方法(冒着让新开发人员感到困惑的风险)。实际上,您可以将 CHAR 值用作与 INT 相同的外键,以获得易读性并保持性能均衡。
The one thing you couldn't do that I'd miss is mathematical operations (like "<" and ">").
你不能做但我会错过的一件事是数学运算(如“<”和“>”)。
INT Range: A hybrid strategy I've tried out is to use INT, but adding a degree of semantics to the numbers. So, for instance,
INT Range:我尝试过的一种混合策略是使用 INT,但为数字添加了一定程度的语义。所以,例如,
1-10 being for initial stages,
11-20 being in progress, and
21-30 being the final stages.
60-69 for errors, rejections
The problem here is that if you discover you need more numbers, you're SOL, since the next range is already taken. So, what I ended up doing was (sort of) mimicking HTTP responses:
这里的问题是,如果你发现你需要更多的数字,你就是 SOL,因为下一个范围已经被占用了。所以,我最终做的是(有点)模仿 HTTP 响应:
100-199 being for initial stages,
200-299 being in progress, and
300-399 being the final stages.
500-599 for errors, rejections
I prefer this to simple INT, and while it canbe less descriptive than CHAR, it can also be less ambiguous. Whereas "PROG" could mean a number of things, good, bad or benign, if I can see something is in the 500 range, I may not known what the problem is, I will be able to tell you there isa problem.
我喜欢这种简单的INT,虽然它可以比CHAR是描述性较差,它也可以更加明确。而“PROG”可能意味着一些东西,好的,坏的还是良性的,如果我能看到的东西是在500的范围内,我可能不知道是什么问题,我就可以告诉你,是一个问题。