SQL: is there an advantage to varchar(500) over varchar(8000)?

Disclaimer: this page is a side-by-side Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/2009694/

Time: 2020-09-01 04:57:39  Source: igfitidea

is there an advantage to varchar(500) over varchar(8000)?

sql sql-server tsql

Asked by jcollum

I've read up on this on MSDN forums and here and I'm still not clear. I think this is correct: Varchar(max) will be stored as a text datatype, so that has drawbacks. So let's say your field will reliably be under 8000 characters, like a BusinessName field in my database table. In reality, a business name will probably always be under (pulling a number outta my hat) 500 characters. It seems like plenty of the varchar fields that I run across fall well under the 8k character count.

So should I make that field a varchar(500) instead of varchar(8000)? From what I understand of SQL there's no difference between those two. So, to make life easy, I'd want to define all my varchar fields as varchar(8000). Does that have any drawbacks?

Related: Size of varchar columns (I didn't feel like this one answered my question).

Accepted answer by BBlake

From a processing standpoint, it will not make a difference to use varchar(8000) vs varchar(500). It's more of a "good practice" kind of thing to define a maximum length that a field should hold and make your varchar that length. It's something that can be used to assist with data validation. For instance, making a state abbreviation be 2 characters or a postal/zip code as 5 or 9 characters. This used to be a more important distinction for when your data interacted with other systems or user interfaces where field length was critical (e.g. a mainframe flat file dataset), but nowadays I think it's more habit than anything else.
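
The data-validation point can be sketched in T-SQL as follows. This is only an illustration: the table dbo.AddressDemo and its column names are hypothetical and do not come from the question.

```sql
-- Hypothetical table, used only to illustrate length-based validation.
CREATE TABLE dbo.AddressDemo (
    id         INT IDENTITY(1,1) PRIMARY KEY,
    state_code VARCHAR(2) NOT NULL,  -- two-letter state abbreviation
    zip_code   VARCHAR(9) NOT NULL   -- 5 or 9 digits, stored without a hyphen
);

-- Fits the declared lengths, so the insert succeeds.
INSERT INTO dbo.AddressDemo (state_code, zip_code)
VALUES ('CA', '94105');

-- 'CAL' is longer than VARCHAR(2). With the default ANSI_WARNINGS ON setting,
-- SQL Server raises a "String or binary data would be truncated" error
-- and the row is not inserted.
INSERT INTO dbo.AddressDemo (state_code, zip_code)
VALUES ('CAL', '94105');
```

Had both columns been declared as varchar(8000), the second insert would have succeeded and the bad value would have gone unnoticed.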

Answer by Martin Smith

One example where this can make a difference is that it can prevent a performance optimization that avoids adding row versioning information to tables with after triggers.

This is covered by SQL Kiwi here

The actual size of the data stored is immaterial – it is the potential size that matters.

Similarly, for memory-optimised tables, since SQL Server 2016 it has been possible to use LOB columns or combinations of column widths that could potentially exceed the in-row limit, but with a penalty.

(Max) columns are always stored off-row. For other columns, if the data row size in the table definition can exceed 8,060 bytes, SQL Server pushes largest variable-length column(s) off-row. Again, it does not depend on amount of the data you store there.

This can have a large negative effect on memory consumption and performance

Another case where over-declaring column widths can make a big difference is if the table will ever be processed using SSIS. The memory allocated for variable-length (non-BLOB) columns is fixed for each row in an execution tree and is based on the columns' declared maximum length, which can lead to inefficient usage of memory buffers. Whilst the SSIS package developer can declare a smaller column size than the source, this analysis is best done up front and enforced there.

Back in the SQL Server engine itself, a similar case is that when calculating the memory grant to allocate for SORT operations, SQL Server assumes that varchar(x) columns will on average consume x/2 bytes.

If most of your varchar columns are fuller than that, this can lead to the sort operations spilling to tempdb.

In your case, if your varchar columns are declared as 8000 bytes but actually have contents much shorter than that, your query will be allocated memory that it doesn't require, which is obviously inefficient and can lead to waits for memory grants.
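
One way to observe this, as a rough sketch, is to query the sys.dm_exec_query_memory_grants dynamic management view from a second session while the queries of interest are running. This assumes the login has VIEW SERVER STATE permission; the view only lists requests that currently hold or are waiting for a grant.

```sql
-- Run from a separate session while the workload is executing.
SELECT session_id,
       requested_memory_kb,  -- what the query asked for, based on its estimates
       granted_memory_kb,    -- what it was actually given
       max_used_memory_kb,   -- the most it has used so far
       grant_time,           -- NULL while the request is still waiting
       wait_time_ms
FROM sys.dm_exec_query_memory_grants
ORDER BY requested_memory_kb DESC;
```

A large gap between granted_memory_kb and max_used_memory_kb is consistent with the over-allocation described above, although other estimation errors can cause it too.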

This is covered in Part 2 of SQL Workshops Webcast 1, downloadable from here, or see below.

USE tempdb;

-- Two columns with identical contents but different declared widths
CREATE TABLE T (
    id       INT IDENTITY(1,1) PRIMARY KEY,
    number   INT,
    name8000 VARCHAR(8000),
    name500  VARCHAR(500)
);

INSERT INTO T (number, name8000, name500)
SELECT number, name, name /* <-- same contents in both columns */
FROM master..spt_values;

-- Sort that carries the narrow column
SELECT id, name500
FROM T
ORDER BY number;

Screenshot

-- Same sort, but carrying the wide column
SELECT id, name8000
FROM T
ORDER BY number;

Screenshot
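
As a back-of-the-envelope check, the x/2 assumption can be compared with the data actually stored. The query below is only a sketch against the demo table T: it ignores per-row overhead and the other columns, so the figures will not match the real memory grants, only their relative scale.

```sql
-- Compare the size the x/2 rule assumes with what is really stored in T.
SELECT COUNT(*)                             AS row_count,
       COUNT(*) * (500  / 2) / 1024         AS assumed_kb_name500,   -- 250 bytes per row
       COUNT(*) * (8000 / 2) / 1024         AS assumed_kb_name8000,  -- 4000 bytes per row
       SUM(DATALENGTH(name500)) / 1024      AS actual_kb             -- same data in both columns
FROM T;
```

Because both columns hold the same values, the only thing that changes between the two sorts is the declared width, and the assumed size for name8000 is 16 times that for name500.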

Answer by gbn

Apart from best practices (BBlake's answer):

  • You get warnings about maximum row size (8060 bytes) and index width (900 bytes) with DDL
  • DML will die if you exceed these limits
  • ANSI PADDING ON is the default, so you could end up storing a whole load of whitespace
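
The first two points can be reproduced with a small sketch like the one below. The table and index names are hypothetical. The exact limit quoted in the messages depends on the version and index type: 900 bytes as stated above, or 1700 bytes for nonclustered indexes from SQL Server 2016 onwards.

```sql
-- Hypothetical table, used only to illustrate the index key limit.
CREATE TABLE dbo.WideDemo (
    id       INT IDENTITY(1,1) PRIMARY KEY,
    name8000 VARCHAR(8000) NOT NULL
);

-- DDL: the index is created, but SQL Server prints a warning that the
-- maximum key length can be exceeded by some values.
CREATE INDEX IX_WideDemo_name8000 ON dbo.WideDemo (name8000);

-- DML: a short value is fine.
INSERT INTO dbo.WideDemo (name8000) VALUES ('Acme Ltd');

-- DML: an 8000-byte value exceeds the index key limit, so this insert fails.
INSERT INTO dbo.WideDemo (name8000) VALUES (REPLICATE('x', 8000));
```

With a column declared as varchar(500), neither the warning nor the run-time failure could occur.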

Answer by Oliver

There are some disadvantages to large columns that are a bit less obvious and might catch you a little later:

  • All columns you use in an INDEX must not exceed 900 bytes
  • All the columns in an ORDER BY clause may not exceed 8060 bytes. This is a bit difficult to grasp since it only applies to some columns (see "SQL 2008 R2 Row size limit exceeded" for details)
  • If the total row size exceeds 8060 bytes, you get a "page spill" for that row. This might affect performance. (A page is the basic unit of storage in SQL Server and is fixed at 8 KB, of which a single row can use at most 8060 bytes. Exceeding this will not be severe, but it's noticeable and you should try to avoid it if you easily can.)
  • Many other internal data structures, buffers and, last but not least, your own variables and table variables all need to mirror these sizes. With excessive sizes, excessive memory allocation can affect performance

As a general rule, try to be conservative with the column width. If it becomes a problem, you can easily expand it to fit the needs. If you notice memory issues later, shrinking a wide column may become impossible without losing data, and you won't know where to begin.
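
A minimal sketch of both directions, assuming a hypothetical table dbo.BusinessDemo that is not part of the question:

```sql
-- Hypothetical table, starting with a conservative width.
CREATE TABLE dbo.BusinessDemo (
    id           INT IDENTITY(1,1) PRIMARY KEY,
    BusinessName VARCHAR(100) NOT NULL
);

-- Expanding later is a single statement. NOT NULL is restated so the
-- column's nullability stays explicit.
ALTER TABLE dbo.BusinessDemo
    ALTER COLUMN BusinessName VARCHAR(200) NOT NULL;

-- Before shrinking a column, check the longest value actually stored;
-- any row longer than the new width would block the change.
SELECT MAX(DATALENGTH(BusinessName)) AS longest_value_bytes
FROM dbo.BusinessDemo;
```

The check in the last statement is the part that gets harder with time: the more data and dependent code accumulate, the less likely it is that a smaller width still fits.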

In your example of the business names, think about where you get to display them. Is there really space for 500 characters? If not, there is little point in storing them as such. http://en.wikipedia.org/wiki/List_of_companies_of_the_United_States lists some company names and the max is about 50 characters. So I'd use 100 for the column max. Maybe more like 80.

Answer by Otis

Ideally you'd want to go smaller than that, down to a reasonably sized length (500 isn't reasonably sized), and make sure the client validation catches data that is going to be too large and sends a useful error.

While the varchar isn't actually going to reserve space in the database for the unused space, I recall versions of SQL Server having a snit about database rows being wider than some number of bytes (I do not recall the exact count) and actually throwing out whatever data didn't fit. A certain number of those bytes were reserved for things internal to SQL Server.
