vb.net: Using INT or GUID as primary key
Original source: http://stackoverflow.com/questions/21896782/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Using INT or GUID as primary key
Asked by A-Tech
I was trying to create an ID column in SQL Server (with VB.NET) that would provide a sequence of numbers for every new row created in the database. I used the following technique to create the ID column.
SELECT * FROM T_Users;
ALTER TABLE T_Users
ADD User_ID INT NOT NULL IDENTITY(1,1) PRIMARY KEY;
Then I registered a few usernames in the database and it worked just fine. For example, the first six rows were 1, 2, 3, 4, 5, 6. Then I registered 4 more users the next day, but this time the ID numbers jumped from 6 to a very large number: 1, 2, 3, 4, 5, 6, 1002, 1003, 1004, 1005. Two days later, I registered two more users and the new rows read 3002, 3004. So my question is: why is it skipping such a large range of numbers every other day I register users? Is the technique I used to create the sequence wrong? If it is wrong, can anyone please tell me how to do it right?
Since I was getting frustrated with the technique above, I tried using sequentially generated GUID values instead. The sequence of GUID values was generated fine. The only downside is that it generates very long values (4 times the INT size). My question here is does using GUID have any significant advantage over INT?
Regards,
Answered by DeanOC
Upside of GUIDs:
GUIDs are good if you ever want offline clients to be able to create new records, as you will never get a primary key clash when the new records are synchronised back to the main database.
Downside of GUIDs:
GUIDs as primary keys can hurt the performance of the DB, because for a clustered primary key the DB will want to keep the rows in order of the key values. With GUIDs this means a lot of inserts between existing records, because the GUIDs will be random.
Using an IDENTITY column doesn't suffer from this because the next record is guaranteed to have the highest value, so the row is just tacked on the end every time. No re-shuffle needs to happen.
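To make the difference between the two kinds of key concrete, here is a rough Python sketch. It is only a toy model: a sorted Python list stands in for the clustered index (a real index works with pages and page splits, not list shifting), and `count_mid_inserts` is a made-up helper name, not anything from SQL Server.

```python
import bisect
import uuid


def count_mid_inserts(keys):
    """Count keys that landed before an existing key instead of at the end."""
    ordered = []  # stand-in for rows kept in key order (assumption: toy model)
    mid_inserts = 0
    for key in keys:
        position = bisect.bisect_right(ordered, key)
        if position < len(ordered):
            mid_inserts += 1  # an existing row sorts after this key
        ordered.insert(position, key)
    return mid_inserts


identity_keys = list(range(1, 1001))                        # 1, 2, 3, ... like IDENTITY(1,1)
random_guid_keys = [uuid.uuid4().int for _ in range(1000)]  # random 128-bit values

print(count_mid_inserts(identity_keys))     # 0: every row is appended at the end
print(count_mid_inserts(random_guid_keys))  # close to 1000 in a typical run
```

With increasing keys nothing ever has to be placed between existing rows, while with random keys almost every insert does.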
There is a compromise, which is to generate a pseudo-GUID. It means you would expect a key clash every 70 years or so, but it helps the indexing immensely.
The other downsides are that a) they take up more storage space, and b) they are a real pain to write SQL against, i.e. it is much easier to type UPDATE TABLE SET FIELD = 'value' where KEY = 50003 than UPDATE TABLE SET FIELD = 'value' where KEY = '{F820094C-A2A2-49cb-BDA7-549543BB4B2C}'
Your declaration of the IDENTITY column looks fine to me. The gaps in your key values are probably due to failed attempts to add a row: the IDENTITY value is incremented but the row never gets committed. Don't let it bother you; it happens in practically every table.
EDIT:
This question covers what I meant by pseudo-GUID: "INSERTs with sequential GUID key on clustered index not significantly faster".
In SQL Server 2005+ you can use NEWSEQUENTIALID() to get a GUID that is supposed to be greater than any GUID previously generated by that function on the same computer. See here for more info: http://technet.microsoft.com/en-us/library/ms189786%28v=sql.90%29.aspx
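As a rough illustration of the pseudo-GUID idea, the Python sketch below builds a 16-byte value from a timestamp prefix plus random bytes, so that values generated later compare greater. This is not how NEWSEQUENTIALID() works internally; `make_time_ordered_guid` and its layout are assumptions made up for this example, and SQL Server compares uniqueidentifier bytes in a different order than Python does.

```python
import os
import time
import uuid


def make_time_ordered_guid(timestamp_ms=None):
    """Hypothetical helper: 48-bit millisecond timestamp followed by 80 random bits."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000  # current time in milliseconds
    prefix = timestamp_ms.to_bytes(6, "big")        # most significant bytes first
    return uuid.UUID(bytes=prefix + os.urandom(10))


g1 = make_time_ordered_guid(1_000)
g2 = make_time_ordered_guid(2_000)
print(g1 < g2)  # True: the later timestamp sorts after the earlier one
```

Two values created within the same millisecond are not ordered relative to each other in this sketch; only the timestamp prefix gives the ordering.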
Answered by TomTom
Is the technique I used to create the sequence wrong?
No. If anything, your Google skills are non-existent. A quick search for "Sql server identity skipping values" will give you a TON of results, including:
SQL Server 2012 column identity increment jumping from 6 to 1000+ on 7th entry
and the canonical:
Why are there gaps in my IDENTITY column values?
You basically assume, wrongly, that SQL Server will not optimize its access for performance. Identity numbers are markers, nothing else. Please do not assume there will be no gaps.
In particular: SQL Server preallocates identity numbers in blocks of 1000 and, if you restart the server (like on your workstation), the remainder is lost.
If you use a manual sequence instead (new in SQL Server 2012), you can define the cache size for this pregeneration and set it to 1, at the cost of slightly lower performance when you do a lot of inserts.
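The effect of the cache size can be sketched with a toy counter in Python. This is only a model of the idea (reserve a block, lose the unused part on restart), not SQL Server's actual implementation, so the exact numbers differ slightly from the ones in the question. `CachedCounter` and `simulate` are hypothetical names.

```python
class CachedCounter:
    """Toy model of a counter that reserves values in blocks (assumption, not SQL Server code)."""

    def __init__(self, cache_size):
        self.cache_size = cache_size
        self.persisted_high = 0  # survives a restart (think: written to disk)
        self.next_value = 1      # lives only in memory

    def next(self):
        if self.next_value > self.persisted_high:
            # Block exhausted: reserve another block and persist its upper bound.
            self.persisted_high += self.cache_size
        value = self.next_value
        self.next_value += 1
        return value

    def restart(self):
        # Unused values of the reserved block are lost; resume after the persisted bound.
        self.next_value = self.persisted_high + 1


def simulate(cache_size):
    """Issue 6 values, restart, then issue 4 more."""
    counter = CachedCounter(cache_size)
    ids = [counter.next() for _ in range(6)]
    counter.restart()
    ids += [counter.next() for _ in range(4)]
    return ids


print(simulate(1000))  # [1, 2, 3, 4, 5, 6, 1001, 1002, 1003, 1004]
print(simulate(1))     # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

With a block of 1000 a restart throws away the rest of the block. With a cache of 1 nothing is lost on restart, but every single value has to be persisted, which is where the lower insert performance comes from.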
My question here is does using GUID have any significant advantage over INT?
Yes. You can have a lot more rows with GUIDs than with int. For example, int32 is limited to about 2 billion rows. For some of us that is too low (I have tables in the 10 billion range), and even a 64-bit integer is limited. And for a truly zettabyte-scale database, you have to use a self-generated GUID in sequence.
Any normal human will not see a difference, as most of us do not really deal with that many rows. And the larger size makes a lot of things slower (larger key size = more space in indices = larger indices = more memory / IO for the same operation). Plus, even your sequential ID will jump.
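The numbers behind this are easy to check. The Python sketch below works out the upper limits of 4-byte and 8-byte signed integers and the raw space needed for the key values alone. The 10 million row count is just an example figure, and `key_storage_mb` is a made-up helper that ignores row and index overhead.

```python
import uuid

INT_MAX = 2**31 - 1     # largest value of a 4-byte signed integer
BIGINT_MAX = 2**63 - 1  # largest value of an 8-byte signed integer

# Size of one key value in bytes; a GUID is 16 bytes, i.e. 4 times an INT.
KEY_BYTES = {"INT": 4, "BIGINT": 8, "GUID": len(uuid.uuid4().bytes)}


def key_storage_mb(key_type, row_count):
    """Raw bytes needed for the key values only, in megabytes (10**6 bytes)."""
    return KEY_BYTES[key_type] * row_count / 10**6


print(f"{INT_MAX:,}")     # 2,147,483,647
print(f"{BIGINT_MAX:,}")  # 9,223,372,036,854,775,807
print(key_storage_mb("INT", 10_000_000), key_storage_mb("GUID", 10_000_000))  # 40.0 160.0
```

The key is also copied into every index that references it, so the real overhead of the wider key is larger than this raw figure.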
Why not just adjust your expectations to reality (identity is not meant to be without gaps), or use a sequence with a cache of 1?

