SQL SELECT speed: int vs varchar

Question

Asked by googletorp

I'm in the process of creating a table and it made me wonder.

If I store, say, cars that have a make (e.g. BMW, Audi, etc.), will it make any difference to query speed whether I store the make as an int or a varchar?

So is

SELECT * FROM table WHERE make = 5 AND ...;

faster or slower than

SELECT * FROM table WHERE make = 'audi' AND ...;

or will the speed be more or less the same?

Answer 1

Accepted answer by Robert Munteanu

Int comparisons are faster than varchar comparisons, for the simple fact that ints take up much less space than varchars.

This holds true both for unindexed and indexed access. The fastest way to go is an indexed int column.

Since you've tagged the question postgresql, you might be interested in the space usage of different data types:

int fields occupy between 2 and 8 bytes, with 4 usually being more than enough (-2147483648 to +2147483647)
character types occupy 4 bytes plus the actual string
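
As a quick sanity check on those figures, the following Python sketch computes the signed ranges for 2-, 4- and 8-byte integers and applies the "4 bytes plus the actual string" rule quoted above to a sample make name. The helper names and the sample value are illustrative assumptions, and real on-disk sizes depend on the database version, encoding and alignment.

```python
def signed_range(num_bytes):
    """Return (min, max) for a two's-complement signed integer of this width."""
    bits = num_bytes * 8
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def varchar_cost(value, header_bytes=4):
    """Approximate per-value size: header plus encoded string length.

    header_bytes=4 follows the figure quoted in the answer; treat it as an
    assumption, since the real overhead varies with database and version.
    """
    return header_bytes + len(value.encode("utf-8"))


for width in (2, 4, 8):
    low, high = signed_range(width)
    print(f"{width}-byte int: {low} to {high}")

print("int make id:", 4, "bytes")
print("varchar 'Mercedes-Benz':", varchar_cost("Mercedes-Benz"), "bytes")
```

Under these assumptions a 13-character make name costs roughly four times as much as a 4-byte int per row, before any index is built on it.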

Answer 2

Answer by Grzegorz Luczywo

Some rough benchmarks:

4 million records in Postgres 9.x

Table A = base table with some columns
Table B = Table A + extra column id of type bigint with random numbers
Table C = Table A + extra column id of type text with random 16-char ASCII strings

Results on 8GB RAM, i7, SSD laptop:

Size on disk:                  A=261MB         B=292MB         C=322MB
Non-indexed by id (select count(*), select by id): 450ms, the same on all tables
Insert* one row per TX:        B=9ms/record    C=9ms/record
Bulk insert* in single TX:     B=140us/record  C=180us/record
Indexed by id, select by id:   B=about 200us   C=about 200us

* inserts to the table already containing 4M records

So it looks like, for this setup, as long as your indexes fit in RAM, bigint vs 16-char text makes no difference in speed.
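
A comparison in the same spirit can be rerun at small scale. The sketch below is not the author's benchmark: it uses SQLite through Python's sqlite3 module as a stand-in for Postgres, only 20,000 rows, and hypothetical table names `b` and `c` that mirror the tables described above. Timings from it say nothing definitive about a production server; the point is the shape of the experiment.

```python
import random
import sqlite3
import string
import time

ROWS = 20_000          # far fewer than the 4 million rows in the answer
rng = random.Random(42)

int_ids = rng.sample(range(10**12), ROWS)   # unique bigint-like ids
text_ids = ["".join(rng.choices(string.ascii_letters, k=16)) for _ in range(ROWS)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE b (id INTEGER, payload TEXT)")   # integer key
conn.execute("CREATE TABLE c (id TEXT, payload TEXT)")      # 16-char text key
conn.executemany("INSERT INTO b VALUES (?, 'x')", [(i,) for i in int_ids])
conn.executemany("INSERT INTO c VALUES (?, 'x')", [(t,) for t in text_ids])
conn.execute("CREATE INDEX idx_b_id ON b (id)")
conn.execute("CREATE INDEX idx_c_id ON c (id)")


def time_lookups(table, keys):
    """Return (hits, average seconds per indexed lookup) over the given keys."""
    sql = f"SELECT count(*) FROM {table} WHERE id = ?"
    start = time.perf_counter()
    hits = sum(conn.execute(sql, (k,)).fetchone()[0] for k in keys)
    return hits, (time.perf_counter() - start) / len(keys)


hits_b, per_lookup_b = time_lookups("b", int_ids[:2000])
hits_c, per_lookup_c = time_lookups("c", text_ids[:2000])
print(f"int  key: {per_lookup_b * 1e6:.1f} us/lookup")
print(f"text key: {per_lookup_c * 1e6:.1f} us/lookup")
```

To approach the original conditions, raise `ROWS` and use an on-disk database, keeping in mind that the answer's conclusion is explicitly tied to indexes fitting in RAM.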

Answer 3

Answer by Guffa

It will be a bit faster using an int instead of a varchar. More important for speed is to have an index on the field that the query can use to find the records.
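
To see what an index changes, one can ask the database for its query plan before and after creating one. The sketch below uses SQLite via Python's sqlite3 module as a stand-in (PostgreSQL has its own EXPLAIN command with different output); the table layout, the index name `idx_cars_make` and the `plan` helper are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (id INTEGER PRIMARY KEY, make INTEGER, model TEXT)")
conn.executemany(
    "INSERT INTO cars (make, model) VALUES (?, ?)",
    [(i % 50, f"model-{i}") for i in range(10_000)],
)


def plan(connection, sql, params=()):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]


query = "SELECT * FROM cars WHERE make = ?"
plan_before = plan(conn, query, (5,))   # no index yet: full table scan

conn.execute("CREATE INDEX idx_cars_make ON cars (make)")
plan_after = plan(conn, query, (5,))    # the planner can now use the index

print(plan_before)
print(plan_after)
```

The first plan reports a scan of the whole table, the second a search using the index, which is the difference the answer calls more important than the column type.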

There is another reason to use an int, and that is to normalise the database. Instead of having the text 'Mercedes-Benz' stored thousands of times in the table, you should store its id and have the brand name stored once in a separate table.
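
A minimal sketch of that normalised layout, again using SQLite through Python's sqlite3 module as a stand-in. The table names `makes` and `cars`, the column names and the sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE makes (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL UNIQUE          -- each brand name is stored once
    );
    CREATE TABLE cars (
        id      INTEGER PRIMARY KEY,
        make_id INTEGER NOT NULL REFERENCES makes (id),
        model   TEXT NOT NULL
    );
    CREATE INDEX idx_cars_make_id ON cars (make_id);
""")

conn.executemany("INSERT INTO makes (id, name) VALUES (?, ?)",
                 [(1, "Audi"), (2, "BMW"), (3, "Mercedes-Benz")])
conn.executemany("INSERT INTO cars (make_id, model) VALUES (?, ?)",
                 [(1, "A4"), (2, "320i"), (3, "C 200"), (3, "E 220")])

# Filter on the integer id; join only to recover the readable name.
rows = conn.execute("""
    SELECT cars.model, makes.name
    FROM cars
    JOIN makes ON makes.id = cars.make_id
    WHERE cars.make_id = ?
    ORDER BY cars.model
""", (3,)).fetchall()
print(rows)
```

The cost of this design is the join needed whenever the readable name is wanted, which is the trade-off other answers in this thread weigh against plain varchar columns.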

Answer 4

Answer by Ozz Nixon

Looking at the actual performance of string comparison versus non-float numeric comparison, signedness and integer width do not matter here. Size is the real difference in performance: a string is 1 byte plus up to 126 bytes of data, versus a 1-, 2-, 4- or 8-byte numeric comparison. Non-float numbers are obviously smaller than strings and floats, and thus more CPU friendly at the assembly level.

String-to-string comparison in all languages is slower than something the CPU can compare in one instruction. Even comparing 8 bytes (64 bits) on a 32-bit CPU is still faster than comparing a VARCHAR(2) or larger. Again, look at the generated assembly (or write it by hand): it takes more instructions to compare char by char than to compare a 1- to 8-byte CPU numeric.

Now, how much faster? That also depends on the volume of data. If you are simply comparing 5 to 'audi', and that is all your DB has, the difference is so small you would never see it. Depending on the CPU and the implementation (client/server, web/script, etc.), you probably will not see it until you hit a few hundred comparisons on the DB server (maybe even a couple of thousand before it is noticeable).

To head off the incorrect argument about hash comparisons: most hashing algorithms are themselves slow, so you do not benefit from things like CRC64 and smaller. I developed search algorithms for multi-county search engines for over 12 years, and for the credit bureaus for 7 years. Anything you can keep numeric is faster: for example phone numbers, zip codes, even currency stored as currency * 1000 and retrieved with div 1000 is faster than DECIMAL for comparisons.
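
The scaled-integer idea for currency can be sketched as follows. The factor of 1000 comes from the answer; the function names `to_stored` and `from_stored`, and the choice to reject amounts with more than three decimal places, are assumptions made for illustration.

```python
from decimal import Decimal

SCALE = 1000  # the factor named in the answer: three decimal places


def to_stored(amount):
    """Convert a decimal amount given as a string to a scaled integer."""
    scaled = Decimal(amount) * SCALE
    if scaled != scaled.to_integral_value():
        raise ValueError(f"{amount} has more than 3 decimal places")
    return int(scaled)


def from_stored(stored):
    """Convert a scaled integer back to a Decimal amount."""
    return Decimal(stored) / SCALE


price_a = to_stored("19.99")    # 19990
price_b = to_stored("20.005")   # 20005
print(price_a < price_b)        # plain integer comparison
print(from_stored(price_a))
```

Parsing through `Decimal` rather than `float` keeps the conversion exact, so the stored integers compare the same way the original amounts would.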

Ozz

Answer 5

Answer by Konrad Garus

Index or not, int is a lot faster (the longer the varchar, the slower it gets).

Another reason: an index on a varchar field will be much larger than one on an int. For larger tables that may mean hundreds of megabytes (and thousands of pages). That makes performance much worse, as reading the index alone requires many disk reads.
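
The size difference can be measured directly on a small table. This sketch uses SQLite through Python's sqlite3 module as a stand-in and counts how many database pages each index adds; the table, the 30-character sample strings and the index names are hypothetical, and the exact ratio will differ in Postgres.

```python
import os
import sqlite3
import tempfile

ROWS = 20_000

with tempfile.TemporaryDirectory() as tmp:
    conn = sqlite3.connect(os.path.join(tmp, "index_size.db"))
    conn.execute("CREATE TABLE cars (make_id INTEGER, make_name TEXT)")
    conn.executemany(
        "INSERT INTO cars VALUES (?, ?)",
        [(i, f"manufacturer-name-{i:012d}") for i in range(ROWS)],
    )
    conn.commit()

    def pages():
        """Total number of pages currently in the database file."""
        return conn.execute("PRAGMA page_count").fetchone()[0]

    base = pages()
    conn.execute("CREATE INDEX idx_make_id ON cars (make_id)")
    conn.commit()
    int_index_pages = pages() - base      # pages added by the integer index

    base = pages()
    conn.execute("CREATE INDEX idx_make_name ON cars (make_name)")
    conn.commit()
    text_index_pages = pages() - base     # pages added by the text index

    page_size = conn.execute("PRAGMA page_size").fetchone()[0]
    conn.close()

print("int index :", int_index_pages * page_size, "bytes")
print("text index:", text_index_pages * page_size, "bytes")
```

Longer strings widen the gap, which matches the point that the longer the varchar, the more pages an index lookup may have to read.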

Answer 6

Answer by anthares

In general, the int will be faster. The longer the varchar, the slower it gets.

Answer 7

Answer by Thomas Schaub

Hint: If the possible values for the field make will never (or rarely) change, you can use ENUM as a compromise. It combines good speed with good readability.

Answer 8

Answer by Sarfraz

If you turn on indexing on either of the fields, it will be faster. As for your question, I think int is faster than varchar.

Answer 9

Answer by Alex

It's somewhat relative. Yes, INTs will be faster, but the question is whether the difference is noticeable in your situation. Are the VARCHARs just short words or longer texts, and how many rows are in the table? If there are just a few rows, the table will most likely be buffered entirely in memory (when requested often), in which case you won't notice much difference. Then of course there is indexing, which becomes more important as the table grows. Using SSDs might be faster than HDs with optimized queries, and good disk controllers sometimes speed up queries by more than 10x. This might leave room for just using VARCHARs, which makes reading and writing queries easier (no need to write complex joins) and speeds up development. Purists, however, will disagree and always normalize everything.
