Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverFlow. Original URL: http://stackoverflow.com/questions/5446124/

Date: 2020-08-31 19:18:36  Source: igfitidea

MySQL - why not index every field?

mysql, sql, indexing

Asked by Vael Victus

Recently I've learned the wonder of indexes, and performance has improved dramatically. However, with all I've learned, I can't seem to find the answer to this question.

Indexes are great, but why couldn't someone just index all fields to make the table incredibly fast? I'm sure there's a good reason to not do this, but how about three fields in a thirty-field table? 10 in a 30 field? Where should one draw the line, and why?

Answer by Brian Roach

Indexes take up space in memory (RAM). With too many indexes, or indexes that are too large, the DB will have to swap them to and from disk. They also increase insert, update, and delete time (each index must be updated for every piece of data inserted, deleted, or updated).

You don't have infinite memory. Making it so all indexes fit in RAM = good.

You don't have infinite time. Indexing only the columns you need indexed minimizes the insert/delete/update performance hit.
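
To make the space cost concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` module rather than MySQL (an assumption made only so the example runs without a server), and the `player` table and its columns are hypothetical. The absolute numbers will differ on MySQL, but the direction should be the same: every extra index is extra storage.

```python
import sqlite3


def page_count(index_every_column: bool, rows: int = 5000) -> int:
    """Return how many pages an in-memory SQLite database occupies."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical table; the column names are illustrative only.
    conn.execute(
        "CREATE TABLE player ("
        "id INTEGER PRIMARY KEY, name TEXT, email TEXT, score INTEGER, city TEXT)"
    )
    if index_every_column:
        for col in ("name", "email", "score", "city"):
            conn.execute(f"CREATE INDEX idx_{col} ON player ({col})")
    conn.executemany(
        "INSERT INTO player (name, email, score, city) VALUES (?, ?, ?, ?)",
        (
            (f"name{i}", f"user{i}@example.com", i % 100, f"city{i % 50}")
            for i in range(rows)
        ),
    )
    conn.commit()
    # PRAGMA page_count reports the size of the database in pages.
    pages = conn.execute("PRAGMA page_count").fetchone()[0]
    conn.close()
    return pages


plain = page_count(index_every_column=False)
indexed = page_count(index_every_column=True)
print(f"pages without extra indexes: {plain}")
print(f"pages with every column indexed: {indexed}")
```

The same rows occupy more pages once every column carries its own index, and that is the memory that has to fit in RAM.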

Answer by AndyMac

Keep in mind that every index must be updated any time a row is updated, inserted, or deleted. So the more indexes you have, the slower performance you'll have for write operations.

Also, every index takes up additional disk space and memory (when it is used), so it could potentially slow down read operations as well (for large tables).
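
A rough way to see the write penalty is to time the same bulk insert with and without an index on every column. This is only a sketch under assumptions: it uses Python's built-in `sqlite3` instead of MySQL, the table `t` and its columns are hypothetical, and the timings depend entirely on the machine, so no particular numbers are claimed.

```python
import sqlite3
import time

COLUMNS = ("a", "b", "c", "d", "e", "f")  # hypothetical column names


def timed_insert(index_every_column: bool, rows: int = 20000):
    """Insert `rows` rows and return (elapsed_seconds, rows_stored)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE t ({', '.join(c + ' INTEGER' for c in COLUMNS)})")
    if index_every_column:
        for c in COLUMNS:
            conn.execute(f"CREATE INDEX idx_{c} ON t ({c})")
    # Scattered values, so index entries do not arrive in sorted order.
    data = [
        tuple((i * 7919 + k * 104729) % 100003 for k in range(len(COLUMNS)))
        for i in range(rows)
    ]
    placeholders = ", ".join("?" * len(COLUMNS))
    start = time.perf_counter()
    conn.executemany(f"INSERT INTO t VALUES ({placeholders})", data)
    conn.commit()
    elapsed = time.perf_counter() - start
    stored = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return elapsed, stored


plain_time, plain_rows = timed_insert(index_every_column=False)
indexed_time, indexed_rows = timed_insert(index_every_column=True)
print(f"no indexes:      {plain_time:.4f}s for {plain_rows} rows")
print(f"all six indexed: {indexed_time:.4f}s for {indexed_rows} rows")
```

Both runs store the same rows; the indexed run additionally has to maintain six index structures on every insert, which is where the extra time usually goes.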

Answer by Smandoli

You have to balance CRUD needs. Writing to tables becomes slow. As for where to draw the line, that depends on how the data is being accessed (sorting, filtering, etc.).
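
One way to check how the data is actually accessed is to ask the database for its query plan. The sketch below uses Python's built-in `sqlite3` and its `EXPLAIN QUERY PLAN` statement as a stand-in (MySQL has its own `EXPLAIN`, with a different output format); the `orders` table and the index name are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: only customer_id gets a secondary index.
conn.execute(
    "CREATE TABLE orders ("
    "id INTEGER PRIMARY KEY, customer_id INTEGER, status TEXT, note TEXT)"
)
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")


def plan(sql: str) -> str:
    """Return SQLite's human-readable plan for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the plan description.
    return "; ".join(row[-1] for row in rows)


filtered_on_indexed = plan("SELECT * FROM orders WHERE customer_id = 42")
filtered_on_unindexed = plan("SELECT * FROM orders WHERE note = 'rush'")
print(filtered_on_indexed)
print(filtered_on_unindexed)
```

The filter on `customer_id` can use the index, while the filter on `note` falls back to scanning the table. Columns that never show up in filters, joins, or sorts get no such benefit and only add write cost.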

Answer by Lionel Jerinho

Indexing takes up more allocated space, both on the drive and in RAM, but it also improves performance a lot. Unfortunately, when it reaches the memory limit, the system falls back on drive space and performance is put at risk. In practice, you shouldn't index any field that you think is not involved in any kind of data-traversing algorithm, neither inserting nor searching (WHERE clause); otherwise, you should. By default you have to index all fields. The fields you should consider un-indexing are those used only by moderators' queries, unless they need speed too.

Answer by Rachid Sakara

It is not a good idea to index all the columns in a table. While this will make the table very fast to read from, it also becomes much slower to write to. Writing to a table that has every column indexed would involve putting the new record in that table and then putting each column's information in its own index table.

Answer by Mohammed Housseyn Taleb

This answer is my personal opinion; I'm using my own mathematical logic to answer.

The second question was about where to draw the line. First, let's do some math. Suppose we have N rows with L fields in a table. If we index all the fields, we get L new index tables, each one sorting the data of its indexed field in a meaningful way. At first glance, if your table has a weight of W, it will become W*2 (1 tera becomes 2 tera). If you have 100 big tables (I have already worked on a project with around 1800 tables), you will waste 100 times this space (100 tera), which is far from wise.

If we apply indexes everywhere, we have to think about index updates, where one update triggers an update of every index; in terms of time, this is equivalent to an unordered select over everything.

From this I conclude that, in this scenario, if you have to lose this time, it is preferable to lose it in a select rather than in an update, because selecting on a field that is not indexed does not trigger another select on all the other fields that are not indexed.

What to index?

foreign-keys : is a must based on

primary-key : I'm not yet sure about it; maybe someone reading this could help on this point.

other fields : the first natural answer is half of the remaining fields. Why: if you should have indexed more, you are not far from the best answer, and if you should have indexed less, you are not far either, because we know that having no indexes is bad and indexing everything is also bad.

From these 3 points I can conclude that if we have L fields, K of which are keys, the limit should be somewhere near ((L-K)/2)+K, more or less L/10.
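
As a worked example of this rule of thumb, the sketch below plugs in the question's thirty-field table and assumes, purely for illustration, that 3 of those fields are keys. The function name and parameters are hypothetical, and the formula is this answer's personal heuristic, not an established rule.

```python
def suggested_index_range(total_fields: int, key_fields: int):
    """Apply the answer's heuristic: ((L - K) / 2) + K, give or take L / 10."""
    midpoint = (total_fields - key_fields) / 2 + key_fields
    margin = total_fields / 10
    return midpoint - margin, midpoint + margin


# L = 30 fields, K = 3 keys (an assumed value): midpoint 16.5, margin 3.0
low, high = suggested_index_range(total_fields=30, key_fields=3)
print(low, high)  # 13.5 19.5
```

By this heuristic, a 30-field table with 3 keys would end up with roughly 14 to 19 indexed fields.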

This answer is based on my logic and personal practices.
