SQL 决定何时在数据库中的表列上创建索引?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/7744038/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 12:27:24  来源:igfitidea点击:

Decision when to create Index on table column in database?

sqloracleindexing

提问by M Sach

I am not db guy. But I need to create tables and do CRUD operations on them. I get confused should I create the index on all columns by default or not? Here is my understanding which I consider while creating index.

我不是数据库人。但是我需要创建表并对它们进行 CRUD 操作。我感到困惑是否应该默认在所有列上创建索引?这是我在创建索引时考虑的理解。

Index basically contains the memory location range ( starting memory location where first value is stored to end memory location where last value is stored). So when we insert any value in table index for column needs to be updated as it has got one more value but update of column value wont have any impact on index value. Right?So bottom line is when my column is used in join between two tables we should consider creating index on column used in join but all other columns can be skipped because if we create index on them it will involve extra cost of updating index value when new value is inserted in column.Right?

索引基本上包含内存位置范围(存储第一个值的起始内存位置到存储最后一个值的结束内存位置)。因此,当我们在表中插入任何值时,列的索引需要更新,因为它又多了一个值,但列值的更新不会对索引值产生任何影响。对?所以底线是当我的列用于两个表之间的连接时,我们应该考虑在连接中使用的列上创建索引,但可以跳过所有其他列,因为如果我们在它们上创建索引,它将涉及在新值时更新索引值的额外成本插入到列中。对?

Consider this scenario where table mytablecontains two three columns i.e col1,col2,col3. Now we fire this query

考虑这种情况,其中 tablemytable包含两个三列,即col1, col2, col3。现在我们触发这个查询

select col1,col2 from mytable

Now there are two cases here. In first case we create the index on col1and col2. In second case we don't create any index.** As per my understanding case 1 will be faster than case2 because in case 1 we oracle can quickly find column memory location. So here I have not used any join columns but still index is helping here. So should I consider creating index here or not?**

现在这里有两种情况。在第一种情况下,我们在col1和上创建索引col2。在第二种情况下,我们不创建任何索引。** 根据我的理解,情况 1 将比情况 2 快,因为在情况 1 中,我们 oracle 可以快速找到列内存位置。所以在这里我没有使用任何连接列,但索引在这里仍然有帮助。那么我是否应该考虑在此处创建索引?**

What if in the same scenario above if we fire

如果在上述相同的情况下我们开火怎么办

select * from mytable

instead of

代替

select col1,col2 from mytable

Will index help here?

索引在这里有帮助吗?

采纳答案by Ian Carpenter

but update of column value wont have any impact on index value. Right?

但列值的更新不会对索引值产生任何影响。对?

No. Updating an indexed column will have an impact. The Oracle 11g performance manualstates that:

不会。更新索引列会产生影响。Oracle 11g性能手册指出:

UPDATE statements that modify indexed columns and INSERT and DELETE statements that modify indexed tables take longer than if there were no index. Such SQL statements must modify data in indexes and data in tables. They also create additional undo and redo.

修改索引列的 UPDATE 语句以及修改索引表的 INSERT 和 DELETE 语句比没有索引时花费的时间更长。此类 SQL 语句必须修改索引中的数据和表中的数据。它们还创建了额外的撤消和重做。



So bottom line is when my column is used in join between two tables we should consider creating index on column used in join but all other columns can be skipped because if we create index on them it will involve extra cost of updating index value when new value is inserted in column. Right?

所以底线是当我的列用于两个表之间的连接时,我们应该考虑在连接中使用的列上创建索引,但可以跳过所有其他列,因为如果我们在它们上创建索引,它将涉及在新值时更新索引值的额外成本插入到列中。对?

Not just Inserts but any other Data Manipulation Language statement.

不仅是插入,还有任何其他数据操作语言语句。

Consider this scenario . . . Will index help here?

考虑这种情况。. . 索引在这里有帮助吗?

With regards to this last paragraph, why not build some test cases with representative data volumes so that you prove or disprove your assumptions about which columns you should index?

关于最后一段,为什么不构建一些具有代表性数据量的测试用例,以便证明或反驳关于应该索引哪些列的假设?

回答by santiagobasulto

Don't create Indexes in every column! It will slow things down on insert/delete/update operations.

不要在每一列中创建索引!它会减慢插入/删除/更新操作的速度。

As a simple reminder, you can create an index in columns that are common in WHERE, ORDER BYand GROUP BYclauses. You may consider adding an index in colums that are used to relate other tables (through a JOIN, for example)

简单提醒一下,您可以在WHERE,ORDER BYGROUP BY子句中常见的列中创建索引。您可以考虑在用于关联其他表的列中添加索引(JOIN例如,通过 a )

Example:

例子:

SELECT col1,col2,col3 FROM my_table WHERE col2=1

Here, creating an index on col2 would help this query a lot.

在这里,在 col2 上创建索引对这个查询有很大帮助。

Also, consider index selectivity. Simply put, create index on values that has a "big domain", i.e. Ids, names, etc. Don't create them on Male/Female columns.

另外,请考虑索引选择性。简单地说,在具有“大域”的值上创建索引,即 ID、名称等。不要在男性/女性列上创建它们。

回答by Cade Roux

In the specific scenario you give, there is no WHERE clause, so a table scan is going to be used or the index scan will be used, but you're only dropping one column, so the performance might not be that different. In the second scenario, the index shouldn't be used, since it isn't covering and there is no WHERE clause. If there were a WHERE clause, the index could allow the filtering to reduce the number of rows which need to be looked up to get the missing column.

在您给出的特定场景中,没有 WHERE 子句,因此将使用表扫描或索引扫描,但您只删除一列,因此性能可能没有太大不同。在第二种情况下,不应使用索引,因为它没有覆盖并且没有 WHERE 子句。如果有 WHERE 子句,索引可以允许过滤减少需要查找以获取丢失列的行数。

Oracle has a number of different tables, including heap or index organized tables.

Oracle 有许多不同的表,包括堆或索引组织的表。

If an index is covering, it is more likely to be used, especially when selective. But note that an index organized table is not better than a covering index on a heap when there are constraints in the WHERE clause and far fewer columns in the covering index than in the base table.

如果索引覆盖,则更有可能使用它,尤其是在选择性时。但请注意,当 WHERE 子句中存在约束且覆盖索引中的列比基表中的列少得多时,索引组织表并不比堆上的覆盖索引好。

Creating indexes with more columns than are actually used only helps if they are more likely to make the index covering, but adding all the columns would be similar to an index organized table. Note that Oracle does not have the equivalent of SQL Server's INCLUDE (COLUMN) which can be used to make indexes more covering (it's effectively making an additional clustered index of only a subset of the columns - useful if you want an index to be unique but also add some data which you don't want to be considered in the uniqueness but helps to make it covering for more queries)

创建包含比实际使用更多列的索引仅在它们更有可能覆盖索引时才有帮助,但添加所有列将类似于索引组织表。请注意,Oracle 没有与 SQL Server 的 INCLUDE (COLUMN) 等效的功能,它可用于使索引更具覆盖性(它有效地创建了仅列子集的附加聚集索引 - 如果您希望索引唯一但很有用还添加一些您不想在唯一性中考虑但有助于使其覆盖更多查询的数据)

You need to look at your plans and then determine if indexes will help things. And then look at the plans afterwards to see if they made a difference.

您需要查看您的计划,然后确定索引是否有帮助。然后再查看计划,看看它们是否有所作为。