oracle 你怎么知道什么是好的索引?
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/79241/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How do you know what a good index is?
提问by Frater
When working with tables in Oracle, how do you know when you are setting up a good index versus a bad index?
在 Oracle 中处理表时,您如何知道何时设置好的索引和坏的索引?
回答by Frater
This depends on what you mean by 'good' and 'bad'. Basically you need to realise that every index you add will increase performance on any search by that column (so adding an index to the 'lastname' column of a person table will increase performance on queries that have "where lastname = " in them) but decrease write performance across the whole table.
这取决于您所说的“好”和“坏”是什么意思。基本上,您需要意识到您添加的每个索引都会提高对该列的任何搜索的性能(因此,向人员表的“姓氏”列添加索引将提高具有“where lastname =”的查询的性能)但是降低整个表的写入性能。
The reason for this is when you add or update a row, it must add-to or update both the table itself and every index that row is a member of. So if you have five indexes on a table, each addition must write to six places - five indexes and the table - and an update may be touching up to six places in the worst case.
这样做的原因是当您添加或更新一行时,它必须添加或更新表本身以及该行所属的每个索引。因此,如果您在一个表上有五个索引,则每次添加都必须写入六个位置 - 五个索引和表 - 在最坏的情况下,更新可能会触及多达六个位置。
Index creation is a balancing act then between query speed and write speed. In some cases, such as a datamart that is only loaded with data once a week in an overnight job but queried thousands of times daily, it makes a great deal of sense to overload with indexes and speed the queries up as much as possible. In the case of online transaction processing systems however, you want to try and find a balance between them.
索引创建是查询速度和写入速度之间的平衡行为。在某些情况下,例如在夜间作业中每周只加载一次数据但每天查询数千次的数据集市,使用索引过载并尽可能加快查询速度非常有意义。然而,在在线交易处理系统的情况下,您想尝试在它们之间找到平衡。
So in short, add indexes to columns that are used a lot in select queries, but try to avoid adding too many and so add the most-used columns first.
所以简而言之,为选择查询中经常使用的列添加索引,但尽量避免添加太多,因此首先添加最常用的列。
After that its a matter of load testing to see how the performance reacts under production conditions, and a lot of tweaking to find an aceeptable balance.
之后就是负载测试,以查看性能在生产条件下的反应,并进行大量调整以找到可接受的平衡。
回答by Frater
Fields that are diverse, highly specific, or unique make good indexes. Such as dates and timestamps, unique incrementing numbers (commonly used as primary keys), person's names, license plate numbers, etc...
多样化、高度特定或独特的字段是很好的索引。如日期和时间戳、唯一递增编号(通常用作主键)、人名、车牌号等...
A counterexample would be gender - there are only two common values, so the index doesn't really help reduce the number of rows that must be scanned.
反例是性别——只有两个共同值,所以索引并不能真正帮助减少必须扫描的行数。
Full-length descriptive free-form strings make poor indexes, as whoever is performing the query rarely knows the exact value of the string.
全长描述性自由格式字符串的索引很差,因为执行查询的人很少知道字符串的确切值。
Linearly-ordered data (such as timestamps or dates) are commonly used as a clustered index, which forces the rows to be stored in index order, and allows in-order access, greatly speeding range queries (e.g. 'give me all the sales orders between October and December'). In such a case the DB engine can simply seek to the first record specified by the range and start reading sequentially until it hits the last one.
线性排序的数据(例如时间戳或日期)通常用作聚簇索引,它强制将行按索引顺序存储,并允许按顺序访问,大大加快范围查询(例如'给我所有销售订单10 月到 12 月之间')。在这种情况下,DB 引擎可以简单地查找范围指定的第一条记录并开始顺序读取,直到找到最后一条记录。
回答by SquareCog
@Infamous Cow -- you must be thinking of primary keys, not indexes.
@臭名昭著的牛——你一定在考虑主键,而不是索引。
@Xenph Yan -- Something others have not touched on is choosing what kindof index to create. Some databases don't really give you much of a choice, but some have a large variety of possible indexes. B-trees are the defaultbut not always the best kind of index. Choosing the right structure depends on the kind of usage you expect to have. What kind of queries do you need to support most? Are you in a read-mostly or write-mostly environment? Are your writes dominated by updates or appends? Etc, etc.
@Xenph Yan——其他人没有涉及的事情是选择要创建的索引类型。有些数据库并没有真正给您太多选择,但有些数据库有多种可能的索引。B 树是默认的,但并不总是最好的索引。选择正确的结构取决于您期望的使用类型。您最需要支持哪些类型的查询?您处于以读取为主还是以写入为主的环境中?您的写入是由更新还是追加主导?等等等等。
A description of the different types of indexes and their pros and cons is available here: http://20bits.com/2008/05/13/interview-questions-database-indexes/.
此处提供了对不同类型索引及其优缺点的描述:http: //20bits.com/2008/05/13/interview-questions-database-indexes/。
回答by Matt Rogish
Here's a great SQL Server article: http://www.sql-server-performance.com/tips/optimizing_indexes_general_p1.aspx
这是一篇很棒的 SQL Server 文章:http: //www.sql-server-performance.com/tips/optimizing_indexes_general_p1.aspx
Although the mechanics won't work on Oracle, the tips are very apropos (minus the thing on clustered indexes, which don't quite work the same way in Oracle).
尽管该机制在 Oracle 上不起作用,但这些提示非常恰当(减去有关聚集索引的事情,这在 Oracle 中的工作方式并不完全相同)。
回答by WW.
Some rules of thumb if you are trying to improve a particular query.
如果您正在尝试改进特定查询,则有一些经验法则。
For a particular table (where you think Oracle should start) try indexing each of the columns used in the WHERE clause. Put columns with equality first, followed by columns with a range or like.
对于特定的表(您认为 Oracle 应该从该表开始)尝试索引 WHERE 子句中使用的每个列。首先放置相等的列,然后是具有范围或类似值的列。
For example:
例如:
WHERE CompanyCode = ? AND Amount BETWEEN 100 AND 200
If columns are very large in size (e.g. you are storing some XML or something) you may be better off leaving them out of the index. This will make the index smaller to scan, assuming you have to go to the table row to satisfy the select list anyway.
如果列的大小非常大(例如,您正在存储一些 XML 或其他内容),最好将它们排除在索引之外。这将使要扫描的索引变小,假设您无论如何都必须转到表行以满足选择列表。
Alternatively, if all the values in the SELECT and WHERE clauses are in the index Oracle will not need to access the table row. So sometimes it is a good idea to put the selected values last in the index and avoid a table access all together.
或者,如果 SELECT 和 WHERE 子句中的所有值都在索引中,Oracle 将不需要访问表行。因此,有时将选定的值放在索引的最后并避免一起访问表是个好主意。
You could write a book about the best ways to index - look for author Jonathan Lewis.
您可以写一本关于最佳索引方法的书 - 寻找作者 Jonathan Lewis。
回答by Christopher Dolan
A good index is something that you can rely on to be unique for a specific table row.
一个好的索引是你可以依赖的,对于特定的表行是唯一的。
One commonly used index scheme is the use of numbers which increment by 1 for each row in the table. Every row will end up having a different number index.
一种常用的索引方案是使用对表中的每一行递增 1 的数字。每一行最终都会有一个不同的数字索引。