postgresql: Does a unique index give better column search performance? (PGSQL & MySQL)

Question

Asked by Alex Balashov

I am curious as to whether

CREATE INDEX idx ON tbl (columns);

vs.

CREATE UNIQUE INDEX idx ON tbl (columns);

has a significant algorithmic performance benefit in PostgreSQL or MySQL implementations when scanning the indexed column(s), or whether the UNIQUE keyword simply introduces a unique constraint alongside the index.

I imagine it is probably fair to say that there is a marginal benefit, insofar as indexes are likely to be internally implemented as some sort of hash¹-like structure, and collision handling by definition results in something other than O(1) performance. Given this premise, it is likely that if a large percentage of values are identical, then the structure degenerates into something linear.

So, for purposes of my question, assume that the distribution of values is relatively discrete and uniform.

Thanks in advance!

¹ Which is a matter of pure speculation for me, as I am not familiar with RDBMS internals.

Answer 1

Accepted answer by Quassnoi

If your data are unique, you should create a UNIQUE index on them.

This implies no additional overhead and affects the optimizer's decisions in certain cases, so that it can choose a better algorithm.

In SQL Server and in PostgreSQL, for instance, if you sort on a UNIQUE key, the optimizer ignores any ORDER BY expressions listed after it (since they are irrelevant), i.e. this query:

SELECT  *
FROM    mytable
ORDER BY
        col_unique, other_col
LIMIT 10

will use an index on col_unique and won't sort on other_col because it's useless.
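One way to check this on a given server is to look at the plan. The sketch below uses PostgreSQL syntax. The column types, the index name idx_mytable_col_unique and the sample data are assumptions made up for illustration; only mytable, col_unique and other_col come from the query above. Whether a separate sort step for other_col shows up may depend on the engine and version, so the EXPLAIN output on your own system is the authority here.

```sql
-- Run in a scratch database: this drops and recreates a table named mytable.
DROP TABLE IF EXISTS mytable;

-- Hypothetical schema. NOT NULL matters: a unique index still admits
-- several NULLs, so only a NOT NULL column is guaranteed free of repeats.
CREATE TABLE mytable (
    col_unique integer NOT NULL,
    other_col  integer
);
CREATE UNIQUE INDEX idx_mytable_col_unique ON mytable (col_unique);

-- Sample rows (generate_series is PostgreSQL-specific).
INSERT INTO mytable (col_unique, other_col)
SELECT g, g % 100
FROM   generate_series(1, 100000) AS g;
ANALYZE mytable;

-- Inspect the plan: is there a sort node above the index scan or not?
EXPLAIN
SELECT  *
FROM    mytable
ORDER BY
        col_unique, other_col
LIMIT 10;
```

Recreating the index without UNIQUE and running the same EXPLAIN again gives a side-by-side comparison.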

This query:

SELECT  *
FROM    mytable
WHERE   mycol IN
        (
        SELECT  othercol
        FROM    othertable
        )

will also be converted into an INNER JOIN (as opposed to a SEMI JOIN) if there is a UNIQUE index on othertable.othercol.
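The reasoning is that with unique values in othertable.othercol, each row of mytable can match at most one row of othertable, so a plain join cannot duplicate outer rows. A minimal sketch for checking the plan in PostgreSQL follows; the column types, the index name idx_othertable_othercol and the sample data are assumptions, and the join type actually chosen may vary by version.

```sql
-- Run in a scratch database: this drops and recreates both tables.
DROP TABLE IF EXISTS mytable;
DROP TABLE IF EXISTS othertable;

-- Hypothetical schema built around the names used in the query above.
CREATE TABLE mytable    (mycol    integer);
CREATE TABLE othertable (othercol integer NOT NULL);
CREATE UNIQUE INDEX idx_othertable_othercol ON othertable (othercol);

-- Sample rows (generate_series is PostgreSQL-specific).
INSERT INTO mytable    SELECT g % 1000 FROM generate_series(1, 100000) AS g;
INSERT INTO othertable SELECT g        FROM generate_series(1, 500)    AS g;
ANALYZE mytable;
ANALYZE othertable;

-- Look at the join node in the output and note whether it is labelled
-- as a semi join or as an ordinary join.
EXPLAIN
SELECT  *
FROM    mytable
WHERE   mycol IN
        (
        SELECT  othercol
        FROM    othertable
        );
```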

An index always contains some kind of a pointer to the row (ctid in PostgreSQL, row pointer in MyISAM, primary key/uniquifier in InnoDB) and the leaves are ordered on these pointers, so in fact every index leaf is unique in some way (though it may not be obvious).
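In PostgreSQL the row pointer can be seen directly, since ctid is exposed as a system column. The sketch below uses a hypothetical table ctid_demo and index idx_ctid_demo_val; it only shows that rows with the same indexed value still have distinct pointers, not how the index stores them internally.

```sql
-- Run in a scratch database.
DROP TABLE IF EXISTS ctid_demo;

CREATE TABLE ctid_demo (val integer);
CREATE INDEX idx_ctid_demo_val ON ctid_demo (val);  -- deliberately not UNIQUE

-- Three rows sharing one indexed value.
INSERT INTO ctid_demo (val) VALUES (7), (7), (7);

-- Each row is listed with a different ctid, so the pair (val, ctid)
-- is distinct even though val alone is not.
SELECT ctid, val
FROM   ctid_demo
WHERE  val = 7;
```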

See this article in my blog for performance details:

Making an index UNIQUE

Answer 2

Answered by Eric

There is a small penalty during update/insert operations for having the unique constraint. It has to search before the insert/update operation to make sure the uniqueness constraint isn't violated.
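A minimal sketch of that check in action, using a hypothetical table uniq_demo and index idx_uniq_demo_val (neither is from the question). The statements should behave the same way in PostgreSQL and MySQL, though the wording of the error differs between them.

```sql
-- Run in a scratch database.
DROP TABLE IF EXISTS uniq_demo;

CREATE TABLE uniq_demo (val integer NOT NULL);
CREATE UNIQUE INDEX idx_uniq_demo_val ON uniq_demo (val);

-- The first insert finds no existing entry for 1 and succeeds.
INSERT INTO uniq_demo (val) VALUES (1);

-- The second insert has to look up 1 in the index first, finds it,
-- and is rejected with a duplicate-key error.
INSERT INTO uniq_demo (val) VALUES (1);
```

With a plain (non-unique) index on the same column, the second insert would simply be accepted.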

Answer 3

Answered by Eric

Well, usually indexes are B-Trees, not hashes (there are hash-based indexes, but the most common index, at least in PostgreSQL, is based on a B-Tree).

As for speed: unique should be faster. When an index scan finds a row with the given value, it doesn't have to check whether there are any other rows with this value, and can finish scanning immediately.
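To see whether this makes a measurable difference on a particular setup, the two index types can be compared on identical data. This is a rough PostgreSQL sketch; the table names lookup_unique and lookup_plain, the index names, the row count and the looked-up value are all assumptions chosen for illustration. On a single-row lookup the difference may well be too small to show up in the timings.

```sql
-- Run in a scratch database.
DROP TABLE IF EXISTS lookup_unique;
DROP TABLE IF EXISTS lookup_plain;

CREATE TABLE lookup_unique (val integer NOT NULL);
CREATE TABLE lookup_plain  (val integer NOT NULL);

-- Identical, duplicate-free data in both tables (PostgreSQL-specific).
INSERT INTO lookup_unique SELECT g FROM generate_series(1, 100000) AS g;
INSERT INTO lookup_plain  SELECT g FROM generate_series(1, 100000) AS g;

CREATE UNIQUE INDEX idx_lookup_unique_val ON lookup_unique (val);
CREATE INDEX        idx_lookup_plain_val  ON lookup_plain  (val);
ANALYZE lookup_unique;
ANALYZE lookup_plain;

-- Compare the plans, buffer counts and timings of the same point lookup.
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM lookup_unique WHERE val = 4242;
EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM lookup_plain  WHERE val = 4242;
```

Running each EXPLAIN a few times helps separate cache warm-up from any real difference.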
