SQL: How to create an index in Amazon Redshift

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/32088161/

Time: 2020-09-01 03:57:40  Source: igfitidea

How to create an Index in Amazon Redshift

Tags: sql, amazon-web-services, indexing, amazon-redshift

Asked by user3600910

I'm trying to create indexes in Amazon Redshift, but I receive an error:

create index on session_log(UserId);

UserId is an integer field.

Answered by StuartLC

If you try to create an index (with a name) on a Redshift table:

create index IX1 on "SomeTable"("UserId");

You'll receive the error:

An error occurred when executing the SQL command: create index IX1 on "SomeTable"("UserId") ERROR: SQL command "create index IX1 on "SomeTable"("UserId")" not supported on Redshift tables.

This is because, like other data warehouses, Redshift uses columnar storage, and as a result, many of the indexing techniques (like adding non-clustered indexes) used in other RDBMS aren't applicable.

You do, however, have the option of providing a single sort key per table (it may cover more than one column). You can also influence performance with a distribution key for sharding your data, and by selecting appropriate compression encodings for each column to minimize storage and I/O overheads.

For example, in your case, you may elect to use UserId as a sort key:

create table if not exists "SomeTable"
(
    "UserId" int,
    "Name" text
)
sortkey("UserId");
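The distribution key and compression encodings mentioned above can be declared in the same statement. The following is only a sketch: the table name "SomeTableTuned" is hypothetical, and the encodings (raw, lzo) are illustrative choices rather than recommendations for your data.

```sql
-- Hypothetical table: same columns as the example above, with a
-- distribution key and per-column compression encodings added.
create table if not exists "SomeTableTuned"
(
    "UserId" int encode raw,          -- sort key column left uncompressed (illustrative choice)
    "Name"   varchar(256) encode lzo  -- illustrative encoding for a text column
)
diststyle key
distkey("UserId")   -- rows with the same UserId are stored on the same slice
sortkey("UserId");  -- rows are stored on disk ordered by UserId

-- A range filter on the sort key lets Redshift skip blocks whose
-- min/max values fall outside the requested range.
select "UserId", "Name"
from "SomeTableTuned"
where "UserId" between 1000 and 2000;
```

Whether UserId is a good distribution key depends on how evenly its values are spread and on which columns you join on, so treat the choice here as an assumption to verify against your own workload.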

You might want to read a few primers like these.

Answered by Ricardo Edo

You can define constraints, but they are informational only; as Amazon says, they are not enforced by Amazon Redshift. Nonetheless, primary keys and foreign keys are used as planning hints, and they should be declared if your ETL process or some other process in your application enforces their integrity.
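As a rough illustration of declaring such informational constraints, here is a minimal sketch. The table and column names (customers, orders, customer_id, order_id) are hypothetical and do not come from the question.

```sql
-- Hypothetical tables. Redshift records these constraints and may use them
-- as planning hints, but it does not enforce them.
create table customers (
    customer_id int not null,
    name        varchar(100),
    primary key (customer_id)
);

create table orders (
    order_id    int not null,
    customer_id int not null,
    primary key (order_id),
    foreign key (customer_id) references customers (customer_id)
);
```

Because nothing rejects a violating row, the load process has to guarantee that customer_id values are unique in customers and that every orders.customer_id actually exists there.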

Some services, such as pipelines with insert mode (REPLACE_EXISTING), need a primary key defined in your table.

For other performance purposes, Stuart's response is correct.

Answered by user3600910

Redshift allows you to create a primary key:

-- "user" is a reserved word in Redshift, so the table is named users here
create table users (
    id int,
    phone_number int,
    primary key (id)
);

But since Redshift does not enforce this constraint, the primary key column accepts duplicate values.
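To make the behaviour concrete, the sketch below inserts two rows with the same key and then looks for duplicates. The table name users_demo and the sample values are made up for illustration.

```sql
-- Hypothetical demo table with a declared (but unenforced) primary key.
create table users_demo (
    id           int,
    phone_number int,
    primary key (id)
);

-- Both rows are accepted even though they share the same id.
insert into users_demo values (1, 5550100), (1, 5550101);

-- Since Redshift will not reject duplicates, check for them yourself,
-- for example after each load.
select id, count(*) as copies
from users_demo
group by id
having count(*) > 1;
```

A check like this is one way to verify the uniqueness that the declared key promises, since the planner may rely on that declaration.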

See the attached article on this issue:

http://www.sqlhaven.com/amazon-redshift-what-you-need-to-think-before-defining-primary-key/
