How to create an index in Amazon Redshift
Note: this page is a Chinese-English translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original: http://stackoverflow.com/questions/32088161/
Asked by user3600910
I'm trying to create indexes in Amazon Redshift, but I received an error:
create index on session_log(UserId);
UserId is an integer field.
Answered by StuartLC
If you try to create an index (with a name) on a Redshift table:
create index IX1 on "SomeTable"("UserId");
You'll receive the error:
An error occurred when executing the SQL command: create index IX1 on "SomeTable"("UserId") ERROR: SQL command "create index IX1 on "SomeTable"("UserId")" not supported on Redshift tables.
This is because, like other data warehouses, Redshift uses columnar storage, and as a result, many of the indexing techniques used in other RDBMSs (like adding non-clustered indexes) aren't applicable.
You do, however, have the option of providing a single sort key per table. You can also influence performance with a distribution key for sharding your data, and by selecting appropriate compression encodings for each column to minimize storage and I/O overheads.
For example, in your case, you may elect to use UserId as a sort key:
create table if not exists "SomeTable"
(
"UserId" int,
"Name" text
)
sortkey("UserId");
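The distribution key and compression encodings mentioned above can be declared in the same statement. The following is only a sketch: the table name "SomeTableTuned" is hypothetical, and the specific encodings are illustrative assumptions rather than recommendations for your data.

```sql
-- Hypothetical variant of the table above, combining all three options.
create table if not exists "SomeTableTuned"
(
    "UserId" int encode raw,           -- sort key column left uncompressed (assumed choice)
    "Name" varchar(256) encode lzo     -- illustrative encoding for a text column
)
diststyle key
distkey("UserId")                      -- rows with the same UserId land on the same slice
sortkey("UserId");                     -- rows are stored on disk ordered by UserId
```

Whether UserId is a good distribution key depends on how evenly its values are spread and on which columns you join on, so treat this as a starting point to test rather than a fixed rule.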
Answered by Ricardo Edo
You can define constraints, but they will be informational only. As Amazon says: they are not enforced by Amazon Redshift. Nonetheless, primary keys and foreign keys are used as planning hints and they should be declared if your ETL process or some other process in your application enforces their integrity.
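As a rough illustration of declaring such informational constraints, here is a minimal sketch. The table and column names (users_dim, session_facts, and so on) are hypothetical and not part of the original question.

```sql
-- Hypothetical tables; Redshift records these constraints but does not enforce them.
create table users_dim
(
    user_id int not null,
    name varchar(256),
    primary key (user_id)
);

create table session_facts
(
    session_id bigint not null,
    user_id int not null,
    primary key (session_id),
    foreign key (user_id) references users_dim (user_id)
);
```

Because nothing checks these declarations at load time, they are only safe to declare if your ETL process actually guarantees uniqueness and referential integrity.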
Some services, such as pipelines with insert mode (REPLACE_EXISTING), will need a primary key defined in your table.
For other performance purposes, Stuart's response is correct.
Answered by user3600910
Redshift allows you to create a primary key:
-- "user" is a reserved word in Redshift, so the table name must be quoted
create table "user" (
id int,
phone_number int,
primary key(id));
However, since Redshift does not enforce this constraint, the primary key column accepts duplicate values.
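To make the lack of enforcement concrete, here is a small sketch that reuses the table from this answer. The inserted values are made up for illustration, and the duplicate check is just one possible way to audit the data yourself.

```sql
-- Both rows share id = 1; Redshift accepts them despite the primary key.
insert into "user" (id, phone_number) values (1, 5550100);
insert into "user" (id, phone_number) values (1, 5550101);

-- List any id values that appear more than once.
select id, count(*) as occurrences
from "user"
group by id
having count(*) > 1;
```

A check like this could run at the end of a load job, since the database will not reject the duplicates for you.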
See this article on the issue:
http://www.sqlhaven.com/amazon-redshift-what-you-need-to-think-before-defining-primary-key/