Notice: this page is derived from a popular StackOverflow question (originally presented as a Chinese-English side-by-side translation) and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original URL: http://stackoverflow.com/questions/27329461/

Time: 2020-09-08 07:51:39  Source: igfitidea

What is Hash and Range Primary Key?

Tags: hash, amazon-dynamodb, primary-key, database, nosql

Asked by Mannu

I am not able to understand what the Range primary key is here:

http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/WorkingWithTables.html#WorkingWithTables.primary.key

and how does it work?

What do they mean by "unordered hash index on the hash attribute and a sorted range index on the range attribute"?

Answer by mkobit

"Hash and Range Primary Key" means that a single row in DynamoDB has a unique primary key made up of both the hash and the range key. For example, with a hash key of X and a range key of Y, your primary key is effectively XY. You can also have multiple range keys for the same hash key, but the combination must be unique, like XZ and XA. Let's use their examples for each type of table:

Hash Primary Key – The primary key is made of one attribute, a hash attribute. For example, a ProductCatalog table can have ProductID as its primary key. DynamoDB builds an unordered hash index on this primary key attribute.

This means that every row is keyed off of this value. Every row in DynamoDB must have a unique value for this attribute. "Unordered hash index" means what it says: the data is not ordered, and you are given no guarantees about how the data is stored. You won't be able to make queries on an unordered index such as "Get me all rows that have a ProductID greater than X". You write and fetch items based on the hash key, for example "Get me the row from that table that has ProductID X". Because you are querying an unordered index, your gets against it are basically key-value lookups, which are very fast and use very little throughput.
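
To make the key-value behaviour concrete, here is a minimal in-memory sketch. It is a toy model and not the DynamoDB API: the class name `HashOnlyTable` and its methods are made up for illustration, and a plain Python dict stands in for the unordered hash index.

```python
# Toy model of a table whose primary key is a hash key only.
# HashOnlyTable and its methods are hypothetical names, not DynamoDB calls.
class HashOnlyTable:
    def __init__(self, key_name):
        self.key_name = key_name
        self._items = {}  # hash key value -> item; we rely on no ordering

    def put_item(self, item):
        # Writing with an existing key replaces the old item,
        # so each hash key value identifies exactly one row.
        self._items[item[self.key_name]] = item

    def get_item(self, hash_key):
        # Direct lookup by hash key: the only efficient access path here.
        return self._items.get(hash_key)

    def scan_greater_than(self, threshold):
        # "ProductID greater than X" cannot use the hash index;
        # the only option is to look at every item.
        return [i for i in self._items.values() if i[self.key_name] > threshold]


catalog = HashOnlyTable("ProductID")
catalog.put_item({"ProductID": 101, "Title": "Book"})
catalog.put_item({"ProductID": 205, "Title": "Bike"})
catalog.put_item({"ProductID": 101, "Title": "Book, 2nd edition"})
```

`get_item(101)` touches one entry, while `scan_greater_than(100)` has to visit all of them, which is the difference in cost the paragraph above describes.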



Hash and Range Primary Key – The primary key is made of two attributes. The first attribute is the hash attribute and the second attribute is the range attribute. For example, the forum Thread table can have ForumName and Subject as its primary key, where ForumName is the hash attribute and Subject is the range attribute. DynamoDB builds an unordered hash index on the hash attribute and a sorted range index on the range attribute.

This means that every row's primary key is the combination of the hash and range key. You can make direct gets on single rows if you have both the hash and range key, or you can make a query against the sorted range index, for example "Get me all rows from the table with hash key X that have range keys greater than Y", or other queries to that effect. These queries perform better and use less capacity than Scans and Queries against fields that are not indexed. From their documentation:
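
As a rough illustration of the two access paths, the sketch below keeps one sorted list of range keys per hash key. It is an in-memory toy, not DynamoDB code: `HashRangeTable`, `get_item` and `query_greater_than` are hypothetical names chosen for this example.

```python
import bisect
from collections import defaultdict


# Toy model: hash key -> list of (range_key, item) kept sorted by range key.
class HashRangeTable:
    def __init__(self):
        self._partitions = defaultdict(list)

    def _position(self, rows, range_key):
        # Binary search over the sorted range keys of one hash key.
        keys = [rk for rk, _ in rows]
        return bisect.bisect_left(keys, range_key)

    def put_item(self, hash_key, range_key, item):
        rows = self._partitions[hash_key]
        i = self._position(rows, range_key)
        if i < len(rows) and rows[i][0] == range_key:
            rows[i] = (range_key, item)  # same (hash, range) pair: replace
        else:
            rows.insert(i, (range_key, item))

    def get_item(self, hash_key, range_key):
        # Direct get: needs both parts of the primary key.
        rows = self._partitions.get(hash_key, [])
        i = self._position(rows, range_key)
        if i < len(rows) and rows[i][0] == range_key:
            return rows[i][1]
        return None

    def query_greater_than(self, hash_key, range_key):
        # "Hash key X with range keys greater than Y", returned in sorted order.
        rows = self._partitions.get(hash_key, [])
        keys = [rk for rk, _ in rows]
        start = bisect.bisect_right(keys, range_key)
        return [item for _, item in rows[start:]]


threads = HashRangeTable()
threads.put_item("DynamoDB", "Range keys", {"Replies": 3})
threads.put_item("DynamoDB", "Hash keys", {"Replies": 1})
threads.put_item("DynamoDB", "Scans", {"Replies": 2})
threads.put_item("S3", "Hash keys", {"Replies": 9})
```

Here `threads.query_greater_than("DynamoDB", "Hash keys")` only looks inside the "DynamoDB" group and returns the rows for "Range keys" and "Scans" in that order.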

Query results are always sorted by the range key. If the data type of the range key is Number, the results are returned in numeric order; otherwise, the results are returned in order of ASCII character code values. By default, the sort order is ascending. To reverse the order, set the ScanIndexForward parameter to false
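
The ordering rule in that quote can be imitated in a few lines. This is only a sketch of the described behaviour: `sort_range_keys` is a hypothetical helper, and comparing strings by their encoded bytes is used here as a stand-in for "ASCII character code values".

```python
def sort_range_keys(range_keys, scan_index_forward=True):
    """Order range keys the way the quoted passage describes (toy version)."""
    def order(key):
        if isinstance(key, (int, float)):
            return key                # Number type: numeric order
        return key.encode("utf-8")    # otherwise: byte order (ASCII order for ASCII text)
    # scan_index_forward=False mirrors setting ScanIndexForward to false.
    return sorted(range_keys, key=order, reverse=not scan_index_forward)


numeric = sort_range_keys([10, 9, 100])
as_text = sort_range_keys(["10", "9", "100"])
mixed_case = sort_range_keys(["apple", "Banana", "cherry"])
descending = sort_range_keys([10, 9, 100], scan_index_forward=False)
```

Note how the same values sort differently as numbers (9, 10, 100) and as strings ("10", "100", "9"), and how uppercase letters sort before lowercase ones, so the choice of range key type affects the order a query returns.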

I probably missed some things as I typed this out, and I have only scratched the surface. There are a lot more aspects to take into consideration when working with DynamoDB tables (throughput, consistency, capacity, other indices, key distribution, etc.). You should take a look at the sample tables and data page for examples.

Answer by Tomer Ben David

Since the whole thing gets mixed up, let's look at it as functions and code to simulate what it means concisely.

The only way to get a row is via the primary key:

getRow(pk: PrimaryKey): Row

Primary key data structure can be this:

// If you decide your primary key is just the partition key.
class PrimaryKey(partitionKey: String)

// and in this case
getRow(somePartitionKey): Row

However, you can decide that your primary key is partition key + sort key; in this case:

// if you decide your primary key is partition key + sort key
class PrimaryKey(partitionKey: String, sortKey: String)

getRow(partitionKey, sortKey): Row
getMultipleRows(partitionKey): Row[]

So the bottom line:

  1. Decided that your primary key is partition key only? Get a single row by partition key.

  2. Decided that your primary key is partition key + sort key? Get a single row by (partition key, sort key), or get a range of rows by (partition key).

Either way, you get a single row by primary key; the only question is whether you defined that primary key to be partition key only or partition key + sort key.

Building blocks are:

  1. Table
  2. Item
  3. KV Attribute.

Think of Item as a row and of KV Attribute as cells in that row.

  1. You can get an item (a row) by primary key.
  2. You can get multiple items (multiple rows) by specifying (HashKey, RangeKeyQuery)

You can do (2) only if you decided that your PK is composed of (HashKey, SortKey).
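
One way to picture (HashKey, RangeKeyQuery) is as a hash key plus a condition on the range key, such as greater-than, less-than or starts-with. The sketch below is a plain-Python illustration under that assumption; `get_items` and the condition helpers are hypothetical names, and the nested dict only stands in for a table.

```python
from typing import Callable, Dict, List

Item = Dict[str, str]
RangeKeyQuery = Callable[[str], bool]  # a condition applied to each range key


def greater_than(bound: str) -> RangeKeyQuery:
    return lambda range_key: range_key > bound


def less_than(bound: str) -> RangeKeyQuery:
    return lambda range_key: range_key < bound


def begins_with(prefix: str) -> RangeKeyQuery:
    return lambda range_key: range_key.startswith(prefix)


def get_items(table: Dict[str, Dict[str, Item]], hash_key: str,
              range_key_query: RangeKeyQuery) -> List[Item]:
    # Only the rows stored under hash_key are considered; results come back
    # ordered by range key.
    rows = table.get(hash_key, {})
    return [rows[rk] for rk in sorted(rows) if range_key_query(rk)]


# Hypothetical data: hash key = user, range key = date-prefixed game id.
games = {
    "userA": {
        "2020-01-chess": {"score": "10"},
        "2020-02-go": {"score": "7"},
        "2021-01-chess": {"score": "12"},
    },
    "userB": {"2020-01-chess": {"score": "3"}},
}
in_2020 = get_items(games, "userA", begins_with("2020-"))
```

With a hash-only primary key there would be just one item per key, so a call like `get_items(games, "userA", begins_with("2020-"))` has nothing to range over.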

More visually, since it is complex, here is the way I see it:

+----------------------------------------------------------------------------------+
|Table                                                                             |
|+------------------------------------------------------------------------------+  |
||Item                                                                          |  |
||+-----------+ +-----------+ +-----------+ +-----------+                       |  |
|||primaryKey | |kv attr    | |kv attr ...| |kv attr ...|                       |  |
||+-----------+ +-----------+ +-----------+ +-----------+                       |  |
|+------------------------------------------------------------------------------+  |
|+------------------------------------------------------------------------------+  |
||Item                                                                          |  |
||+-----------+ +-----------+ +-----------+ +-----------+ +-----------+         |  |
|||primaryKey | |kv attr    | |kv attr ...| |kv attr ...| |kv attr ...|         |  |
||+-----------+ +-----------+ +-----------+ +-----------+ +-----------+         |  |
|+------------------------------------------------------------------------------+  |
|                                                                                  |
+----------------------------------------------------------------------------------+

+----------------------------------------------------------------------------------+
|1. Always get item by PrimaryKey                                                  |
|2. PK is (Hash,RangeKey), great get MULTIPLE Items by Hash, filter/sort by range     |
|3. PK is HashKey: just get a SINGLE ITEM by hashKey                               |
|                                                      +--------------------------+|
|                                 +---------------+    |getByPK => getBy(1        ||
|                 +-----------+ +>|(HashKey,Range)|--->|hashKey, > < or startWith ||
|              +->|Composite  |-+ +---------------+    |of rangeKeys)             ||
|              |  +-----------+                        +--------------------------+|
|+-----------+ |                                                                   |
||PrimaryKey |-+                                                                   |
|+-----------+ |                                       +--------------------------+|
|              |  +-----------+   +---------------+    |getByPK => get by specific||
|              +->|HashType   |-->|get one item   |--->|hashKey                   ||
|                 +-----------+   +---------------+    |                          ||
|                                                      +--------------------------+|
+----------------------------------------------------------------------------------+

So what is happening above? Notice the following observations. As we said, our data belongs to (Table, Item, KVAttribute). Every Item has a primary key, and the way you compose that primary key determines how you can access the data.

If you decide that your PrimaryKey is simply a hash key, then great, you can get a single item out of it. If, however, you decide that your primary key is hashKey + SortKey, then you can also do a range query on your primary key, because you get your items by (HashKey + SomeRangeFunction(on range key)). So you can get multiple items with your primary key query.

Note: I did not refer to secondary indexes.

Answer by Adiii

A well-explained answer is already given by @mkobit, but I will add a big picture of the range key and hash key.

In simple words: range + hash key = composite primary key (see CoreComponents of DynamoDB).

[image]

A primary key consists of a hash key and an optional range key. The hash key is used to select the DynamoDB partition. Partitions are parts of the table data. Range keys, if they exist, are used to sort the items in the partition.
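
The division of labour between the two keys can be sketched as follows. This is an assumption-laden toy: DynamoDB's real hash function and partition count are internal, so SHA-256 and `NUM_PARTITIONS = 4` are arbitrary stand-ins, and `partition_for` and `place_items` are hypothetical names.

```python
import hashlib
from collections import defaultdict

NUM_PARTITIONS = 4  # arbitrary value for the sketch


def partition_for(hash_key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Stand-in hash function: the same hash key always maps to the same partition.
    digest = hashlib.sha256(hash_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


def place_items(items, num_partitions: int = NUM_PARTITIONS):
    # items: iterable of (hash_key, range_key, payload) tuples.
    partitions = defaultdict(list)
    for hash_key, range_key, payload in items:
        partitions[partition_for(hash_key, num_partitions)].append(
            (hash_key, range_key, payload))
    for rows in partitions.values():
        # Within a partition, rows sharing a hash key are ordered by range key.
        rows.sort(key=lambda row: (row[0], row[1]))
    return dict(partitions)


placed = place_items([
    ("userA", "game3", "c"),
    ("userA", "game1", "a"),
    ("userB", "game2", "b"),
    ("userA", "game2", "x"),
])
```

All "userA" rows land in one partition and sit there in range key order, which is what makes a query by hash key plus a range condition cheap.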

So the two have different purposes and together help with complex queries. In the above example, hashkey1 can have multiple n-range values. Another example of range and hash key is a game: userA (hash key) can play N games (range).

The Music table described in Tables, Items, and Attributes is an example of a table with a composite primary key (Artist and SongTitle). You can access any item in the Music table directly, if you provide the Artist and SongTitle values for that item.

A composite primary key gives you additional flexibility when querying data. For example, if you provide only the value for Artist, DynamoDB retrieves all of the songs by that artist. To retrieve only a subset of songs by a particular artist, you can provide a value for Artist along with a range of values for SongTitle.
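
For the Music example, the request for such a query might be assembled as below. The parameter names (`TableName`, `KeyConditionExpression`, `ExpressionAttributeValues`, `ScanIndexForward`) follow my understanding of the DynamoDB Query API and should be checked against the official reference; the helper `build_music_query` and the artist and title values are made up for illustration.

```python
def build_music_query(artist, title_from=None, title_to=None, ascending=True):
    """Build a Query request dict for the Music table (sketch, sends nothing)."""
    params = {
        "TableName": "Music",
        # The hash key condition is always required and must be an equality.
        "KeyConditionExpression": "Artist = :artist",
        "ExpressionAttributeValues": {":artist": {"S": artist}},
        "ScanIndexForward": ascending,
    }
    if title_from is not None and title_to is not None:
        # Optional range key condition: only a subset of this artist's songs.
        params["KeyConditionExpression"] += " AND SongTitle BETWEEN :lo AND :hi"
        params["ExpressionAttributeValues"][":lo"] = {"S": title_from}
        params["ExpressionAttributeValues"][":hi"] = {"S": title_to}
    return params


all_songs = build_music_query("Some Artist")
a_to_m = build_music_query("Some Artist", "A", "M", ascending=False)
```

A dict like `a_to_m` could then be passed to a low-level client, for example `client.query(**a_to_m)` with a boto3 DynamoDB client, assuming credentials and the table exist.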

https://www.slideshare.net/InfoQ/amazon-dynamodb-design-patterns-best-practices
https://www.slideshare.net/AmazonWebServices/awsome-day-2016-module-4-databases-amazon-dynamodb-and-amazon-rds
https://ceyhunozgun.blogspot.com/2017/04/implementing-object-persistence-with-dynamodb.html

Answer by Srini Sydney

@vnr you can retrieve all the sort keys associated with a partition key just by running a query with the partition key. There is no need for a scan. The point here is that the partition key is compulsory in a query. Sort keys are used only to get a range of data.