什么是 MYSQL 分区?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/1579930/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-31 14:17:36  来源:igfitidea点击:

What is MYSQL Partitioning?

mysqldatabasepartitioning

提问by TIMEX

I have read the documentation (http://dev.mysql.com/doc/refman/5.1/en/partitioning.html), but I would like, in your own words, what it is and why it is used.

我已经阅读了文档(http://dev.mysql.com/doc/refman/5.1/en/partitioning.html),但我想用你自己的话来说,它是什么以及为什么使用它。

  1. Is it mainly used for multiple servers so it doesn't drag down one server?
  2. So, part of the data will be on server1, and part of the data will be on server2. And server 3 will "point" to server1 or server2...is that how it works?
  3. Why does MYSQL documentation focus on partitioning within the same server...if the purpose is to spread it across servers?
  1. 它主要用于多台服务器,因此不会拖累一台服务器吗?
  2. 因此,部分数据将在 server1 上,部分数据将在 server2 上。服务器 3 将“指向”服务器 1 或服务器 2 ......它是如何工作的?
  3. 为什么 MYSQL 文档侧重于在同一服务器内进行分区……如果目的是将其分布在服务器之间?

回答by Szymon Lipiński

The idea behind partitioning isn't to use multiple servers but to use multiple tables instead of one table. You can divide a table into many tables so that you can have old data in one sub table and new data in another table. Then the database can optimize queries where you ask for new data knowing that they are in the second table. What's more, you define how the data is partitioned.

分区背后的想法不是使用多个服务器,而是使用多个表而不是一个表。您可以将一个表拆分为多个表,以便在一个子表中保留旧数据,在另一个表中保留新数据。然后,数据库可以优化查询,在知道新数据位于第二个表中的情况下,您可以在其中请求新数据。更重要的是,您可以定义数据的分区方式。

Simple example from the MySQL Documentation:

MySQL 文档中的简单示例:

CREATE TABLE employees (
    id INT NOT NULL,
    fname VARCHAR(30),
    lname VARCHAR(30),
    hired DATE NOT NULL DEFAULT '1970-01-01',
    separated DATE NOT NULL DEFAULT '9999-12-31',
    job_code INT,
    store_id INT
)
PARTITION BY RANGE ( YEAR(separated) ) (
    PARTITION p0 VALUES LESS THAN (1991),
    PARTITION p1 VALUES LESS THAN (1996),
    PARTITION p2 VALUES LESS THAN (2001),
    PARTITION p3 VALUES LESS THAN MAXVALUE
);

This allows to speed up e.g.:

这允许加速例如:

  1. Dropping old data by simple:

    ALTER TABLE employees DROP PARTITION p0;
    
  2. Database can speed up a query like this:

    SELECT COUNT(*)
    FROM employees
    WHERE separated BETWEEN '2000-01-01' AND '2000-12-31'
    GROUP BY store_id;
    
  1. 通过简单的方式删除旧数据:

    ALTER TABLE employees DROP PARTITION p0;
    
  2. 数据库可以加速这样的查询:

    SELECT COUNT(*)
    FROM employees
    WHERE separated BETWEEN '2000-01-01' AND '2000-12-31'
    GROUP BY store_id;
    

Knowing that all data is stored only on the p2 partition.

知道所有数据只存储在 p2 分区上。

回答by Alireza Rahmani Khalili

A partitionedtable is a single logical table that's composed of multiple physical subtables. The partitioning code is really just a wrapper around a set of Handler objects that represent the underlying partitions, and it forwards requests to the storage engine through the Handler objects. Partitioning is a kind of black box that hides the underlying partitions from you at the SQL layer, although you can see them quite easily by looking at the filesystem, where you'll see the component tables with a hash-delimited naming convention.

一个分区表是由真实多个物理子表的单一逻辑表。分区代码实际上只是对代表底层分区的一组 Handler 对象的包装,它通过 Handler 对象将请求转发到存储引擎。分区是一种黑匣子,它在 SQL 层向您隐藏底层分区,尽管您可以通过查看文件系统很容易地看到它们,在那里您将看到具有散列分隔命名约定的组件表。

For example, here's a simple way to place each year's worth of sales into a separate partition:

例如,这里有一种简单的方法可以将每年的销售额放在一个单独的分区中:

CREATE TABLE sales (
 order_date DATETIME NOT NULL,
 -- Other columns omitted
) ENGINE=InnoDB PARTITION BY RANGE(YEAR(order_date)) (
 PARTITION p_2010 VALUES LESS THAN (2010),
 PARTITION p_2011 VALUES LESS THAN (2011),
 PARTITION p_2012 VALUES LESS THAN (2012),
 PARTITION p_catchall VALUES LESS THAN MAXVALUE );

read more here.

在这里阅读更多。

回答by Borniet

It is not really about using different server instances (although that is sometimes a possibility), it is more about dividing your tables in different physical partitions.

它实际上并不是关于使用不同的服务器实例(尽管有时这是可能的),它更多的是关于将您的表划分到不同的物理分区中。

It's dividing your tables and indexes into smaller pieces, and even subdivide it into even smaller pieces. Think of it as having several million different magazines of different topics and different years (say 2000-2019) all in one big warehouse (one big table). Partitioning would mean that you would put them organized in different rooms inside that big warehouse. They still belong together inside the one warehouse, but now you group them on a logical level, depending on your database partitioning strategy.

它将您的表和索引划分为更小的部分,甚至将其细分为更小的部分。可以将其想象为在一个大仓库(一张大桌子)中拥有数百万本不同主题和不同年份(例如 2000-2019 年)的不同杂志。分区意味着你会将它们组织在那个大仓库内的不同房间中。它们仍然属于同一个仓库,但现在您可以根据您的数据库分区策略在逻辑级别对它们进行分组。

Indexing is actually like keeping a table of which magazine is where in your warehouse, or in your rooms inside your warehouse. As you can see, there is a big difference between database partitioning and indexing, and they can be very well used together.

索引实际上就像在您的仓库中或您仓库内的房间中保留一份杂志的表格。如您所见,数据库分区和索引之间存在很大差异,它们可以很好地结合使用。

You can read more about it on my website on this article about Database Partitioning

您可以在我的网站上阅读有关数据库分区的这篇文章的更多信息