What is a good way to shard horizontally in PostgreSQL?

Question

Asked by pylabs

What is a good way to shard horizontally in PostgreSQL? The options I have found are:

1. pgpool 2
2. gridsql

Which is the better way to shard?

Also, is it possible to partition without changing client code?

It would be great if someone could share a simple tutorial or cookbook example of how to set up and use sharding.

Answer 1

Answered by WolfmanDragon

PostgreSQL allows partitioning in two different ways: by range and by list. Both use table inheritance to implement the partitions.
Partitioning by range, usually a date range, is the most common, but partitioning by list can be useful if the values that define the partitions are static and not skewed.

Partitioning is done with table inheritance, so the first step is to set up the child tables.

CREATE TABLE measurement (
    x        int not null,
    logdate  date not null,  -- partition key; the child CHECK constraints and the trigger refer to it
    z        int
);

CREATE TABLE measurement_y2006 ( 
    CHECK ( logdate >= DATE '2006-01-01' AND logdate < DATE '2007-01-01' )
) INHERITS (measurement);

CREATE TABLE measurement_y2007 (
    CHECK ( logdate >= DATE '2007-01-01' AND logdate < DATE '2008-01-01' ) 
) INHERITS (measurement);

Then either rules or triggers must be used to route the data into the correct tables. Rules are faster for bulk updates; triggers are faster for single updates and are easier to maintain. Here is a sample trigger.

CREATE TRIGGER insert_measurement_trigger
    BEFORE INSERT ON measurement
    FOR EACH ROW EXECUTE PROCEDURE measurement_insert_trigger();

and the trigger function that does the insert (create it before the trigger, because CREATE TRIGGER requires the function to exist already)

CREATE OR REPLACE FUNCTION measurement_insert_trigger()
RETURNS TRIGGER AS $$
BEGIN
    IF ( NEW.logdate >= DATE '2006-01-01' 
         AND NEW.logdate < DATE '2007-01-01' ) THEN
        INSERT INTO measurement_y2006 VALUES (NEW.*);
    ELSIF ( NEW.logdate >= DATE '2007-01-01' 
            AND NEW.logdate < DATE '2008-01-01' ) THEN
        INSERT INTO measurement_y2007 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'Date out of range.';
    END IF;
    RETURN NULL;
END;
$$
LANGUAGE plpgsql;

These examples are simplified versions of those in the PostgreSQL documentation, for easier reading.
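For comparison, here is a minimal sketch of the same layout using declarative partitioning, which PostgreSQL has offered since version 10. It is not part of the original answer: it assumes PostgreSQL 10 or later and a fresh database (the table names would otherwise clash with the inheritance-based tables above), and the sample row is made up for illustration.

```sql
-- Assumes PostgreSQL 10+ and a fresh database; names mirror the example above.
CREATE TABLE measurement (
    x        int not null,
    logdate  date not null,
    z        int
) PARTITION BY RANGE (logdate);

-- One partition per calendar year; the upper bound is exclusive.
CREATE TABLE measurement_y2006 PARTITION OF measurement
    FOR VALUES FROM ('2006-01-01') TO ('2007-01-01');

CREATE TABLE measurement_y2007 PARTITION OF measurement
    FOR VALUES FROM ('2007-01-01') TO ('2008-01-01');

-- The server routes the row by logdate; no trigger or rule is involved.
INSERT INTO measurement VALUES (1, DATE '2006-06-15', 10);

-- tableoid shows which partition actually stores each row.
SELECT tableoid::regclass AS stored_in, * FROM measurement;
```

With this form the client still reads and writes only `measurement`, so client code does not need to change. All partitions still live on one server, so this is partitioning, not sharding.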

I am not familiar with pgpool2, but GridSQL is a commercial product designed for EnterpriseDB, a commercial database built on top of PostgreSQL. Their products are very good, but I do not think it will work on standard PostgreSQL.

Answer 2

Answered by TuxRacer

Well, if the question is about sharding, then pgpool and the PostgreSQL partitioning features are not valid answers.

Partitioning assumes the partitions are on the same server. Sharding is more general and usually means the database is split across several servers. Sharding is used when partitioning is no longer possible, e.g. for a large database that cannot fit on a single disk.

For true sharding, Skype's PL/Proxy is probably the best option.

Answer 3

Answered by Magnus Hagander

PL/Proxy (by Skype) is a good solution for this. It requires all access to go through a function API, but once you have that, it can make the sharding pretty transparent to the client.
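To make "access through a function API" concrete, here is a rough sketch of what a PL/Proxy function can look like, modeled on the style of the PL/Proxy tutorial. The cluster name `usercluster`, the `users` table, and its `username` and `email` columns are hypothetical, and the cluster configuration that PL/Proxy also needs is left out.

```sql
-- Created on the proxy database, which holds no user data itself.
-- 'usercluster' and the users table are hypothetical names.
-- CLUSTER names the set of shard databases; RUN ON picks one shard
-- from a hash of the username; the SELECT is what runs on that shard.
CREATE FUNCTION get_user_email(i_username text)
RETURNS SETOF text AS $$
    CLUSTER 'usercluster';
    RUN ON hashtext(i_username);
    SELECT email FROM users WHERE username = i_username;
$$ LANGUAGE plproxy;

-- The client calls the function and never addresses a shard directly.
SELECT * FROM get_user_email('alice');
```

The client only sees the function, so shards can be added or moved behind it. The cost is that every query pattern the application needs has to be wrapped in a function like this one.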

Answer 4

Answered by Adrian Hartanto

Best practice for building a PostgreSQL cluster is to use:

PostgreSQL partitioning (range or list).
PostgreSQL partitioning combined with tablespaces on several SSDs.
The PostgreSQL FDW extension.
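The list above does not show how partitioning and FDW fit together, so here is one possible sketch: a partition of the parent table is declared as a foreign table that lives on another server via `postgres_fdw`. It assumes PostgreSQL 11 or later (needed for inserts to be routed into foreign partitions) and a fresh database on the coordinating server. The server name, host, database, credentials, and table layout are all placeholders, and the remote table must already exist with matching columns.

```sql
-- Run on the coordinating server. All names and credentials are placeholders.
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER shard_2007
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'shard2007.example.internal', port '5432', dbname 'metrics');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER shard_2007
    OPTIONS (user 'app', password 'secret');

-- Parent table, partitioned by date range.
CREATE TABLE measurement (
    x        int not null,
    logdate  date not null,
    z        int
) PARTITION BY RANGE (logdate);

-- 2006 data stays on the local server.
CREATE TABLE measurement_y2006 PARTITION OF measurement
    FOR VALUES FROM ('2006-01-01') TO ('2007-01-01');

-- 2007 data is stored in table measurement_y2007 on the remote server.
CREATE FOREIGN TABLE measurement_y2007 PARTITION OF measurement
    FOR VALUES FROM ('2007-01-01') TO ('2008-01-01')
    SERVER shard_2007
    OPTIONS (table_name 'measurement_y2007');
```

Queries against `measurement` then touch the remote server only when they need 2007 rows. Constraints and indexes on the remote table are not managed from the coordinating server, so they have to be maintained on each shard separately.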

Alternative: Postgres-XL

For sharding (load balancing) you can use:

Postgres-BDR
Postgres-X2

Note:

The purpose of a cluster is to hold a big dataset; it is mostly used for data warehouses.

The purpose of sharding is load balancing; it is mostly used for high-transaction databases.

**WARNING**

Avoid pgpool: it adds too much overhead, which will lead to issues in the future.

I hope this answer helps you in future development.
