PostgreSQL 的多租户应用程序模式

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/44524364/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-10-21 02:35:05  来源:igfitidea点击:

PostgreSQL's schemas for multi-tenant applications

performancepostgresqldatabase-designschemamulti-tenant

提问by Vini

I'm learning about multi-tenantapplications and how PostgreSQL's schemas can be used for this.

我正在学习多租户应用程序以及如何为此使用 PostgreSQL 的模式。

Researching the subject, I ended up finding an articlein which the author describes a poor experience when using PostgreSQL's schemas in multi-tenant applications. The main problems would be having bad performance for migrations and high usage of database resources.

研究这个主题,我最终找到了一篇文章,其中作者描述了在多租户应用程序中使用 PostgreSQL 模式时的糟糕体验。主要问题是迁移性能不佳和数据库资源的高使用率。

It seems like having only one schema (sharing the tables among the tenants) would lead to better performance than having one separated schema for each tenant. But it feels strange to me. I would think the opposite, since indexes on smaller tables tend to be lighter than indexes on larger tables.

似乎只有一个模式(在租户之间共享表)比为每个租户使用一个单独的模式会带来更好的性能。但我觉得很奇怪。我认为相反,因为较小表上的索引往往比较大表上的索引轻。

Why would the performance be worse when having data separated in a lot of small tables (in multiple schemas), than having data separated in a few huge tables (in a single schema)?

为什么在许多小表(在多个模式中)中分离数据时,性能会比在几个大表中(在单个模式中)中分离时的性能更差?

回答by FuzzyChef

Performance isn't worse, necessarily. As the article explains, there are specific conditions which make the schema approach better or worse depending on your application design and workload. Let me explain the tradeoffs of the "tenant-schema" vs. "shared-table" approaches:

性能不一定更差。正如文章所解释的,根据您的应用程序设计和工作负载,存在使模式方法更好或更坏的特定条件。让我解释一下“租户模式”与“共享表”方法的权衡:

tenant-schemais best when you have a relatively small number of fairly large tenants. An example of this would be an accounting application, with only paid subscription users. Things which make it the better performing option for you include:

当租户数量相对较少时,租户模式是最好的。一个例子是会计应用程序,只有付费订阅用户。使其成为您性能更好的选择的因素包括:

  • a small number of tenants with a lot of data each
  • a relatively simple schema without a lot of tables per tenant
  • a need to customize the schemas of some tenants
  • ability to make use of database roles per tenant
  • requirement to migrate a tenant's data from one server to another
  • ability to spin up a dedicated appserver in your cloud for each tenant
  • 少量租户,每个租户都有大量数据
  • 一个相对简单的架构,每个租户没有很多表
  • 需要自定义一些租户的模式
  • 能够利用每个租户的数据库角色
  • 将租户的数据从一台服务器迁移到另一台服务器的要求
  • 能够在您的云中为每个租户启动专用的应用程序服务器

Things which make it a poor-performing option include:

使其成为性能不佳的选择的原因包括:

  • lots of tenants with very little data each
  • stateless approach to connections where each request could be any tenant
  • client library or orm which caches metadata for all tables (like ActiveRecord)
  • a requirement for efficient, high-performance connection pooling and/or caching
  • problems with VACUUM and other PostgreSQL administrative operations which scale poorly across 1000's of tables.
  • 很多租户,每个租户的数据很少
  • 连接的无状态方法,其中每个请求可以是任何租户
  • 客户端库或 orm 缓存所有表的元数据(如 ActiveRecord)
  • 对高效、高性能的连接池和/或缓存的要求
  • VACUUM 和其他 PostgreSQL 管理操作的问题,它们在 1000 个表中的扩展性很差。

Whether tenant-schema is bad for migrations/schema changes really depends on how you're doing them. It's bad for rolling out a universal schema change quickly, but good for deploying schema changes as a gradual rollout across tenants.

租户模式是否不利于迁移/模式更改实际上取决于您如何执行它们。这对于快速推出通用架构更改是不利的,但对于在租户之间逐步部署架构更改是有益的。

shared-tableworks better for situations when you have a lot of tenants, and a lot of your tenants have very little data. An example of this would be a social medial mobile application which permits free accounts and thus has thousands of abandoned accounts. Other things which make the shared table model beneficial are:

共享表更适用于有很多租户,而很多租户的数据很少的情况。这方面的一个例子是社交媒体移动应用程序,它允许免费帐户,因此有数千个废弃的帐户。其他使共享表模型有益的事情是:

  • better for connection pooling, as all connections can use the same pool
  • better for PostgreSQL administration, because of fewer tables total
  • better for migrations and schema changes, since there's only one "set" of tables
  • 更适合连接池,因为所有连接都可以使用同一个池
  • 更适合 PostgreSQL 管理,因为总表更少
  • 更适合迁移和架构更改,因为只有一个“一组”表

The main drawback of shared-table is the need to append the tenant filter condition onto every single query in the application layer. It's also problematic because:

共享表的主要缺点是需要将租户过滤条件附加到应用层中的每个查询上。这也有问题,因为:

  • queries which join many tables may perform poorly because the tenant filter throws off query planning
  • tables which grow to 100millions of rows can cause specific performance and maintenance issues
  • no way to do tenant-specific application changes or schema upgrades
  • more expensive to migrate tenants between servers
  • 连接许多表的查询可能性能不佳,因为租户过滤器会破坏查询计划
  • 增长到 1 亿行的表可能会导致特定的性能和维护问题
  • 无法进行特定于租户的应用程序更改或架构升级
  • 在服务器之间迁移租户的成本更高

So which model "performs better" really depends on which tradeoffs hurt you the worst.

因此,哪种模型“表现更好”实际上取决于哪种权衡对您的伤害最大。

There's also a hybrid model, "tenant-view", where the actual data is stored in shared tables, but each application connection uses security barrier viewsto view the data. This has some of the tradeoffs of each model. Primarily, it has the security benefits of the tenant-schema model with some of the performance drawbacks of both models.

还有一个混合模型“租户视图”,其中实际数据存储在共享表中,但每个应用程序连接都使用安全屏障视图来查看数据。这有每个模型的一些权衡。首先,它具有租户模式模型的安全优势以及两种模型的一些性能缺陷。