Note: this page is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow, original question: http://stackoverflow.com/questions/1152405/

Date: 2020-09-08 07:24:36  Source: igfitidea

PostgreSQL: Is it better to use multiple databases with one schema each, or one database with multiple schemas?

Tags: database, database-design, postgresql, database-permissions

Asked by Strae

After this comment on one of my questions, I'm wondering whether it is better to use one database with X schemas or X databases with one schema each.

My situation: I'm developing a web application where, when people register, I currently create a database (no, it's not a social network: each user must have access to their own data and never see the data of other users).

That's the approach I used for the previous version of my application (which is still running on MySQL): through the Plesk API, for every registration, I:

  1. Create a database user with limited privileges;
  2. Create a database that can be accessed only by the previously created user and the superuser (for maintenance);
  3. Populate the database.

Now I need to do the same with PostgreSQL (the project is maturing and MySQL... doesn't meet all of its needs).
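As a rough illustration of what those three steps could look like in PostgreSQL with one schema per user, here is a minimal sketch that only builds the SQL statements. The function names, the `notes` table and the sample user are made up for the example, and the literal quoting assumes the server's default `standard_conforming_strings = on`.

```python
def quote_ident(name):
    """Quote an SQL identifier by doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Quote an SQL string literal by doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def provisioning_sql(user, password):
    """Return the statements that give `user` a private schema of the same name."""
    role = quote_ident(user)
    return [
        # 1. A login role with no special privileges
        "CREATE ROLE " + role + " LOGIN PASSWORD " + quote_literal(password),
        # 2. A schema owned by that role; other ordinary roles get no rights on it by default
        "CREATE SCHEMA " + role + " AUTHORIZATION " + role,
        # 3. Populate it (hypothetical table standing in for the real structure)
        "CREATE TABLE " + role + ".notes (id serial PRIMARY KEY, body text)",
    ]


if __name__ == "__main__":
    for statement in provisioning_sql("customer_foo", "s3cret"):
        print(statement + ";")
```

The statements would be run by a privileged maintenance role; the superuser keeps access to every schema without any extra grant.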

I need all the database/schema backups to be independent: pg_dump works perfectly either way, and the same goes for users, who can be configured to access just one schema or just one database.
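To make the "works either way" point concrete, here is a small sketch of the two pg_dump invocations, built as argument lists. The helper names, database name and file paths are placeholders for the example; `-n` and `-f` are pg_dump's schema and output-file options.

```python
def dump_schema_cmd(db_name, schema, out_file):
    """pg_dump argv that dumps a single schema of one shared database."""
    return ["pg_dump", "-n", schema, "-f", out_file, db_name]


def dump_database_cmd(db_name, out_file):
    """pg_dump argv that dumps a whole database."""
    return ["pg_dump", "-f", out_file, db_name]


if __name__ == "__main__":
    # Print the commands instead of running them; pass a list to
    # subprocess.run(cmd, check=True) to actually execute one.
    print(" ".join(dump_schema_cmd("myapp", "customer_foo", "/tmp/customer_foo.sql")))
    print(" ".join(dump_database_cmd("customer_foo_db", "/tmp/customer_foo_db.sql")))
```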

So, assuming you are more experienced PostgreSQL users than me, what do you think is the best solution for my situation, and why?

Will there be performance differences using $x databases instead of $x schemas? And which solution will be easier to maintain in the future (reliability)?

All of my databases/schemas will always have the same structure!

For the backup issue (using pg_dump), it may be better to use one database with many schemas and dump all the schemas at once. Recovery will be quite simple: load the main dump on a development machine, then dump and restore just the schema needed. There is one additional step, but dumping all the schemas at once seems faster than dumping them one by one.

UPDATE 2012

Well, the application structure and design have changed a lot over the last two years. I'm still using the "one db with many schemas" approach, but I have one database for each version of my application:

Db myapp_01
    \_ my_customer_foo_schema
    \_ my_customer_bar_schema
Db myapp_02
    \_ my_customer_foo_schema
    \_ my_customer_bar_schema

For backups, I'm dumping each database regularly, and then moving the backups to the development server.

I'm also using PITR/WAL backups but, as I said before, it's not likely I'll have to restore all databases at once... so they will probably be dropped this year (in my situation it is not the best approach).

The one-db-many-schemas approach has worked very well for me so far, even though the application structure has totally changed:

I almost forgot: all of my databases/schemas will always have the same structure!

...now, every schema has its own structure, which changes dynamically in reaction to the users' data flow.

Accepted answer by kquinn

A PostgreSQL "schema" is roughly the same as a MySQL "database". Having many databases on a PostgreSQL installation can get problematic; having many schemas will work with no trouble. So you definitely want to go with one database and multiple schemas within that database.

Answer by Strae

Definitely, I'll go for the one-db-many-schemas approach. This allows me to dump the whole database but restore just one schema very easily, in several ways:

  1. Dump the db (all the schemas), load the dump into a new db, dump just the schema I need, and restore it back into the main db.
  2. Dump the schemas separately, one by one (but I think the machine will suffer more this way, and I'm expecting something like 500 schemas!)
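The first option can be sketched as a fixed sequence of commands. This is only an outline under some assumptions: the helper name, database names and file paths are invented for the example, the dumps are plain SQL files, and the schema being restored no longer exists in the main db (otherwise the final step would collide with the existing objects).

```python
def restore_one_schema_steps(main_db, scratch_db, schema, full_dump, schema_dump):
    """Return the commands, in order, that restore one schema from a full dump."""
    return [
        # Dump the whole main db (all the schemas)
        ["pg_dump", "-f", full_dump, main_db],
        # Load the dump into a new, empty db
        ["createdb", scratch_db],
        ["psql", "-d", scratch_db, "-f", full_dump],
        # Dump just the schema that is needed
        ["pg_dump", "-n", schema, "-f", schema_dump, scratch_db],
        # Restore it back into the main db
        ["psql", "-d", main_db, "-f", schema_dump],
    ]


if __name__ == "__main__":
    steps = restore_one_schema_steps(
        "myapp_01", "myapp_scratch", "my_customer_foo_schema",
        "/tmp/myapp_01.sql", "/tmp/my_customer_foo_schema.sql",
    )
    for cmd in steps:
        print(" ".join(cmd))
```

In practice the first step is the regular backup, so only the last four commands are needed when a restore is actually requested.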

Also, googling around, I've seen that there is no automatic procedure for duplicating a schema (using one as a template), but many suggest doing it this way:

  1. Create a template schema
  2. When you need to duplicate it, rename it to the new name
  3. Dump it
  4. Rename it back
  5. Restore the dump
  6. The magic is done.

I've written a few lines of Python to do that; I hope they can help someone (code written in 2 seconds, don't use it in production):

import os
import subprocess
import sys

import pg  # PyGreSQL classic interface

# Take the new schema name from the first command-line argument
# (sys.argv[0] is the script name)
new_schema = sys.argv[1]

# Temporary file for the dump
dump_file = '/test/dumps/' + new_schema + '.sql'

# Settings
db_name = 'db_name'
db_user = 'db_user'
db_pass = 'db_pass'
schema_as_template = 'schema_name'

# Pass the password to pg_dump/psql through the environment
# instead of building a shell command string
pg_env = dict(os.environ, PGPASSWORD=db_pass)

# Connection
pg_connect = pg.connect(dbname=db_name, host='localhost', user=db_user, passwd=db_pass)

try:
    # Rename the template schema to the new name
    # (the names are not escaped: only pass trusted values)
    pg_connect.query("ALTER SCHEMA " + schema_as_template + " RENAME TO " + new_schema)
    try:
        # Dump it
        subprocess.run(['pg_dump', '-U', db_user, '-n', new_schema, '-f', dump_file, db_name],
                       env=pg_env, check=True)
    finally:
        # Rename it back to its default name, even if the dump failed
        pg_connect.query("ALTER SCHEMA " + new_schema + " RENAME TO " + schema_as_template)

    # Restore the dump to create the new schema
    subprocess.run(['psql', '-U', db_user, '-d', db_name, '-f', dump_file],
                   env=pg_env, check=True)

    # Delete the dump file
    os.remove(dump_file)
finally:
    # Close connection
    pg_connect.close()

Answer by Strae

I would say, go with multiple databases AND multiple schemas :)

Schemas in PostgreSQL are a lot like packages in Oracle, in case you are familiar with those. Databases are meant to differentiate between entire sets of data, while schemas are more like data entities.

For instance, you could have one database for an entire application with the schemas "UserManagement", "LongTermStorage" and so on. "UserManagement" would then contain the "User" table, as well as all stored procedures, triggers, sequences, etc. that are needed for the user management.

Databases are entire programs, schemas are components.

Answer by emax

In a PostgreSQL context I recommend using one db with multiple schemas, as you can (e.g.) UNION ALL across schemas, but not across databases. In that sense, a database is completely insulated from another database, while schemas are not insulated from other schemas within the same database.

If, for some reason, you have to consolidate data across schemas in the future, it will be easy to do this over multiple schemas. With multiple databases you would need multiple db connections, and you would have to collect and merge the data from each database "manually" in application logic.

The latter has advantages in some cases, but for the most part I think the one-database-multiple-schemas approach is more useful.
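As an illustration of the cross-schema consolidation mentioned above, here is a minimal sketch that builds a UNION ALL query over the same table in several schemas. The function names, schema names, table and columns are hypothetical, and it assumes every schema holds an identically structured table.

```python
def quote_ident(name):
    """Quote an SQL identifier by doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Quote an SQL string literal by doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def union_all_query(schemas, table, columns):
    """Build one query that reads `table` from every schema in `schemas`."""
    col_list = ", ".join(quote_ident(c) for c in columns)
    parts = [
        # Tag each row with the schema it came from
        "SELECT " + quote_literal(s) + " AS schema_name, " + col_list
        + " FROM " + quote_ident(s) + "." + quote_ident(table)
        for s in schemas
    ]
    return "\nUNION ALL\n".join(parts)


if __name__ == "__main__":
    print(union_all_query(
        ["my_customer_foo_schema", "my_customer_bar_schema"],
        "orders",
        ["id", "total"],
    ))
```

The whole query runs over a single connection, which is exactly what separate databases would not allow.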

Answer by Troels Arvin

A number of schemas should be more lightweight than a number of databases, although I cannot find a reference which confirms this.

But if you really want to keep things very separate (instead of refactoring the web application so that a "customer" column is added to your tables), you may still want to use separate databases: I would argue that restoring a particular customer's database is easier this way, without disturbing the other customers.