database 关系数据库中的目录和模式有什么区别?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/7022755/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-08 08:18:19  来源:igfitidea点击:

What's the difference between a catalog and a schema in a relational database?

databaseschemacatalog

提问by Stephan

I used to think schema were the "upper wrapper" object before the database itself. I mean DB.schema.<what_ever_object_name_under_schema>.

我曾经认为模式是数据库本身之前的“上层包装”对象。我的意思是DB.schema.<what_ever_object_name_under_schema>

Well, the catalog "wrapper" is now quite confusing. Why should we need a catalog? For what purpose, precisely should the catalog be used?

好吧,目录“包装器”现在很混乱。为什么我们需要目录?究竟应该出于什么目的使用该目录?

采纳答案by Mike Sherrill 'Cat Recall'

From the relational point of view :

从关系的角度来看:

The catalog is the place where--among other things--all of the various schemas (external, conceptual, internal) and all of the corresponding mappings (external/conceptual, conceptual/internal) are kept.

In other words, the catalog contains detailed information (sometimes called descriptor informationor metadata) regarding the various objects that are of interest to the system itself.

For example, the optimizer uses catalog information about indexes and other physical storage structures, as well as much other information, to help it decide how to implement user requests. Likewise, the security subsystem uses catalog information about users and security constraints to grant or deny such requests in the first place.

目录是保存所有各种模式(外部、概念、内部)和所有相应映射(外部/概念、概念/内部)的地方。

换句话说,目录包含有关系统本身感兴趣的各种对象的详细信息(有时称为描述符信息元数据)。

例如,优化器使用有关索引和其他物理存储结构的目录信息以及许多其他信息来帮助它决定如何实现用户请求。同样,安全子系统首先使用有关用户和安全约束的目录信息来授予或拒绝此类请求。

An Introduction to Database Systems, 7th ed., C.J. Date, p 69-70.

数据库系统简介,第 7 版,CJ 日期,第 69-70 页。



从 SQL 标准的角度来看:

Catalogs are named collections of schemas in an SQL-environment. An SQL-environment contains zero or more catalogs. A catalog contains one or more schemas, but always contains a schema named INFORMATION_SCHEMA that contains the views and domains of the Information Schema.

目录是 SQL 环境中模式的命名集合。SQL 环境包含零个或多个目录。目录包含一个或多个模式,但始终包含名为 INFORMATION_SCHEMA 的模式,该模式包含信息模式的视图和域。

Database Language SQL, (Proposed revised text of DIS 9075), p 45

数据库语言 SQL,(DIS 9075 的拟议修订文本),第 45 页



从 SQL 的角度来看:

A catalog is often synonymous with database. In most SQL dbms, if you query the information_schema views, you'll find that values in the "table_catalog" column map to the name of a database.

If you find your platform using catalogin a broader way than any of these three definitions, it might be referring to something broader than a database--a database cluster, a server, or a server cluster. But I kind of doubt that, since you'd have found that easily in your platform's documentation.

目录通常是数据库的同义词。在大多数 SQL dbms 中,如果您查询 information_schema 视图,您会发现“table_catalog”列中的值映射到数据库的名称。

如果您发现您的平台以比这三个定义中的任何一个更广泛的方式使用目录,那么它可能指的是比数据库更广泛的东西——数据库集群、服务器或服务器集群。但我有点怀疑,因为您很容易在平台的文档中找到了这一点。

回答by Basil Bourque

Mike Sherrill 'Cat Recall'gave an excellent answer. I'll add simply one example: Postgres.

Mike Sherrill 'Cat Recall'给出了一个很好的答案。我将简单地添加一个示例:Postgres

Cluster = A Postgres Installation

集群 = Postgres 安装

When you install Postgres on a machine, that installation is called a cluster. ‘Cluster' here is not meant in the hardware senseof multiple computers working together. In Postgres, clusterrefers to the fact that you can have multiple unrelated databases all up and running using the same Postgres server engine.

在机器上安装 Postgres 时,该安装称为cluster。这里的“集群”并不是指多台计算机协同工作的硬件意义。在 Postgres 中,集群是指您可以使用相同的 Postgres 服务器引擎启动并运行多个不相关的数据库。

The word clusteris also defined by the SQLStandardin the same way as in Postgres. Closely following the SQL Standard is a primary goal of the Postgres project.

SQL标准也以与 Postgres 中相同的方式定义了cluster一词。密切遵循 SQL 标准是 Postgres 项目的主要目标。

The SQL-92specification says:

SQL-92规范说:

A cluster is an implementation-defined collection of catalogs.

集群是实现定义的目录集合。

and

Exactly one cluster is associated with an SQL-session

恰好一个集群与一个 SQL 会话相关联

That's an obtuse way of saying a cluster is a database server (each catalog is a database).

将集群说成是数据库服务器(每个目录都是一个数据库)是一种钝化的说法。

Cluster > Catalog > Schema > Table > Columns & Rows

集群 > 目录 > 模式 > 表 > 列和行

So in both Postgres and the SQL Standard we have this containment hierarchy:

因此,在 Postgres 和 SQL 标准中,我们都有这样的包含层次结构:

  • A computer may have one cluster or multiple.
  • A database server is a cluster.
  • A cluster has catalogs. ( Catalog = Database )
  • Catalogs have schemas. (Schema = namespaceof tables, and security boundary)
  • Schemas have tables.
  • Tables have rows.
  • Rows have values, defined by columns.
    Those values are the business data your apps and users care about such as person's name, invoice due date, product price, gamer's high score. The column defines the data typeof the values (text, date, number, and so on).
  • 一台计算机可能有一个或多个集群。
  • 数据库服务器是一个集群
  • 一个集群有目录。(目录 = 数据库)
  • 目录具有模式。(架构 =表的命名空间和安全边界)
  • 模式有
  • 表有
  • 行有,由定义。
    这些值是您的应用和用户关心的业务数据,例如人名、发票到期日、产品价格、游戏玩家的高分。该列定义值的数据类型(文本、日期、数字等)。

Diagram showing nesting boxes representing how connecting on a port gets you to cluster (a database server) which contains one or more Catalogs (a database) each of which contains one or more Schemas (a namespace) each of which contains tables each of which has rows.

Diagram showing nesting boxes representing how connecting on a port gets you to cluster (a database server) which contains one or more Catalogs (a database) each of which contains one or more Schemas (a namespace) each of which contains tables each of which has rows.

Multiple Clusters

多个集群

This diagram represents a single cluster. In the case of Postgres, you can have more than one cluster per host computer (or virtual OS). Multiple clusters is commonly done, for testing and deploying new versions of Postgres (ex: 9.0, 9.1, 9.2, 9.3, 9.4, 9.5).

此图表示单个集群。对于 Postgres,每台主机(或虚拟操作系统)可以有多个集群。多个集群通常用于测试和部署 Postgres 的新版本(例如:9.09.19.29.39.49.5)。

If you did have multiple clusters, imagine the diagram above duplicated.

如果您确实有多个集群,请想象上面的图表是重复的。

Different port numbers allow the multiple clusters to live side-by-side all up and running at the same time. Each cluster would be assigned its own port number. The usual 5432is only the default, and can be set by you. Each cluster is listening on its own assigned port for incoming database connections.

不同的端口号允许多个集群同时运行。每个集群都将被分配自己的端口号。通常5432的只是默认值,可以由您设置。每个集群都在其自己分配的端口上侦听传入的数据库连接。

Example Scenario

示例场景

For example, a company could have two different software development teams. One writes software to manage the warehouses while the other team builds software to manage sales and marketing. Each dev team has their own database, blissfully unaware of the other's.

例如,一家公司可能有两个不同的软件开发团队。一个团队编写软件来管理仓库,而另一个团队构建软件来管理销售和营销。每个开发团队都有自己的数据库,幸福地不知道其他人的。

But the IT operations team took a decision to run both databases on a single computer box (Linux, Mac, whatever). So on that box they installed Postgres. So one database server (database cluster). In that cluster, they create two catalogs, a catalog for each dev team: one named 'warehouse' and one named 'sales'.

但是 IT 运营团队决定在一台计算机(Linux、Mac 等)上运行这两个数据库。所以在那个盒子上他们安装了 Postgres。所以一台数据库服务器(数据库集群)。在该集群中,他们创建了两个目录,每个开发团队的目录:一个名为“仓库”,一个名为“销售”。

Each dev team uses many dozens of tables with different purposes and access roles. So each dev team organizes their tables into schemas. By coincidence, both dev teams do some tracking of accounting data, so each team happens to have a schema named 'accounting'. Using the same schema name is not a problem because the catalogs each have their own namespaceso no collision.

每个开发团队使用数十个具有不同目的和访问角色的表。因此,每个开发团队都将他们的表组织成模式。巧合的是,两个开发团队都对会计数据进行了一些跟踪,因此每个团队碰巧都有一个名为“会计”的模式。使用相同的模式名称不是问题,因为每个目录都有自己的命名空间,因此不会发生冲突。

Furthermore, each team eventually creates a table for accounting purposes named 'ledger'. Again, no naming collision.

此外,每个团队最终都会为会计目的创建一个名为“分类帐”的表。同样,没有命名冲突。

You can think of this example as a hierarchy…

您可以将此示例视为层次结构……

  • Computer (hardware box or virtualized server)
    • Postgres 9.2cluster (installation)
      • warehousecatalog (database)
        • inventoryschema
          • [… some tables]
        • accountingschema
          • ledgertable
          • [… some other tables]
      • salescatalog (database)
        • sellingschema
          • [… some tables]
        • accountingschema (coincidental same name as above)
          • ledgertable (coincidental same name as above)
          • [… some other tables]
    • Postgres 9.3cluster
      • [… other schemas & tables]
  • 计算机(硬件盒或虚拟化服务器)
    • Postgres 9.2集群(安装)
      • warehouse目录(数据库)
        • inventory模式
          • [……一些桌子]
        • accounting模式
          • ledger桌子
          • [……其他一些桌子]
      • sales目录(数据库)
        • selling模式
          • [……一些桌子]
        • accounting模式(与上面的巧合同名)
          • ledger表(与上面巧合的同名)
          • [……其他一些桌子]
    • Postgres 9.3
      • […其他模式和表]

Each dev team's software makes a connection to the cluster. When doing so, they must specify which catalog (database) is theirs. Postgres requires that you connect to one catalog, but you are not limited to that catalog. That initial catalog is merely a default, used when your SQL statements omit the name of a catalog.

每个开发团队的软件都与集群建立连接。这样做时,他们必须指定哪个目录(数据库)是他们的。Postgres 要求您连接到一个目录,但您不限于该目录。该初始目录只是一个默认值,在您的 SQL 语句省略目录名称时使用。

So if the dev team ever needs to access the other team's tables, they may do so ifthe database administrator has given them privilegesto do so. Access is made with explicit naming in the pattern: catalog.schema.table. So if the 'warehouse' team needs to see the other team's ('sales' team) ledger, they write SQL statements with sales.accounting.ledger. To access their own ledger, they merely write accounting.ledger. If they access both ledgers in the same piece of source code, they may choose to avoid confusion by including their own (optional) catalog name, warehouse.accounting.ledgerversus sales.accounting.ledger.

因此,如果开发团队需要访问其他团队的表,如果数据库管理员授予他们这样做的权限,他们可能会这样做。访问是通过模式中的显式命名进行的:catalog.schema.table。因此,如果“仓库”团队需要查看其他团队(“销售”团队)的分类帐,他们会使用sales.accounting.ledger. 要访问自己的分类帐,他们只需编写accounting.ledger. 如果他们在同一段源代码中访问两个分类帐,他们可能会选择通过包含他们自己的(可选)目录名称来避免混淆,warehouse.accounting.ledger而不是sales.accounting.ledger.



By the way…

顺便一提…

You may hear the word schemaused in a more general sense, meaning the entire design of a particular database's table structure. By contrast, in the SQL Standard the word means specifically the particular layer in the Cluster > Catalog > Schema > Tablehierarchy.

您可能会听到更一般意义上使用的“模式”一词,意思是特定数据库表结构的整个设计。相比之下,在 SQL 标准中,这个词特指Cluster > Catalog > Schema > Table层次结构中的特定层。

Postgres uses both the word databaseas well as catalogin various places such as the CREATE DATABASEcommand.

Postgres在诸如CREATE DATABASE命令之类的不同地方使用词数据库目录

Not all database system provides this full hierarchy of Cluster > Catalog > Schema > Table. Some have only a single catalog (database). Some have no schema, just one set of tables. Postgres is an exceptionally powerful product.

并非所有数据库系统都提供这种完整的Cluster > Catalog > Schema > Table. 有些只有一个目录(数据库)。有些没有模式,只有一组表。Postgres 是一个非常强大的产品。