PostgreSQL 引入的 JSONB 说明

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/22654170/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-03 17:13:50  来源:igfitidea点击:

Explanation of JSONB introduced by PostgreSQL

jsonpostgresqlnosqlpostgresql-jsonjsonb

提问by Peeyush

PostgreSQL just introduced JSONBand it's already trending on hacker news. It would be great if someone could explain how it's different from Hstore and JSON previously present in PostgreSQL. What are its advantages and limitations and when should someone consider using it?

PostgreSQL 刚刚引入了JSONB,它已经成为黑客新闻的趋势。如果有人能解释它与 PostgreSQL 中先前存在的 Hstore 和 JSON 有何不同,那就太好了。它的优点和局限性是什么,什么时候应该考虑使用它?

回答by pozs

First, hstoreis a contrib module, which only allows you to store key => value pairs, where keys and values can only be texts (however values can be sql NULLs too).

首先,hstore是一个 contrib 模块,它只允许您存储键 => 值对,其中键和值只能是texts(但值也可以是 sql NULLs)。

Both json& jsonballows you to store a valid JSON value(defined in its spec).

这两个jsonjsonb允许你存储一个有效的JSON(在其定义的规范)。

F.ex. these are valid JSON representations: null, true, [1,false,"string",{"foo":"bar"}], {"foo":"bar","baz":[null]}- hstoreis just a little subset compared to what JSON is capable (but if you only need this subset, it's fine).

例如 这些是有效的 JSON 表示:null, true, [1,false,"string",{"foo":"bar"}], {"foo":"bar","baz":[null]}-hstore与 JSON 的功能相比只是一个小子集(但如果您只需要这个子集,那很好)。

The only difference between json& jsonbis their storage:

json&之间的唯一区别jsonb是它们的存储:

  • jsonis stored in its plain text format, while
  • jsonbis stored in some binary representation
  • json以纯文本格式存储,而
  • jsonb以某种二进制表示形式存储

There are 3 major consequences of this:

这样做有3个主要后果:

  • jsonbusually takes more disk space to store than json(sometimes not)
  • jsonbtakes more time to build from its input representation than json
  • jsonoperations take significantlymore time than jsonb(& parsing also needs to be done each time you do some operation at a jsontyped value)
  • jsonb通常比json(有时不是)需要更多的磁盘空间来存储
  • jsonb从输入表示中构建所需的时间比 json
  • json操作花费的时间明显多于jsonb(每次对json类型值执行某些操作时还需要进行解析)

When jsonbwill be available with a stable release, there will be two major use cases, when you can easily select between them:

jsonb稳定版本可用时,将有两个主要用例,您可以轻松地在它们之间进行选择:

  1. If you only work with the JSON representation in your application, PostgreSQL is only used to store & retrieve this representation, you should use json.
  2. If you do a lot of operations on the JSON value in PostgreSQL, or use indexing on some JSON field, you should use jsonb.
  1. 如果您只在应用程序中使用 JSON 表示,那么 PostgreSQL 仅用于存储和检索此表示,您应该使用json.
  2. 如果您对 PostgreSQL 中的 JSON 值进行大量操作,或者对某些 JSON 字段使用索引,则应该使用jsonb.

回答by FuzzyChef

Peeyush:

佩尤什:

The short answer is:

简短的回答是:

  • If you are doing a lot of JSON manipulation insidePostgreSQL, such as sorting, slicing, splicing, etc., you should use JSONB for speed reasons.
  • If you need indexed lookups for arbitrary key searches on JSON, then you should use JSONB.
  • If you are doing neither of the above, you should probably use JSON.
  • If you need to preserve key ordering, whitespace, and duplicate keys, you should use JSON.
  • 如果你PostgreSQL内部做很多 JSON 操作,比如排序、切片、拼接等,你应该使用 JSONB 来提高速度。
  • 如果您需要对 JSON 上的任意键搜索进行索引查找,那么您应该使用 JSONB。
  • 如果您没有执行上述任何一项操作,则您可能应该使用 JSON。
  • 如果您需要保留键排序、空格和重复键,则应使用 JSON。

For a longer answer, you'll need to wait for me to do a full "HowTo" writeup closer to the 9.4 release.

要获得更长的答案,您需要等待我在接近 9.4 版本时完成完整的“操作方法”文章。

回答by ChelowekKot

A simple explanation of the difference between json and jsonb (original image by PostgresProfessional):

json 和 jsonb 区别的简单解释(PostgresProfessional 的原图):

SELECT '{"c":0,   "a":2,"a":1}'::json, '{"c":0,   "a":2,"a":1}'::jsonb;

          json          |        jsonb 
------------------------+--------------------- 
 {"c":0,   "a":2,"a":1} | {"a": 1, "c": 0} 
(1 row)
  • json: textual storage ?as is?
  • jsonb: no whitespaces
  • jsonb: no duplicate keys, last key win
  • jsonb: keys are sorted
  • json:文本存储?原样?
  • jsonb:没有空格
  • jsonb:没有重复的键,最后一个键获胜
  • jsonb:键已排序

More in speech videoand slide show presentationby jsonb developers. Also they introduced JsQuery, pg.extension provides powerful jsonb query language

jsonb 开发人员提供的更多语音视频幻灯片演示。他们还引入了JsQuery, pg.extension 提供了强大的 jsonb 查询语言

回答by Ivan Voras

  • hstoreis more of a "wide column" storage type, it is a flat (non-nested) dictionary of key-value pairs, always stored in a reasonably efficient binary format (a hash table, hence the name).
  • jsonstores JSON documents as text, performing validation when the documents are stored, and parsing them on output if needed (i.e. accessing individual fields); it should support the entire JSON spec. Since the entire JSON text is stored, its formatting is preserved.
  • jsonbtakes shortcuts for performance reasons: JSON data is parsed on input and stored in binary format, key orderings in dictionaries are not maintained, and neither are duplicate keys. Accessing individual elements in the JSONB field is fast as it doesn't require parsing the JSON text all the time. On output, JSON data is reconstructed and initial formatting is lost.
  • hstore更像是“宽列”存储类型,它是键值对的平面(非嵌套)字典,始终以合理高效的二进制格式(哈希表,因此得名)存储。
  • json将 JSON 文档存储为文本,在存储文档时执行验证,并在需要时在输出时解析它们(即访问单个字段);它应该支持整个 JSON 规范。由于存储了整个 JSON 文本,因此保留了其格式。
  • jsonb出于性能原因采取快捷方式:JSON 数据在输入时被解析并以二进制格式存储,字典中的键顺序没有维护,也没有重复键。访问 JSONB 字段中的单个元素很快,因为它不需要一直解析 JSON 文本。输出时,JSON 数据被重建,初始格式丢失。

IMO, there is no significant reason for notusing jsonbonce it is available, if you are working with machine-readable data.

IMO,如果您正在使用机器可读的数据,那么一旦可用,就没有重要的理由使用jsonb

回答by subodhkarwa

JSONB is a "better" version of JSON.

JSONB 是 JSON 的“更好”版本。

Let's look at an example:

让我们看一个例子:

SELECT '{"c":0,   "a":2,"a":1}'::json, '{"c":0,   "a":2,"a":1}'::jsonb;
          json          |        jsonb 
------------------------+--------------------- 
 {"c":0,   "a":2,"a":1} | {"a": 1, "c": 0} 
(1 row)
  1. JSON stores white space, they is why we can see spaces when key "a" is stored, while JSONB does not.
  2. JSON stores all the values of key. This is the reason you can see multiple values (2 and 1) against the key "a" , while JSONB only "stores" the last value.
  3. JSON maintains the order in which elements are inserted, while JSONB maintains the "sorted" order.
  4. JSONB objects are stored as decompressed binary as opposed to "raw data" in JSON , where no reparsing of data is required during retrieval.
  5. JSONB also supports indexing, which can be a significant advantage.
  1. JSON 存储空格,这就是为什么我们可以在存储键“a”时看到空格,而 JSONB 则不能。
  2. JSON 存储键的所有值。这就是您可以针对键 "a" 看到多个值(2 和 1)的原因,而 JSONB 仅“存储”最后一个值。
  3. JSON 维护插入元素的顺序,而 JSONB 维护“排序”顺序。
  4. JSONB 对象存储为解压缩的二进制文件,而不是 JSON 中的“原始数据”,在检索过程中不需要重新解析数据。
  5. JSONB 还支持索引,这是一个显着的优势。

In general, one should prefer JSONB , unless there are specialized needs, such as legacy assumptions about ordering of object keys.

一般来说,人们应该更喜欢 JSONB ,除非有特殊需求,例如关于对象键排序的遗留假设。

回答by John

I was at the pgopen today benchmarks are way faster than mongodb, I believe it was around 500% faster for selects. Pretty much everything was faster at least by at 200% when contrasted with mongodb, than one exception right now is a update which requires completely rewriting the entire json column something mongodb handles better.

我今天在 pgopen 上的基准测试比 mongodb 快得多,我相信它的选择速度大约快 500%。与 mongodb 相比,几乎所有东西都至少快了 200%,而现在的一个例外是更新,它需要完全重写整个 json 列,mongodb 处理得更好。

The gin indexing on on jsonb sounds amazing.

jsonb 上的杜松子酒索引听起来很棒。

Also postgres will persist types of jsonb internally and basically match this with types such as numeric, text, boolean etc.

此外,postgres 将在内部保留 jsonb 的类型,并基本上将其与数字、文本、布尔值等类型相匹配。

Joins will also be possible using jsonb

也可以使用 jsonb 进行连接

Add PLv8 for stored procedures and this will basically be a dream come true for node.js developers.

为存储过程添加 PLv8,这对于 node.js 开发人员来说基本上是梦想成真。

Being it's stored as binary jsonb will also strip all whitespace, change the ordering of properties and remove duplicate properties using the last occurance of the property.

由于它存储为二进制 jsonb 也将删除所有空格,更改属性的顺序并使用属性的最后一次出现删除重复的属性。

Besides the index when querying against a jsonb column contrasted to a json column postgres doesn't have to actually run the functionality to convert the text to json on every row which will likely save a vast amount of time alone.

除了查询与 json 列对比的 jsonb 列时的索引之外,postgres 不必实际运行将文本转换为每一行上的 json 的功能,这可能会单独节省大量时间。

回答by Dinei

Regarding the differences between jsonand jsonbdatatypes, it worth mentioning the official explanation:

关于jsonjsonb数据类型的区别,值得一提的是官方的解释:

PostgreSQL offers two types for storing JSON data: jsonand jsonb. To implement efficient query mechanisms for these data types, PostgreSQL also provides the jsonpath data type described in Section 8.14.6.

The jsonand jsonbdata types accept almost identical sets of values as input. The major practical difference is one of efficiency. The jsondata type stores an exact copy of the input text, which processing functions must reparse on each execution; while jsonbdata is stored in a decomposed binary format that makes it slightly slower to input due to added conversion overhead, but significantly faster to process, since no reparsing is needed. jsonbalso supports indexing, which can be a significant advantage.

Because the jsontype stores an exact copy of the input text, it will preserve semantically-insignificant white space between tokens, as well as the order of keys within JSON objects. Also, if a JSON object within the value contains the same key more than once, all the key/value pairs are kept. (The processing functions consider the last value as the operative one.) By contrast, jsonbdoes not preserve white space, does not preserve the order of object keys, and does not keep duplicate object keys. If duplicate keys are specified in the input, only the last value is kept.

In general, most applications should prefer to store JSON data as jsonb, unless there are quite specialized needs, such as legacy assumptions about ordering of object keys.

PostgreSQL allows only one character set encoding per database. It is therefore not possible for the JSON types to conform rigidly to the JSON specification unless the database encoding is UTF8. Attempts to directly include characters that cannot be represented in the database encoding will fail; conversely, characters that can be represented in the database encoding but not in UTF8 will be allowed.

PostgreSQL 提供两种类型来存储 JSON 数据:jsonjsonb. 为了对这些数据类型实现高效的查询机制,PostgreSQL 还提供了 8.14.6节中描述的 jsonpath 数据类型。

jsonjsonb数据类型接受值作为输入的几乎相同的集。主要的实际差异是效率之一。该 json数据类型存储所输入的文本,该文本处理功能必须在每次执行时重新分析的精确副本; 虽然jsonb数据以分解的二进制格式存储,由于增加了转换开销,这使得输入速度稍慢,但处理速度明显更快,因为不需要重新解析。jsonb还支持索引,这是一个显着的优势。

因为该json类型存储输入文本的精确副本,所以它将保留标记之间语义无关紧要的空格,以及 JSON 对象中键的顺序。此外,如果值中的 JSON 对象多次包含相同的键,则会保留所有键/值对。(处理函数将最后一个值视为操作值。)相比之下,jsonb不保留空白,不保留对象键的顺序,也不保留重复的对象键。如果在输入中指定了重复键,则只保留最后一个值。

一般来说,大多数应用程序应该更喜欢将 JSON 数据存储为 jsonb,除非有非常特殊的需求,例如关于对象键排序的遗留假设。

PostgreSQL 只允许每个数据库使用一种字符集编码。因此,除非数据库编码为 UTF8,否则 JSON 类型不可能严格符合 JSON 规范。尝试直接包含无法在数据库编码中表示的字符将失败;相反,可以用数据库编码表示但不能用 UTF8 表示的字符将被允许。

Source: https://www.postgresql.org/docs/current/datatype-json.html

来源:https: //www.postgresql.org/docs/current/datatype-json.html

回答by vlasiak

Another important difference, that wasn't mentioned in any answer above, is that there is no equality operator for jsontype, but there is one for jsonb.

上面的任何答案中都没有提到的另一个重要区别是,json类型没有相等运算符,但有一个 for jsonb

This means that you can't use DISTINCTkeyword when selecting this json-type and/or other fields from a table (you can use DISTINCT ONinstead, but it's not always possible because of cases like this).

这意味着DISTINCTjson从表中选择此-type 和/或其他字段时不能使用关键字(您可以DISTINCT ON改用,但由于这种情况,并不总是可行)。

回答by erik swedberg

As far as I can tell,

据我所知,

  • hstore as it currently exists (in Postgresql 9.3) does not allow for nesting other objects and arrays as the values of its key/value pairs. however a future hstore patch will allow for nesting. this patch will not be in the 9.4 release and may not be included any time soon.

  • json as it currently exists doesallow for nesting, but is text-based, and does not allow for indexing, thus it is "slow"

  • jsonb that will be released with 9.4 will have the current nesting capabilities of json, as well as the GIN/GIST indexing of hstore, so it will be fast

  • hstore 目前存在(在 Postgresql 9.3 中)不允许嵌套其他对象和数组作为其键/值对的值。然而,未来的 hstore 补丁将允许嵌套。此补丁将不会出现在 9.4 版本中,并且可能不会很快包含在内。

  • 当前存在的 .json确实允许嵌套,但基于文本,并且不允许索引,因此它是“慢的”

  • 9.4 发布的 jsonb 将拥有 json 的当前嵌套能力,以及 hstore 的 GIN/GIST 索引,所以会很快

People working on postgresql 9.4 seem to be saying that the new, fast jsonb type will appeal to people who would have chosen to use a noSQL data store like MongoDB, but can now combine a relational database with query-able unstructured data under one roof

在 postgresql 9.4 上工作的人似乎在说,新的、快速的 jsonb 类型会吸引那些选择使用像 MongoDB 这样的 noSQL 数据存储的人,但现在可以将关系数据库与可查询的非结构化数据结合在一个屋檐下

http://www.databasesoup.com/2014/02/why-hstore2jsonb-is-most-important.html

http://www.databasesoup.com/2014/02/why-hstore2jsonb-is-most-important.html

Benchmarks of postgresql 9.4 jsonb seem to be on par with or in some cases faster than MongoDB

postgresql 9.4 jsonb 的基准测试似乎与 MongoDB 相当或在某些情况下比 MongoDB 更快

http://texture.io/alphabetum/postgresql-incl-hstore-vs-mongodb

http://texture.io/alphabetum/postgresql-incl-hstore-vs-mongodb