PostgreSQL Index on JSON

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original StackOverflow question: http://stackoverflow.com/questions/36075918/

Time: 2020-09-03 18:19:25. Source: igfitidea

Tags: json, postgresql

Asked by lnhubbell

Using Postgres 9.4, I want to create an index on a json column that will be used when searching on specific keys within the column.

For example, I have a 'farm' table with a json column 'animal'.

The animal column holds json objects of the general format:

'{"cow": 2, "chicken": 11, "horse": 3}'

I have tried a number of indexes (separately):

(1) create INDEX animal_index ON farm ((animal ->> 'cow'));
(2) create INDEX animal_index ON farm using gin ((animal ->> 'cow'));
(3) create INDEX animal_index ON farm using gist ((animal ->> 'cow'));

I want to run queries like:

SELECT * FROM farm WHERE (animal ->> 'cow') > 3;

and have that query use the index.

When I run this query:

SELECT * FROM farm WHERE (animal ->> 'cow') is null;

then index (1) works, but I can't get any of the indexes to work for the inequality.

Is such an index possible?

The farm table contains only ~5000 farms, but some of them contain hundreds of animals, and the queries simply take too long for my use case. An index like this is the only method I can think of for speeding this query up, but perhaps there is another option.

Answered by Erwin Brandstetter

Your other two indexes won't work simply because the ->> operator returns text, while you obviously have the jsonb gin operator classes in mind. Note that you only mention json, but you actually need jsonb for advanced indexing capabilities.

To work out the best indexing strategy, you'd have to define more closely which queries to cover. Are you only interested in cows? Or all animals / all tags? Which operators are possible? Does your JSON document also include non-animal keys? What to do with those? Do you want to include rows in the index where cows (or whatever) don't show up in the JSON document at all?

Assuming:

  • We are only interested in cows at the first level of nesting.
  • The value is always a valid integer.
  • We are not interested in rows without cows.

I suggest a functional btree index, much like you already have, but cast the value to integer. I don't suppose you'd want the comparison evaluated as text (where '2' is greater than '1111').
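
The difference between the two comparisons can be seen with a small query. This is only an illustrative sketch; the column aliases are made up for the example.

```sql
-- Text is compared character by character, so '2' sorts after '1111'.
SELECT '2'::text > '1111'::text AS text_comparison,  -- true
       2 > 1111                 AS int_comparison;   -- false
```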

CREATE INDEX animal_index ON farm (((animal ->> 'cow')::int));  -- !

The extra set of parentheses is required with the cast shorthand (::) to make the syntax of the index expression unambiguous.
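
As a sketch of the alternative, the standard CAST syntax is parsed like a function call, so it should not need the extra pair of parentheses. The index name animal_cast_index is hypothetical, and only one of the two indexes would be created in practice since they cover the same expression.

```sql
-- Cast shorthand: the whole expression needs its own pair of parentheses.
CREATE INDEX animal_index ON farm (((animal ->> 'cow')::int));

-- Standard CAST syntax: has the form of a function call, no extra parentheses.
CREATE INDEX animal_cast_index ON farm (CAST(animal ->> 'cow' AS int));
```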

Use the same expression in your queries to make Postgres realize the index is applicable:

SELECT * FROM farm WHERE (animal ->> 'cow')::int > 3;

If you need a more generic jsonb index, consider:
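
One possible reading of "a more generic jsonb index" is a GIN index over the whole document. The sketch below assumes the column can be converted from json to jsonb, and the index name farm_animal_gin_idx is made up for illustration. Note that GIN supports key-existence and containment operators, not range comparisons such as > 3, so it does not replace the btree expression index.

```sql
-- Assumption: the column is converted to jsonb first (this rewrites the table).
ALTER TABLE farm ALTER COLUMN animal TYPE jsonb USING animal::jsonb;

-- Default jsonb_ops operator class: supports ?, ?|, ?& and @>.
CREATE INDEX farm_animal_gin_idx ON farm USING gin (animal);

-- Farms that have a "cow" key at the top level.
SELECT * FROM farm WHERE animal ? 'cow';

-- Farms whose document contains this key/value pair.
SELECT * FROM farm WHERE animal @> '{"cow": 2}';
```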

For a known, static, trivial number of animals (like you commented), I suggest partial indexes like:

-- Index names must be unique within a schema, so each animal gets its own name.
CREATE INDEX animal_cow_index ON farm (((animal ->> 'cow')::int))
WHERE (animal ->> 'cow') IS NOT NULL;

CREATE INDEX animal_chicken_index ON farm (((animal ->> 'chicken')::int))
WHERE (animal ->> 'chicken') IS NOT NULL;

Etc.

You may have to add the index condition to the query:

SELECT * FROM farm
WHERE (animal ->> 'cow')::int > 3
AND   (animal ->> 'cow') IS NOT NULL; 

This may seem redundant, but it may be necessary. Test with ANALYZE!
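
To check whether the planner actually picks an index, one option is to refresh the statistics and inspect the plan. This is a sketch only; on a table of a few thousand rows the planner may still prefer a sequential scan.

```sql
-- Refresh planner statistics, including those for indexed expressions.
ANALYZE farm;

-- Show the chosen plan together with actual run times.
EXPLAIN ANALYZE
SELECT * FROM farm
WHERE (animal ->> 'cow')::int > 3
AND   (animal ->> 'cow') IS NOT NULL;
```

An "Index Scan" or "Bitmap Index Scan" node in the output indicates that an index is used; a "Seq Scan" node means it is not.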