Distinct on PostgreSQL JSON data column

Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original StackOverflow URL: http://stackoverflow.com/questions/23509740/

Date: 2020-09-02 23:37:05  Source: igfitidea

ruby-on-rails ruby-on-rails-3 postgresql postgresql-json

Asked by Mohamed El Mahallawy

Trying to do distinct on a model with Rails.

2.1.1 :450 > u.profiles.select("profiles.*").distinct


Profile Load (0.9ms)  SELECT DISTINCT profiles.* FROM "profiles" INNER JOIN "integration_profiles" ON "profiles"."id" = "integration_profiles"."profile_id" INNER JOIN "integrations" ON "integration_profiles"."integration_id" = "integrations"."id" WHERE "integrations"."user_id" =   [["user_id", 2]]
PG::UndefinedFunction: ERROR:  could not identify an equality operator for type json
LINE 1: SELECT DISTINCT profiles.* FROM "profiles" INNER JOIN "integ...
                        ^
: SELECT DISTINCT profiles.* FROM "profiles" INNER JOIN "integration_profiles" ON "profiles"."id" = "integration_profiles"."profile_id" INNER JOIN "integrations" ON "integration_profiles"."integration_id" = "integrations"."id" WHERE "integrations"."user_id" = 
ActiveRecord::StatementInvalid: PG::UndefinedFunction: ERROR:  could not identify an equality operator for type json
LINE 1: SELECT DISTINCT profiles.* FROM "profiles" INNER JOIN "integ...
                        ^
: SELECT DISTINCT profiles.* FROM "profiles" INNER JOIN "integration_profiles" ON "profiles"."id" = "integration_profiles"."profile_id" INNER JOIN "integrations" ON "integration_profiles"."integration_id" = "integrations"."id" WHERE "integrations"."user_id" = 
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/rack-mini-profiler-0.9.1/lib/patches/sql_patches.rb:109:in `prepare'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/rack-mini-profiler-0.9.1/lib/patches/sql_patches.rb:109:in `prepare'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/postgresql_adapter.rb:834:in `prepare_statement'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/postgresql_adapter.rb:795:in `exec_cache'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/postgresql/database_statements.rb:139:in `block in exec_query'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/abstract_adapter.rb:442:in `block in log'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activesupport-4.0.4/lib/active_support/notifications/instrumenter.rb:20:in `instrument'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/abstract_adapter.rb:437:in `log'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/postgresql/database_statements.rb:137:in `exec_query'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/postgresql_adapter.rb:908:in `select'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/abstract/database_statements.rb:32:in `select_all'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/connection_adapters/abstract/query_cache.rb:63:in `select_all'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/querying.rb:36:in `find_by_sql'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/relation.rb:585:in `exec_queries'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/association_relation.rb:15:in `exec_queries'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/relation.rb:471:in `load'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/relation.rb:220:in `to_a'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/activerecord-4.0.4/lib/active_record/relation.rb:573:in `inspect'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/railties-4.0.4/lib/rails/commands/console.rb:90:in `start'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/railties-4.0.4/lib/rails/commands/console.rb:9:in `start'
    from /Users/mmahalwy/.rvm/gems/ruby-2.1.1/gems/railties-4.0.4/lib/rails/commands.rb:62:in `<top (required)>'
    from bin/rails:4:in `require'
    from bin/rails:4:in `<main>'
2.1.1 :451 > 

I get the error PG::UndefinedFunction: ERROR: could not identify an equality operator for type json

Converting to hstore is not an option for me in this case. Are there any workarounds?

Answered by pozs

The reason is that in PostgreSQL (up to 9.3) there is no equality operator defined for json (i.e. val1::json = val2::json will always throw this exception); in 9.4 there will be one for the jsonb type.

One workaround is to cast your json field to text. But that won't cover all cases of JSON equality: for example, {"a":1,"b":2} should be equal to {"b":2,"a":1}, but the two won't be equal when cast to text.
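To make the limitation of text comparison concrete, here is a minimal Ruby sketch. It runs outside the database, so it only mirrors what a cast to text would compare; the helper names same_as_text? and same_as_json? are made up for this illustration.

```ruby
require 'json'

# Compares the raw JSON text, which is roughly what comparing json::text values does.
def same_as_text?(left, right)
  left == right
end

# Compares the parsed values; Ruby Hash equality ignores key order.
def same_as_json?(left, right)
  JSON.parse(left) == JSON.parse(right)
end

a = '{"a":1,"b":2}'
b = '{"b":2,"a":1}'

puts "equal as text: #{same_as_text?(a, b)}"
puts "equal as JSON: #{same_as_json?(a, b)}"
```

The two documents differ as text but are equal once parsed, which is why a text cast can leave semantically duplicate rows in a DISTINCT result.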

Another workaround (if the table has a primary key, which it should) is to use the DISTINCT ON (<expressions>) form:

u.profiles.select("DISTINCT ON (profiles.id) profiles.*")

Note: One known caveat for DISTINCT ON:

The DISTINCT ON expression(s) must match the leftmost ORDER BY expression(s). The ORDER BY clause will normally contain additional expression(s) that determine the desired precedence of rows within each DISTINCT ON group.
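One way to respect that caveat from Rails is to pair the DISTINCT ON select with an ORDER BY that starts with the same expression. The following is a sketch, not a definitive implementation: distinct_profiles is a hypothetical helper, and relation is assumed to be an ActiveRecord relation over profiles, such as u.profiles from the question.

```ruby
# Hypothetical helper: `relation` is assumed to be an ActiveRecord relation
# over the profiles table (for example u.profiles from the question).
def distinct_profiles(relation)
  relation
    .select("DISTINCT ON (profiles.id) profiles.*")
    # The leftmost ORDER BY expression matches the DISTINCT ON expression.
    .order("profiles.id")
end
```

It could be called as distinct_profiles(u.profiles). Because profiles.id is the primary key, every row in a DISTINCT ON group is the same profile, so no further ORDER BY expression is needed to pick between them.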

Answered by poshest

Sorry I'm late on this answer, but it might help others.

As I understand your query, you're only getting possible duplicates on profiles because of the many-to-many join to integrations (which you're using to determine which profiles to access).

Because of that, you can use a GROUP BY feature that is new as of 9.1:

When GROUP BY is present, it is not valid for the SELECT list expressions to refer to ungrouped columns except within aggregate functions or if the ungrouped column is functionally dependent on the grouped columns, since there would otherwise be more than one possible value to return for an ungrouped column. A functional dependency exists if the grouped columns (or a subset thereof) are the primary key of the table containing the ungrouped column.

So in your case, you could get Ruby to create the query (sorry, I don't know the Ruby syntax you're using)...

SELECT profiles.* 
FROM "profiles" 
  INNER JOIN "integration_profiles" ON "profiles"."id" = "integration_profiles"."profile_id" 
  INNER JOIN "integrations" ON "integration_profiles"."integration_id" = "integrations"."id" 
WHERE "integrations"."user_id" = $1
GROUP BY "profiles"."id"

I only removed the DISTINCT from your SELECT clause and added the GROUP BY.
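On the Rails side, the same change could probably be expressed with ActiveRecord's group method. This is only a sketch under assumptions: profiles_grouped_by_id is a hypothetical helper name, and relation is assumed to be an ActiveRecord relation over profiles (for example u.profiles from the question).

```ruby
# Hypothetical helper: `relation` is assumed to be an ActiveRecord relation
# over the profiles table (for example u.profiles from the question).
def profiles_grouped_by_id(relation)
  # No DISTINCT here: GROUP BY on the primary key removes the join
  # duplicates without comparing the json column.
  relation.select("profiles.*").group("profiles.id")
end
```

Calling profiles_grouped_by_id(u.profiles) should produce a query that selects profiles.* and ends in GROUP BY profiles.id, which needs PostgreSQL 9.1 or later for the functional-dependency rule to apply.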

By referring ONLY to the id in the GROUP BY, you take advantage of that new feature, because all the remaining profiles columns are "functionally dependent" on that id primary key.

Somehow, wonderfully, that avoids the need for Postgres to do equality checks on the dependent columns (i.e. your json column in this case).

The DISTINCT ON solution is also great, and clearly sufficient in your case, but you can't use aggregate functions like array_agg with it. You CAN with this GROUP BY approach. Happy days! :)
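As an illustration of combining an aggregate with the GROUP BY approach, the sketch below collects the ids of the joined integrations for each profile. The helper name profiles_with_integration_ids and the integration_ids alias are made up for this example, and relation is assumed to be an ActiveRecord relation that already joins integrations, as u.profiles does in the question.

```ruby
# Hypothetical helper: `relation` is assumed to be an ActiveRecord relation
# over profiles that already joins the integrations table.
def profiles_with_integration_ids(relation)
  relation
    # array_agg gathers the joined integration ids for each profile;
    # the integration_ids alias is an assumption made for this sketch.
    .select("profiles.*, array_agg(integrations.id) AS integration_ids")
    .group("profiles.id")
end
```

Each returned profile would then carry an extra integration_ids attribute alongside its own columns.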

Answered by emesika

If you use PG 9.4, using JSONB rather than JSON solves this problem. Example:

-- JSON datatype test 

create table t1 (id int, val json);
insert into t1 (id,val) values (1,'{"name":"value"}');
insert into t1 (id,val) values (1,'{"name":"value"}');
insert into t1 (id,val) values (2,'{"key":"value"}');
select * from t1 order by id;
select distinct * from t1 order by id;

-- JSONB datatype test 

create table t2 (id int, val jsonb);
insert into t2 (id,val) values (1,'{"name":"value"}');
insert into t2 (id,val) values (1,'{"name":"value"}');
insert into t2 (id,val) values (2,'{"key":"value"}');

select * from t2 order by id;

select distinct * from t2 order by id;

Result of running the above script:

CREATE TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
1 | {"name":"value"}
1 | {"name":"value"}
2 | {"key":"value"}

ERROR:  could not identify an equality operator for type json
LINE 1: select distinct * from t1 order by id;
                    ^
CREATE TABLE
INSERT 0 1
INSERT 0 1
INSERT 0 1
1 | {"name": "value"}
1 | {"name": "value"}
2 | {"key": "value"}

1 | {"name": "value"}
2 | {"key": "value"}

As you can see, PG succeeds in applying DISTINCT on a JSONB column, while it fails on a JSON column!

Also try the following to see that the keys in the JSONB are actually sorted:

insert into t2 values (3, '{"a":"1", "b":"2"}');
insert into t2 values (3, '{"b":"2", "a":"1"}');
select * from t2;

1 | {"name": "value"}
1 | {"name": "value"}
2 | {"key": "value"}
3 | {"a": "1", "b": "2"}
3 | {"a": "1", "b": "2"}

Note that '{"b":"2", "a":"1"}' was inserted as '{"a":"1", "b":"2"}', therefore PG identifies the two as the same record:

select distinct * from t2;
3 | {"a": "1", "b": "2"}
2 | {"key": "value"}
1 | {"name": "value"}
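The effect of that normalization can be imitated in plain Ruby. The sketch below is an analogy only and makes no claim about how PostgreSQL stores jsonb internally: canonicalize and canonical_json are hypothetical helpers that sort object keys before comparing.

```ruby
require 'json'

# Recursively sort object keys so that key order no longer matters.
def canonicalize(value)
  case value
  when Hash
    value.sort_by { |key, _| key }.map { |key, child| [key, canonicalize(child)] }.to_h
  when Array
    value.map { |child| canonicalize(child) }
  else
    value
  end
end

# Re-serialize a JSON document with its keys in sorted order.
def canonical_json(text)
  JSON.generate(canonicalize(JSON.parse(text)))
end

rows = [
  [3, '{"a":"1", "b":"2"}'],
  [3, '{"b":"2", "a":"1"}']
]

# After normalization both rows are identical, so uniq keeps only one.
distinct_rows = rows.map { |id, val| [id, canonical_json(val)] }.uniq
puts distinct_rows.length
```

Once both values are reduced to one canonical form, a plain equality check is enough to treat them as duplicates.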

Answered by bright

Yeah, unfortunately Postgres json doesn't implement equality, but jsonb does. So migrate your json columns to jsonb and it should work okay.
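A possible starting point for such a migration is sketched below. It only builds the SQL statement as a string; json_to_jsonb_sql is a hypothetical helper, and the table and column names (profiles, data) are assumptions, since the question does not name the json column. jsonb requires PostgreSQL 9.4 or later.

```ruby
# Hypothetical helper that builds an ALTER TABLE statement converting a
# json column to jsonb. The names are interpolated unquoted, so pass only
# trusted identifiers.
def json_to_jsonb_sql(table, column)
  "ALTER TABLE #{table} ALTER COLUMN #{column} TYPE jsonb USING #{column}::jsonb"
end

# "profiles" and "data" are assumed names used only for illustration.
puts json_to_jsonb_sql("profiles", "data")
```

The resulting statement could then be run from a migration or from psql, ideally after testing it on a copy of the data.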