SQL PostgreSQL 中的“ORDER BY ... USING”子句

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/7205878/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 11:54:16  来源:igfitidea点击:

"ORDER BY ... USING" clause in PostgreSQL

sqlpostgresqlsql-order-by

提问by markus

The ORDER BY clause is decribed in the PostgreSQLdocumentation as:

ORDER BY 子句在 PostgreSQL 文档中描述为:

ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...]

Can someone give me some examples how to use the USING operator? Is it possible to get an alternating order of the resultset?

有人可以给我一些如何使用的例子USING operator吗?是否有可能获得结果集的交替顺序?

回答by A.H.

A very simple example would be:

一个非常简单的例子是:

> SELECT * FROM tab ORDER BY col USING <

But this is boring, because this is nothing you can't get with the traditional ORDER BY col ASC.

但这很无聊,因为这是传统ORDER BY col ASC.

Also the standard catalog doesn't mention anything exciting about strange comparison functions/operators. You can get a list of them:

此外,标准目录没有提到任何关于奇怪的比较函数/运算符的令人兴奋的事情。您可以获得它们的列表:

    > SELECT amoplefttype::regtype, amoprighttype::regtype, amopopr::regoper 
      FROM pg_am JOIN pg_amop ON pg_am.oid = pg_amop.amopmethod 
      WHERE amname = 'btree' AND amopstrategy IN (1,5);

You will notice, that there are mostly <and >functions for primitive types like integer, dateetc and some more for arrays and vectors and so on. None of these operators will help you to get a custom ordering.

你会发现,那些有大多<>功能的原始类型,如integerdate等和一些数组和向量等。这些操作员都不会帮助您获得自定义订购。

In mostcases where custom ordering is required you can get away using something like ... ORDER BY somefunc(tablecolumn) ...where somefuncmaps the values appropriately. Because that works with every database this is also the most common way. For simple things you can even write an expression instead of a custom function.

在需要自定义排序的大多数情况下,您可以使用诸如... ORDER BY somefunc(tablecolumn) ...wheresomefunc适当映射值之类的方法。因为这适用于每个数据库,这也是最常见的方式。对于简单的事情,您甚至可以编写表达式而不是自定义函数。

Switching gears up

换档

ORDER BY ... USINGmakes sense in several cases:

ORDER BY ... USING在几种情况下是有意义的:

  • The ordering is so uncommon, that the somefunctrick doesn't work.
  • You work with a non-primitive type (like point, circleor imaginary numbers) and you don't want to repeat yourself in your queries with strange calculations.
  • The dataset you want to sort is so large, that support by an index is desired or even required.
  • 排序是如此罕见,以至于这个somefunc技巧不起作用。
  • 您使用非原始类型(如pointcircle或虚数),并且您不想在查询中使用奇怪的计算重复自己。
  • 您要排序的数据集非常大,需要甚至需要索引的支持。

I will focus on the complex datatypes: often there is more than one way to sort them in a reasonable way. A good example is point: You can "order" them by the distance to (0,0), or by xfirst, then by yor just by yor anything else you want.

我将专注于复杂的数据类型:通常有不止一种方法可以以合理的方式对它们进行排序。一个很好的例子是point:您可以按到 (0,0) 的距离“排序”它们,或者先按x,然后按y或仅按y或您想要的任何其他东西。

Of course, PostgreSQL haspredefined operators for point:

当然,PostgreSQL预定义的操作符point

    > CREATE TABLE p ( p point );
    > SELECT p <-> point(0,0) FROM p;

But noneof them is declared usable for ORDER BYby default (see above):

但默认情况下,它们都没有被声明为可用ORDER BY(见上文):

    > SELECT * FROM p ORDER BY p;
    ERROR:  could not identify an ordering operator for type point
    TIP:  Use an explicit ordering operator or modify the query.

Simple operators for pointare the "below" and "above" operators <^and >^. They compare simply the ypart of the point. But:

的简单运算符point是“下方”和“上方”运算符<^>^。他们只是比较y点的一部分。但:

    >  SELECT * FROM p ORDER BY p USING >^;
    ERROR: operator > is not a valid ordering operator
    TIP: Ordering operators must be "<" or ">" members of __btree__ operator families.

ORDER BY USINGrequires an operator with defined semantics: Obviously it must be a binary operator, it must accept the same type as arguments and it must return boolean. I think it must also be transitive (if a < b and b < c then a < c). There may be more requirements. But all these requirements are also necessary for proper btree-index ordering. This explains the strange error messages containing the reference to btree.

ORDER BY USING需要一个具有定义语义的运算符:显然它必须是一个二元运算符,它必须接受与参数相同的类型,并且它必须返回布尔值。我认为它也必须是可传递的(如果 a < b 和 b < c 那么 a < c)。可能还有更多要求。但是所有这些要求对于正确的btree-index 排序也是必要的。这解释了包含对btree的引用的奇怪错误消息。

ORDER BY USINGalso requires not just one operatorto be defined but an operator classand an operator family. While one couldimplement sorting with only one operator, PostgreSQL tries to sort efficiently and minimize comparisons. Therefore, several operators are used even when you specify only one - the others must adhere to certain mathematical constraints - I've already mentioned transitivity, but there are more.

ORDER BY USING不仅需要定义一个运算符,还需要定义一个运算符类和一个运算符族。虽然可以只用一个运算符实现排序,但 PostgreSQL 试图有效地排序并尽量减少比较。因此,即使您只指定一个运算符,也会使用多个运算符——其他运算符必须遵守某些数学约束——我已经提到了传递性,但还有更多。

Switching Gears up

换档

Let's define something suitable: An operator for points which compares only the ypart.

让我们定义一些合适的东西:一个只比较y零件的点运算符。

The first step is to create a custom operator family which can be used by the btreeindex access method. see

第一步是创建一个可以被btree索引访问方法使用的自定义运算符族。

    > CREATE OPERATOR FAMILY xyzfam USING btree;   -- superuser access required!
    CREATE OPERATOR FAMILY

Next we must provide a comparator function which returns -1, 0, +1 when comparing two points. This function WILLbe called internally!

接下来,我们必须提供一个比较器函数,它在比较两点时返回 -1、0、+1。此函数在内部调用!

    > CREATE FUNCTION xyz_v_cmp(p1 point, p2 point) RETURNS int 
      AS $$BEGIN RETURN btfloat8cmp(p1[1],p2[1]); END $$ LANGUAGE plpgsql;
    CREATE FUNCTION

Next we define the operator class for the family. See the manualfor an explanation of the numbers.

接下来,我们为系列定义操作符类。有关数字的说明,请参阅手册

    > CREATE OPERATOR CLASS xyz_ops FOR TYPE point USING btree FAMILY xyzfam AS 
        OPERATOR 1 <^ ,
        OPERATOR 3 ?- ,
        OPERATOR 5 >^ ,
        FUNCTION 1 xyz_v_cmp(point, point) ;
    CREATE OPERATOR CLASS

This step combines several operators and functions and also defines their relationship and meaning. For example OPERATOR 1means: This is the operator for less-thantests.

这一步结合了几个运算符和函数,并定义了它们的关系和含义。例如OPERATOR 1意味着:这是less-than测试的运算符。

Now the operators <^and >^can be used in ORDER BY USING:

现在运算符<^>^可用于ORDER BY USING

> INSERT INTO p SELECT point(floor(random()*100), floor(random()*100)) FROM generate_series(1, 5);
INSERT 0 5
> SELECT * FROM p ORDER BY p USING >^;
    p    
---------
 (17,8)
 (74,57)
 (59,65)
 (0,87)
 (58,91)

Voila - sorted by y.

瞧 - 按y排序。

To sum it up:ORDER BY ... USINGis an interesting look under the hood of PostgreSQL. But nothing you will require anytime soon unless you work in veryspecific areas of database technology.

总结一下:ORDER BY ... USING是 PostgreSQL 引擎盖下的一个有趣的外观。但是,除非您在非常特定的数据库技术领域工作,否则您不会很快需要任何东西。

Another example can be found in the Postgres docs.with source code for the example hereand here. This example also shows how to create the operators.

另一个例子可以在 Postgres 文档中找到带有示例的源代码herehere。此示例还展示了如何创建运算符。

回答by J0HN

Samples:

样品:

CREATE TABLE test
(
  id serial NOT NULL,
  "number" integer,
  CONSTRAINT test_pkey PRIMARY KEY (id)
)

insert into test("number") values (1),(2),(3),(0),(-1);

select * from test order by number USING > //gives 3=>2=>1=>0=>-1

select * from test order by number USING < //gives -1=>0=>1=>2=>3

So, it is equivalent to descand asc. But you may use your own operator, that's the essential feature of USING

所以,它等价于descasc。但是你可以使用你自己的运营商,这是它的本质特征USING

回答by Алексей Лещук

Nice answers, but they didn't mentioned one real valuable case for ′USING′.

不错的答案,但他们没有提到“使用”的一个真正有价值的案例。

When you have created an index with non default operators family, for example varchar_pattern_ops( ~>~, ~<~, ~>=~, ... ) instead of <, >, >=then if you search based on index and you want to use index in order by clause you need to specify USINGwith the appropriate operator.

当您使用非默认运算符系列创建索引时,例如 varchar_pattern_ops( ~>~, ~<~, ~>=~, ... ) 而不是<, >>=那么如果您基于索引进行搜索并且您想在 order by 子句中使用索引,您需要USING使用适当的指定操作员。

This can be illustrated with such example:

这可以用这样的例子来说明:

CREATE INDEX index_words_word ON words(word text_pattern_ops); 

Lets compare this two queries:

让我们比较这两个查询:

SELECT * FROM words WHERE word LIKE 'o%' LIMIT 10;

and

SELECT * FROM words WHERE word LIKE 'o%' ORDER BY word LIMIT 10;

The difference between their executions is nearly 100 times in a 500K words DB! And also results may not be correct within non-C locale.

在 500K 字的数据库中,它们的执行差异接近 100 倍!而且结果在非 C 语言环境中也可能不正确。

How this could happend?

这怎么会发生?

When you making search with LIKEand ORDER BYclause, you actually make this call:

当您使用LIKEandORDER BY子句进行搜索时,您实际上是在进行以下调用:

SELECT * FROM words WHERE word ~>=~ 'o' AND word ~<~'p' ORDER BY word USING < LIMIT 10;

Your index created with ~<~operator in mind, so PG cannot use given index in a given ORDER BYclause. To get things done right query must be rewritten to this form:

您创建的索引~<~考虑到了运算符,因此 PG 不能在给定的ORDER BY子句中使用给定的索引。为了把事情做正确的查询必须改写为这种形式:

SELECT * FROM words WHERE word ~>=~ 'o' AND word ~<~'p' ORDER BY word USING ~<~ LIMIT 10;

or

或者

SELECT * FROM words WHERE word LIKE 'o%' ORDER BY word USING ~<~ LIMIT 10;

回答by VoodooChild

Optionally one can add the key word ASC (ascending) or DESC (descending) after any expression in the ORDER BY clause. If not specified, ASC is assumed by default. Alternatively, a specific ordering operator name can be specified in the USING clause. An ordering operator must be a less-than or greater-than member of some B-tree operator family. ASC is usually equivalent to USING < and DESC is usually equivalent to USING >.

可以选择在 ORDER BY 子句中的任何表达式后添加关键字 ASC(升序)或 DESC(降序)。如果未指定,默认情况下采用 ASC。或者,可以在 USING 子句中指定特定的排序运算符名称。排序运算符必须是某个 B 树运算符族的小于或大于成员。ASC 通常等同于 USING <,DESC 通常等同于 USING >。

PostgreSQL 9.0

PostgreSQL 9.0

It may look something like this I think (I don't have postgres to verify this right now, but will verify later)

我认为它可能看起来像这样(我现在没有 postgres 来验证这一点,但稍后会验证)

SELECT Name FROM Person
ORDER BY NameId USING >