Notice: this page is a Chinese-English translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use/share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow. Original URL: http://stackoverflow.com/questions/2835634/

Time: 2020-09-18 20:38:33  Source: igfitidea

Sort chars in varchar2 alphabetically

oracle

Asked by Timour

I'm looking for a function that would sort chars in varchar2 alphabetically.

Is there something built into Oracle that I can use, or do I need to write a custom function in PL/SQL?

Accepted answer by Alex Poole

From an answer at http://forums.oracle.com/forums/thread.jspa?messageID=1791550 this might work, but I don't have 10g to test on...

SELECT MIN(permutations)
FROM (SELECT REPLACE (SYS_CONNECT_BY_PATH (n, ','), ',') permutations
    FROM (SELECT LEVEL l, SUBSTR ('&col', LEVEL, 1) n
        FROM DUAL
        CONNECT BY LEVEL <= LENGTH ('&col')) yourtable
    CONNECT BY NOCYCLE l != PRIOR l)
WHERE LENGTH (permutations) = LENGTH ('&col')

In the example, col is a SQL*Plus substitution variable (&col), but if you turn this into a function you can pass the value in as a parameter, or I suppose you could rework it to take a table column directly.

I'd take that as a starting point rather than a solution: the original question was about anagrams, so the query is designed to find all permutations, and something similar but simpler might be possible. I suspect it doesn't scale well to long strings.

Answer by Timour

So eventually I went the PL/SQL route, because after searching for some time I realized that there is no built-in function I can use.

Here is what I came up with. It's based on a feature of associative arrays: Oracle keeps the keys in sorted order.

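create or replace function sort_chars(p_string in varchar2) return varchar2 deterministic
as
    rv  varchar2(4000);
    ch  varchar2(1);  -- assumes each character fits in varchar2(1)
    -- one bucket per distinct character, holding all of its occurrences
    type vcArray is table of varchar2(4000) index by varchar2(1);
    sorted vcArray;

    key varchar2(1);

begin
    -- distribute the characters into buckets;
    -- nvl keeps a NULL input from raising an error on the loop bound
    for i in 1 .. nvl(length(p_string), 0)
    loop
        ch := substr(p_string, i, 1);

        if sorted.exists(ch)
        then
            sorted(ch) := sorted(ch) || ch;
        else
            sorted(ch) := ch;
        end if;
    end loop;

    -- string-indexed associative arrays are traversed in sorted key order
    rv := '';
    key := sorted.first;
    while key is not null loop
        rv := rv || sorted(key);
        key := sorted.next(key);
    end loop;

    return rv;
end;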
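
As a quick check, the function can be called directly from SQL. This is only a sketch: the sample literals are made up, and the exact ordering you see depends on the NLS settings, since those govern how the string keys of an associative array are ordered.

```sql
-- Hypothetical sample inputs; sort_chars is the function defined above.
select sort_chars('hello')    as sorted_word,
       sort_chars('varchar2') as sorted_mixed
  from dual;
```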

Simple performance test:

set timing on;

create table test_sort_fn as 
select t1.object_name || rownum as test from user_objects t1, user_objects t2;

select count(distinct test) from  test_sort_fn;

select count (*)  from (select sort_chars(test)  from test_sort_fn);


Table created.
Elapsed: 00:00:01.32

COUNT(DISTINCTTEST)
-------------------
             384400
1 row selected.
Elapsed: 00:00:00.57

  COUNT(*)
----------
    384400
1 row selected.
Elapsed: 00:00:00.06
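
Note that count(*) over the inline view never uses the value of sort_chars(test), so the optimizer may be free to skip the function call altogether, which could explain the 0.06 s timing. A variant that actually needs the function result for every row might look like the sketch below. It assumes the test_sort_fn table and the sort_chars function from above, and no timings are claimed for it.

```sql
set timing on;

-- Aggregating over the function result means the result is needed for each row.
select count(distinct sort_chars(test)) from test_sort_fn;
```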

Answer by mewa

You could use the following query:

select listagg(letter) 
    within group (order by UPPER(letter), ASCII(letter) DESC) 
from
(
select regexp_substr('gfedcbaGFEDCBA', '.', level) as letter from dual
connect by regexp_substr('gfedcbaGFEDCBA', '.', level) is not null
);

The subquery splits the string into records (single character each) using regexp_substr, and the outer query merges the records into one string using listagg, after sorting them.
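
The same split-and-aggregate idea can probably be applied per row to a table column by using a correlated scalar subquery. A minimal sketch, assuming Oracle 11gR2 or later (for listagg); the samples CTE and its str column are hypothetical stand-ins for a real table, and substr is used here in place of regexp_substr:

```sql
-- Hypothetical sample data standing in for a real table.
with samples (str) as (
    select 'CAB'   from dual union all
    select 'hello' from dual
)
select s.str,
       -- split s.str into one row per character, then glue them back in sorted order
       (select listagg(substr(s.str, level, 1))
                   within group (order by substr(s.str, level, 1))
          from dual
       connect by level <= length(s.str)) as sorted_str
  from samples s;
```

The plain order by here follows the session's sort settings, so the caveats about "alphabetical" order apply to it as well.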

Be careful here, because alphabetical sorting depends on your database configuration, as Cine pointed out.

In the above example the letters are sorted ascending "alphabetically" (case-insensitively, via UPPER) and then descending by ASCII code, which in my case results in "aAbBcCdDeEfFgG". Your result may differ.

You may also sort the letters using nlssort; it gives you better control of the sorting order, because the result becomes independent of your database configuration.

select listagg(letter) 
    within group (order by nlssort(letter, 'nls_sort=german')) 
from
(
select regexp_substr('gfedcbaGFEDCBA', '.', level) as letter from dual
connect by regexp_substr('gfedcbaGFEDCBA', '.', level) is not null
);

The query above would also give you "aAbBcCdDeEfFgG", but if you changed "german" to "spanish", you would get "AaBbCcDdEeFfGg" instead.

Answer by Cine

You should remember that there is no common agreement on what "alphabetically" means. It all depends on which country it is, who is looking at your data, and what context it is in.

For instance, in DK there are a large number of different sortings of a,aa,b,c,æ,ø,å:

  • per the alphabet: a,aa,b,c,æ,ø,å
  • for some dictionary: a,aa,å,b,c,æ,ø
  • for other dictionaries: a,b,c,æ,ø,aa,å
  • per Microsoft standard: a,b,c,æ,ø,aa,å

Check out http://www.siao2.com/2006/04/27/584439.aspx for more info; it also happens to be a great blog for issues like these.

Answer by Jeffrey Kemp

Assuming you don't mind having the characters returned 1 per row:

select substr(str, r, 1) X from (
select 'CAB' str,
       rownum r
from dual connect by level <= 4000
) where r <= length(str) order by X;

X
=
A
B
C

Answer by AJPerez

For people using Oracle 10g, select listagg within group won't work. The accepted answer does work, but it generates every possible permutation of the input string, which results in terrible performance: my Oracle database struggles with an input string only 10 characters long.

Here is another alternative that works on Oracle 10g. It's similar to Jeffrey Kemp's answer, except that the result isn't split into rows:

select replace(wm_concat(ch), ',', '') from (
    select substr('CAB', level, 1) ch from dual
    connect by level <= length('CAB')
    order by ch
);
-- output: 'ABC'

wm_concat simply concatenates records from different rows into a single string, using commas as the separator (that's why we also apply replace to the result).

Please note that if your input string has commas, they will be lost. Also, wm_concat is an undocumented feature, and according to this answer it has been removed in Oracle 12c. Use it only if you're stuck with 10g and don't have a better option (such as listagg, if you can use 11g instead).
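
If wm_concat is unavailable, or commas in the input must survive, one commonly used alternative on 10g is xmlagg, which is documented and accepts its own order by. This swaps in a different technique from the query above and is only a sketch: XML-special characters such as & and < will likely come back as entities and need extra handling, and the element name c is an arbitrary choice.

```sql
-- Sketch: wrap each character in an XML element, aggregate in sorted order,
-- then drop the tags by extracting only the text nodes.
select xmlagg(xmlelement(c, ch) order by ch).extract('//text()').getstringval() as sorted_str
  from (select substr('C,A,B', level, 1) ch
          from dual
       connect by level <= length('C,A,B'));
```

Because no separator is inserted, there is nothing to strip afterwards, so commas from the input stay in the result.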
