postgresql Postgres -- removing a set of letters from a string

Note: this page reproduces a popular Stack Overflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/17691579/

Time: 2020-09-11 00:17:40  Source: igfitidea

Postgres -- removing a set of letters from a string

postgresql

Asked by user2589918

I want to remove the vowels from the email id. Which function should I use? I am trying to find the difference between translate and replace in PostgreSQL, but I couldn't work out the exact difference.

Answered by Bohemian

translate() replaces a set of single characters (passed as a string) with another set of characters (also passed as a string), for example:

translate('abcdef', 'ace', 'XYZ') --> 'XbYdZf'

replace() replaces occurrences of a string of arbitrary length with another string:

replace('abcdef', 'bc', 'FOO') --> 'aFOOdef'
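
As a rough sketch of how the two functions differ for the task in the question (the address below is a made-up placeholder, not taken from the question): a single translate() call can drop every listed character, whereas replace() handles one substring per call and so has to be nested once per vowel.

```sql
-- Hypothetical sample address, used only for illustration.
-- translate(): the characters in 'aeiou' have no counterpart in the empty
-- third argument, so every occurrence of them is deleted.
SELECT translate('john.doe@example.com', 'aeiou', '');
-- jhn.d@xmpl.cm

-- replace(): one substring per call, so five nested calls are needed.
SELECT replace(replace(replace(replace(replace(
         'john.doe@example.com', 'a', ''), 'e', ''), 'i', ''), 'o', ''), 'u', '');
-- jhn.d@xmpl.cm
```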

Answered by Craig Ringer

In this case you probably actually want regexp_replace.

Assuming by "vowel" you mean "Western European (English) language vowel letters" you might write:

SELECT regexp_replace('[email protected]', '[aeiou]', '', 'gi');

The gi in the fourth argument says "apply this regular expression globally to the whole input string, not just to the first match, and make it case insensitive".
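
To make the effect of each flag visible, here is a small sketch that runs the same pattern with no flags, with g, and with gi. The input string is a hypothetical value chosen to contain both upper- and lowercase vowels.

```sql
-- No flags: only the first match is replaced, and matching is case sensitive,
-- so the leading 'A' is skipped and only the first 'i' is removed.
SELECT regexp_replace('Alice.Example', '[aeiou]', '');        -- Alce.Example

-- 'g': every lowercase vowel is removed; uppercase 'A' and 'E' remain.
SELECT regexp_replace('Alice.Example', '[aeiou]', '', 'g');   -- Alc.Exmpl

-- 'gi': every vowel is removed regardless of case.
SELECT regexp_replace('Alice.Example', '[aeiou]', '', 'gi');  -- lc.xmpl
```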

Remember that w and y are sometimes vowel sounds too, depending on their context. You won't be able to handle that with a regexp, so it depends on whether or not you care for this purpose.

You're less likely to need to deal with other character sets if you're working with email addresses so a regexp might be OK for this.

In most cases mangling words with regular expressions would not be a good approach, though; for example, Russian in the Cyrillic alphabet uses A Э У О Ы Я Е Ё Ю И as vowels. Additionally, depending on the language, the same letter in the same script might or might not be a vowel! Keep reading here for more than you ever wanted to know.
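
A small illustration of the point about other scripts, assuming a database with UTF8 encoding; the sample word is a hypothetical input. The Latin character class leaves Cyrillic text untouched, and a class listing the lowercase Cyrillic vowels has to be written separately.

```sql
-- The ASCII vowel class matches nothing in a Cyrillic word.
SELECT regexp_replace('привет', '[aeiou]', '', 'gi');       -- привет

-- A class spelling out the lowercase Russian vowels removes 'и' and 'е'.
SELECT regexp_replace('привет', '[аэуоыяеёюи]', '', 'g');   -- првт
```

This only handles the letters listed in the class; it does not address letters that are vowels in one language and not in another.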

Answered by LeoRochael

TLDR;

To eliminate all vowels from an "e-mail id", the simplest expression I can think of is:

  • translate(email_id, 'aeiou', '')
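
For context, a minimal sketch of that expression inside a complete query. The inline VALUES list stands in for a real table, and both the addresses and the alias t are hypothetical; only the column name email_id comes from the expression above.

```sql
-- Apply the expression to each row of a stand-in row source.
SELECT email_id,
       translate(email_id, 'aeiou', '') AS no_vowels
FROM (VALUES ('alice@example.com'),
             ('bob@example.org')) AS t(email_id);
-- alice@example.com | lc@xmpl.cm
-- bob@example.org   | bb@xmpl.rg
```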

In detail

To complement jerzy's answer

Both replace() and translate() can be used to:

  • replace characters
  • eliminate characters

And both functions accept three parameters:

  • A manipulated string: the string being operated on; the return string will be a (perhaps) modified version of this string
  • A from string, containing something to be found in the manipulated string
  • A to string, containing something that might be present in the output string, depending on the function used, the manipulated string, and the from parameter
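
As a quick sketch of how the same three arguments can lead to different results depending on the function (the word 'banana' is just an illustrative input):

```sql
-- replace(): 'an' is treated as one substring; each occurrence becomes 'X'.
SELECT replace('banana', 'an', 'X');    -- bXXa

-- translate(): 'a' and 'n' are treated as separate characters;
-- 'a' maps to 'X', and 'n' has no counterpart in the to string, so it is dropped.
SELECT translate('banana', 'an', 'X');  -- bXXX
```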

The difference is that replace() can only replace whole strings of characters, which must be found in the manipulated string in a specific order:

postgres=> select replace('foobarbaz', 'bar', 'FRED');
  replace   
------------
 fooFREDbaz
(1 row)

Even if replacing them with an empty string:

postgres=> select replace('foobarbaz', 'bar', '');
 replace 
---------
 foobaz
(1 row)

But if the characters in the from string cannot be found in that specific order inside the manipulated string, replace() returns a string identical to the manipulated one:

postgres=> select replace('foobarbaz', 'rab', '');
 replace 
---------
 foobarbaz
(1 row)

translate(), on the other hand, deals not with strings of characters that must be found in a specific sequence in the string being manipulated, but with sets of characters:

Each character in the manipulated string that is present in the from string is mapped to the character in the to string at the same position as the one at which it was found in the from string:

postgres=> select translate('foobarbaz', 'bar', '123');
 translate 
-----------
 foo12312z
(1 row)

postgres=> select translate('foobarbaz', 'rab', '123');
 translate 
-----------
 foo32132z
(1 row)

In the first example above, the following mapping happened:

  • 'b' -> '1' (for both occurrences of 'b')
  • 'a' -> '2' (for both occurrences of 'a')
  • 'r' -> '3' (for the single occurrence of 'r')
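
The same position-by-position mapping can be sketched on other inputs. The phone-number-like string and the 'abc' string are hypothetical examples, not part of the original answer.

```sql
-- Every digit in the from string lines up with a '#' in the to string,
-- so each digit is masked; '-' is not in the from string and is kept.
SELECT translate('555-1234', '0123456789', '##########');  -- ###-####

-- If the to string is longer than the from string, the extra characters
-- ('y' and 'z' here) have nothing to map from and are never used.
SELECT translate('abc', 'a', 'xyz');                       -- xbc
```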

Although translate() can be used to map characters as above, it can also be used to eliminate sets of characters. This happens if the to string is shorter than the from string:

postgres=> select translate('foobarbaz', 'rab', '1');
 translate 
-----------
 foo1z
(1 row)

postgres=> select translate('foobarbaz', 'rab', '');
 translate 
-----------
 fooz
(1 row)

In the first example above, the following mapping happened:

  • 'r' -> '1' (for the single occurrence of 'r')
  • 'b' -> eliminated (for both occurrences of 'b')
  • 'a' -> eliminated (for both occurrences of 'a')

Whereas in the second example above, all occurrences of the characters 'r', 'a' and 'b' are eliminated since the to string is empty.
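
A short sketch of both behaviours on other hypothetical inputs: an empty to string strips a whole set of characters, and a to string that is shorter than the from string maps the leading characters and drops the rest.

```sql
-- Empty to string: '(', ')', '-' and the space are all removed.
SELECT translate('(555) 123-4567', '()- ', '');  -- 5551234567

-- One-character to string: '-' maps to a space, '_' has no counterpart
-- and is removed.
SELECT translate('a-b_c', '-_', ' ');            -- a bc
```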

So, to eliminate vowels from an email id, you can do:

  • translate(email_id, 'aeiou', '')

As long as all you care about is ASCII vowels, as mentioned by Craig Ringer in his answer.
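
One thing worth noting is that translate() is case sensitive, so the expression above leaves uppercase vowels in place. A possible way to cover both cases, sketched on a hypothetical mixed-case address, is to list the uppercase vowels as well, or to lowercase the input first if changing the case of the result is acceptable.

```sql
-- Only lowercase vowels are listed, so 'A', 'E' and 'O' survive.
SELECT translate('Alice@Example.COM', 'aeiou', '');         -- Alc@Exmpl.COM

-- Listing both cases removes all ASCII vowels and keeps the remaining case.
SELECT translate('Alice@Example.COM', 'aeiouAEIOU', '');    -- lc@xmpl.CM

-- Lowercasing first also works, at the cost of changing the result's case.
SELECT translate(lower('Alice@Example.COM'), 'aeiou', '');  -- lc@xmpl.cm
```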