Note: this page reproduces a popular Stack Overflow question under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license and attribute it to the original authors (not me). Original: http://stackoverflow.com/questions/35922404/

Date: 2020-09-19 03:11:04. Source: igfitidea

Oracle regex to find the special character in name field

Tags: regex, oracle

Asked by Vignesh

I'm trying to filter out the names which have special characters.

Requirement:

1) Filter the names which have characters other than a-zA-Z, space, and forward slash (/).

Regex being tried out:

1) regexp_like (customername,'[^a-zA-Z[:space:]\/]'))
2) regexp_like (customername,'[^a-zA-Z \/]'))

Both of the above regexes find the names with special characters such as ? and dot (.).

For example:

LEAL/JO?O

FRANCO/DIVALDO Sr.

But I couldn't figure out why some names (listed below) that contain only the allowed characters (a-zA-Z, space, and forward slash (/)) also get retrieved.

For example:

ESTEVES/MARIA INES

PEREZ/JOSE

DUTRA SILVA/LIGIA

Please help to figure out the mistake in the regex being used.

Many thanks in advance!

Answered by Gary_W

Your regex #1 worked for me on 11g with the name data copied and pasted from this page. I wonder if you have non-printable control characters in the data? Try adding [:cntrl:] to the regex to catch control characters. P.S. The backslash is not needed before the slash inside a character class (square brackets).

SQL> with tbl(name) as (
      select 'LEAL/JO?O'          from dual union
      select 'FRANCO/DIVALDO Sr.' from dual union
      select 'ESTEVES/MARIA INES' from dual union
      select 'PEREZ/JOSE'         from dual union
      select 'DUTRA SILVA/LIGIA'  from dual
    )
    select *
    from tbl
    where regexp_like(name, '[^a-zA-Z[:space:][:cntrl:]/]');

NAME
------------------
FRANCO/DIVALDO Sr.
LEAL/JO?O

SQL>

If you can copy/paste this, run it, and get the same results, then something is up with the data in your table. Look at the data in hex, which may reveal a previously hidden character. Here's a simple example that shows the name "JOSE" in hex. Using one of the many ASCII charts out there, such as http://www.asciitable.com/, you can see there are no hidden characters:

SQL> select 'JOSE' as chr, rawtohex('JOSE') as hex from dual;

CHR  HEX
---- --------
JOSE 4A4F5345

SQL>

So, have a look at a name or two and see if you have any hidden characters. If not, I suspect it may be a character set conflict.
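
To narrow it down further, one option is to strip the allowed characters from each flagged name and look at what is left over. This is only a sketch: the chr(1) appended to the second row is a made-up hidden character for illustration, not something taken from the asker's data. REGEXP_REPLACE with no replacement argument removes every match, so the remainder is exactly what the filter is reacting to.

```sql
-- Sketch: show only the characters that trigger the filter, plus their hex bytes.
-- The chr(1) in the second row is a hypothetical hidden control character.
with tbl(name) as (
  select 'PEREZ/JOSE'           from dual union all
  select 'PEREZ/JOSE' || chr(1) from dual
)
select name,
       -- remove everything that is allowed; whatever remains is offending
       regexp_replace(name, '[a-zA-Z[:space:]/]')           as offending,
       rawtohex(regexp_replace(name, '[a-zA-Z[:space:]/]')) as offending_hex
from tbl
where regexp_like(name, '[^a-zA-Z[:space:]/]');
```

Against the real table, replace the WITH clause with the actual table and use customername in place of name.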

Answered by Beege

@gary_w has most of the bases well covered....

Here's my SQL version of the Unix command cat -vet MyFile:

select replace(regexp_replace(my_column,'[^[:print:]]', '!ACK!'),' ','.') as CAT_VET
from my_table

... all the non-printing characters become !ACK! and spaces become a dot (.). You still need to determine what the characters actually are, but it's useful for finding the looney-toon characters in your data.

Also, select dump(my_column) ... is another way to view the raw column values.
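
As a rough sketch of that, DUMP takes an optional format argument. 16 asks for hexadecimal byte values, and adding 1000 (giving 1016) also reports the character set name, which may help if a character set mismatch is suspected. The my_column and my_table names are the same placeholders used above.

```sql
-- Sketch: dump only the rows that contain at least one non-printing character.
-- 1016 = hex output (16) plus the character set name (1000).
select my_column,
       dump(my_column, 1016) as dumped
from my_table
where regexp_like(my_column, '[^[:print:]]');
```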