SQL 在 Oracle 中不是 REGEXP_LIKE

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/42536101/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-01 05:09:26  来源:igfitidea点击:

Not REGEXP_LIKE in Oracle

sqlregexoracle

提问by SAR622

I have a large table with phone numbers. The phone numbers are all strings and supposed to be '+9628789878' or similar. (a "+" sign followed by between 9 and 13 digits.)

我有一张大桌子,上面有电话号码。电话号码都是字符串,应该是“+9628789878”或类似的。(“+”号后跟 9 到 13 位数字。)

A user bug uncovered one row with the string '+987+9873678298'. Clearly it shouldn't be there and I'd like to find out how many other cases there are of this or other such errors.

一个用户错误发现一行带有字符串“+987+9873678298”。显然它不应该在那里,我想知道还有多少其他情况存在这种或其他此类错误。

I tried this query but it's not doing the job. My thinking is anything that's not like this string. (Oh, the table is not indexed by phone_number.)

我试过这个查询,但它没有完成这项工作。我的想法是任何与此字符串不同的东西。(哦,该表不是由 phone_number 索引的。)

SELECT user_key,
       first_name,
       last_name,
       phone_number
FROM   users u
WHERE  regexp_like(phone_number, '[^\+[0-9]*]')
AND    phone_number IS NOT NULL

回答by Aleksej

If you need to find all the rows where phone_numberis not made by exactly a '+'followed by 9-13 digits, this should do the work:

如果您需要查找phone_number不是由精确 a'+'后跟 9-13 位数字组成的所有行,这应该可以完成工作:

select *
from users 
where not regexp_like(phone_number, '^\+[0-9]{9,13}$')

What it does:

它能做什么:

  • ^the beginning of the string, to avoid things like 'XX +123456789'
  • \+the '+'
  • [0-9]{9,13}a sequence of 9-13 digits
  • $the end of the string, to avoid strings like '+123456789 XX'
  • ^字符串的开头,以避免诸如 'XX +123456789'
  • \+'+'
  • [0-9]{9,13}9-13 位数字的序列
  • $字符串的结尾,以避免像这样的字符串 '+123456789 XX'

Another way, with no regexp, could be the following:

另一种没有正则表达式的方法可能如下:

where not (
                /* strings of 10-14 chars */
                length(phone_number) between 10 and 14 
                /* ... whose first is a + */
            and substr(phone_number, 1, 1 ) = '+' 
                /* ...and that become a '+' after removing all the digits */
            and nvl(translate(phone_number, 'X0123456789', 'X'), '+') = '+' 
          )

This could be faster than the regexp approach, even if it's based on more conditions, but I believe only a test will tell you which one is the best performing.

这可能比 regexp 方法更快,即使它基于更多条件,但我相信只有测试才能告诉您哪个性能最好。