Regex to find only 6-digit numbers in text in PostgreSQL

Question

Asked by CristisS

I need to build a query in PostgreSQL that finds all text entries containing a 6-digit number (e.g. 000999, 019290, 998981, 234567, etc.). The problem is that the number is not necessarily at the beginning of the string or at its end.

I tried the following, but none of them worked:

[0-9]{6} - returns part of a number with more than 6 digits
(?:(?<!\d)\d{6}(?!\d)) - PostgreSQL does not know about lookbehind
[^0-9][0-9]{6}[^0-9] and variations on it, but to no avail.

Building my own Perl/C function is not really an option as I do not have the skills required. Any idea what regexp could be used or other tricks that elude me at the moment?

EDIT

Input samples:

aa 0011527 /CASA -> should return NOTHING
aa 001152/CASA -> should return 001152
aa001152/CASA -> should return 001152
aa0011527/CASA -> should return NOTHING
aa001152 /CASA -> should return 001152

Answer 1

Answered by h2ooooooo

If PostgreSQL supports word boundaries, use \b:

\b(\d{6})\b

Edit:

\b in PostgreSQL means backspace, so it is not a word boundary.

However, http://www.postgresql.org/docs/8.3/interactive/functions-matching.html#FUNCTIONS-POSIX-REGEXP explains that you can use \y as a word boundary, since it "matches only at the beginning or end of a word", so

\y(\d{6})\y

should work.

\m(\d{6})\M

should also work.

Full list of word matches in PostgreSQL regex:

Escape  Description
\A      matches only at the beginning of the string (see Section 9.7.3.5 for how this differs from ^)
\m      matches only at the beginning of a word
\M      matches only at the end of a word
\y      matches only at the beginning or end of a word
\Y      matches only at a point that is not the beginning or end of a word
\Z      matches only at the end of the string (see Section 9.7.3.5 for how this differs from $)
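
To see how the word-boundary escapes behave on the sample inputs from the question, a quick check along these lines may help. It is only a sketch: the inline VALUES list and the column name "name" are made up for illustration, and it assumes the default setting standard_conforming_strings = on, so that backslashes in the pattern are not treated as string escapes. Because letters and digits both count as word characters, \y is not expected to match between "aa" and "001152", so the samples where letters directly adjoin the digits should come back as NULL.

```sql
-- Sample inputs from the question, supplied inline.
-- The alias "samples" and the column "name" are hypothetical.
SELECT name,
       substring(name from '\y(\d{6})\y') AS code  -- NULL when nothing matches
FROM (VALUES
        ('aa 0011527 /CASA'),
        ('aa 001152/CASA'),
        ('aa001152/CASA'),
        ('aa0011527/CASA'),
        ('aa001152 /CASA')
     ) AS samples(name);
```

Swapping the pattern for '\m(\d{6})\M' gives the same kind of check for the other pair of escapes.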

New edit:

Based on your edit, you should be able to do this:

(^|[^\d])(\d+)([^\d]|$)
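
As a sketch of how this idea could be tried directly against the question's samples, the variant below makes two changes to the pattern above, so treat it as an adaptation rather than the exact expression: the digit run is pinned to \d{6}, and the delimiter groups are made non-capturing with (?:...) so that substring() returns the digits instead of the leading delimiter. The inline VALUES list and the column name "name" are assumptions for illustration.

```sql
-- (?:^|\D)  start of string or a non-digit, not captured
-- (\d{6})   exactly six digits, the only capturing group
-- (?:\D|$)  a non-digit or end of string, not captured
SELECT name,
       substring(name from '(?:^|\D)(\d{6})(?:\D|$)') AS code  -- NULL when nothing matches
FROM (VALUES
        ('aa 0011527 /CASA'),
        ('aa 001152/CASA'),
        ('aa001152/CASA'),
        ('aa0011527/CASA'),
        ('aa001152 /CASA')
     ) AS samples(name);
```

With this pattern the two samples containing the 7-digit run 0011527 should yield NULL, and the other three should yield 001152.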

Answer 2

Answered by CristisS

Using what @h2ooooooo proposed I managed to create the following query:

SELECT cleantwo."ID", cleantwo."Name", cleantwo."Code"
FROM
(
    SELECT cleanone."ID", cleanone."Name",
           unnest(cleanone."Code") AS "Code" -- 3. unnest all the entries received using regexp_matches (get all combinations)
    FROM
    (
        SELECT sd."ID", sd."Name",
               regexp_matches(sd."Name", '(^|[^\d])(\d+)([^\d]|$)') AS "Code"
        FROM "T_SOME_DATA" sd
        WHERE substring(sd."Name" from 1 for 15) ~ ('(^|[^\d])(\d+)([^\d]|$)') -- 1. get all data possible
    ) AS cleanone
    WHERE cleanone."Code" IS NOT NULL -- 2. get data where code IS NOT NULL (remove useless entries)
) AS cleantwo
WHERE length(cleantwo."Code") = 6 -- 4. get only the combinations relevant to my initial requirement (codes with length 6)
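
For comparison, a more compact variant might look like the sketch below. It is not the query above rewritten one-to-one: it pins the digit run to \d{6} inside the pattern instead of filtering by length afterwards, uses a negative lookahead (?!\d) for the trailing check so the following character is not consumed, passes the 'g' flag so every 6-digit run in a name is returned, and drops the 15-character substring restriction. The table and column names are taken from the query above.

```sql
-- One output row per 6-digit code found in "Name"; rows with no match are omitted.
SELECT sd."ID",
       sd."Name",
       (regexp_matches(sd."Name", '(?:^|\D)(\d{6})(?!\d)', 'g'))[1] AS "Code"
FROM "T_SOME_DATA" sd;
```

If the 15-character restriction matters, sd."Name" in the regexp_matches call could be replaced with substring(sd."Name" from 1 for 15), though a number cut off at the 15th character would then look shorter than it really is.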

It took me a lot of time to find this so I hope it helps someone else in the same situation. Good luck!
