Original question: http://stackoverflow.com/questions/25757194/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
PostgreSQL count number of times substring occurs in text
Asked by Tony Duan
I'm writing a PostgreSQL function to count the number of times a particular text substring occurs in another piece of text. For example, calling count('foobarbaz', 'ba') should return 2.
I understand that to test whether the substring occurs, I can use a condition like the one below:
WHERE 'foobarbaz' like '%ba%'
However, I need it to return 2 for the number of times 'ba' occurs. How can I proceed?
Thanks in advance for your help.
Answered by Mike T
How about using a regular expression:
SELECT count(*)
FROM regexp_matches('foobarbaz', 'ba', 'g');
The 'g' flag makes regexp_matches() return every match in the string (not just the first).
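To see what the 'g' flag changes, the two calls can be compared side by side. This is only a small sketch; the column aliases first_only and all_matches are made up for illustration. Keep in mind that the second argument is interpreted as a regular expression, so a substring containing characters such as '.' or '(' would need escaping first.

```sql
-- Without 'g', regexp_matches() returns at most one row (the first match);
-- with 'g', it returns one row per match.
SELECT
  (SELECT count(*) FROM regexp_matches('foobarbaz', 'ba'))      AS first_only,   -- 1
  (SELECT count(*) FROM regexp_matches('foobarbaz', 'ba', 'g')) AS all_matches;  -- 2
```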
Answered by Evan Carroll
I would highly suggest checking out this answer I posted to "How do you count the occurrences of an anchored string using PostgreSQL?". The chosen answer was shown to be massively slower than an adapted version of regexp_replace(). The overhead of creating the rows and then running the aggregate is simply too high.
The fastest way to do this is as follows...
SELECT
(length(str) - length(replace(str, replacestr, '')) )::int
/ length(replacestr)
FROM ( VALUES
('foobarbaz', 'ba')
) AS t(str, replacestr);
Here we
- Take the length of the string, L1
- Subtract from L1 the length of the string with all of the replacements removed, L2, to get L3, the difference in string length.
- Divide L3 by the length of the replacement to get the occurrences
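The steps above can be traced with the intermediate values spelled out. This is just an illustrative sketch reusing the same VALUES row; the column aliases l1, l2, l3 and occurrences are chosen here to mirror the labels in the list.

```sql
-- Each intermediate value for 'foobarbaz' and 'ba'.
SELECT
  length(str)                                         AS l1,  -- 9
  length(replace(str, replacestr, ''))                AS l2,  -- 5, the length of 'foorz'
  length(str) - length(replace(str, replacestr, ''))  AS l3,  -- 4
  (length(str) - length(replace(str, replacestr, '')))
    / length(replacestr)                              AS occurrences  -- 2
FROM ( VALUES
  ('foobarbaz', 'ba')
) AS t(str, replacestr);
```

Because every removed occurrence shortens the string by exactly length(replacestr), the difference l3 is always a multiple of that length, so the integer division is exact.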
For comparison, that's about five times faster than the method using regexp_matches(), which looks like this.
SELECT count(*)
FROM ( VALUES
('foobarbaz', 'ba')
) AS t(str, replacestr)
CROSS JOIN LATERAL regexp_matches(str, replacestr, 'g');
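To check the speed difference on your own machine, something like the following could be used. It is a rough sketch only: the test data (10,000 generated rows) is an assumption made up for this example, and the timings will vary with hardware and PostgreSQL version. The row number is appended to each string so the planner cannot pre-compute the result as a constant.

```sql
-- Hypothetical benchmark data: 'foobarbaz' repeated 100 times plus the row number.
-- Run each statement separately and compare the reported execution times.
EXPLAIN ANALYZE
SELECT (length(str) - length(replace(str, 'ba', ''))) / length('ba')
FROM (SELECT repeat('foobarbaz', 100) || i::text AS str
      FROM generate_series(1, 10000) AS g(i)) AS t;

EXPLAIN ANALYZE
SELECT (SELECT count(*) FROM regexp_matches(str, 'ba', 'g'))
FROM (SELECT repeat('foobarbaz', 100) || i::text AS str
      FROM generate_series(1, 10000) AS g(i)) AS t;
```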
Answered by Andreas Dietrich
There is a
str_count( src, occurence )
function based on
SELECT (length( str ) - length(replace( str, occurrence, '' ))) / length( occurence )
and a
str_countm( src, regexp )
function based on the query mentioned by @MikeT
SELECT count(*) FROM regexp_matches( str, regexp, 'g')
Both are available here: postgres-utils
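If you would rather not install a library, the length-based expression can be wrapped in a small function of your own. The sketch below is an assumption of how that might look, not the postgres-utils implementation; the name count_substr and its parameter names are hypothetical.

```sql
-- Hypothetical helper; not the postgres-utils implementation.
CREATE OR REPLACE FUNCTION count_substr(src text, sub text)
RETURNS integer
LANGUAGE sql
IMMUTABLE
AS $$
  -- NULLIF turns an empty search string into NULL, so the division
  -- yields NULL instead of raising a division-by-zero error.
  SELECT (length(src) - length(replace(src, sub, ''))) / NULLIF(length(sub), 0);
$$;

SELECT count_substr('foobarbaz', 'ba');  -- 2
```

Returning NULL for an empty search string is a design choice made for this sketch; returning 0 or raising an error would be equally defensible.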
Answered by atiruz
Try:
SELECT array_length (string_to_array ('1524215121518546516323203210856879', '1'), 1) - 1
--RESULT: 7
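This works because splitting a string on a delimiter that occurs n times produces n + 1 pieces. A small sketch with the example from the question shows the pieces, and also one edge case worth guarding against; the column aliases and the coalesce() guard are additions for illustration, not part of the original answer.

```sql
SELECT
  string_to_array('foobarbaz', 'ba')                           AS pieces,       -- {foo,r,z}
  array_length(string_to_array('foobarbaz', 'ba'), 1) - 1      AS occurrences,  -- 2
  -- An empty input string gives an empty array, for which array_length()
  -- returns NULL, so coalesce() maps that case to 0.
  coalesce(array_length(string_to_array('', 'ba'), 1) - 1, 0)  AS empty_input;  -- 0
```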