PostgreSQL: count the number of occurrences of a substring in a text

Question

Asked by Tony Duan

I'm writing a PostgreSQL function to count the number of times a particular text substring occurs in another piece of text. For example, calling count('foobarbaz', 'ba') should return 2.

I understand that to test whether the substring occurs, I can use a condition like the one below:

    WHERE 'foobarbaz' like '%ba%'

However, I need it to return 2 for the number of times 'ba' occurs. How can I proceed?

Thanks in advance for your help.

Answer 1

Answered by Mike T

How about using a regular expression:

SELECT count(*)
FROM regexp_matches('foobarbaz', 'ba', 'g');

The 'g' flag makes regexp_matches() return every match in the string, not just the first one, so count(*) counts all of them.
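One caveat worth sketching: regexp_matches() treats its second argument as a regular expression, not as a literal string, so a substring containing metacharacters such as '.' has to be escaped. The sample string 'a.b.c' below is made up for illustration, and the comments assume the default standard_conforming_strings = on.

```sql
-- Unescaped: '.' matches any character, so every character is counted.
SELECT count(*)
FROM regexp_matches('a.b.c', '.', 'g');   -- 5

-- Escaped: '\.' matches only a literal dot.
SELECT count(*)
FROM regexp_matches('a.b.c', '\.', 'g');  -- 2
```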

Answer 2

Answered by Evan Carroll

I would highly suggest checking out this answer I posted to "How do you count the occurrences of an anchored string using PostgreSQL?". The chosen answer was shown to be massively slower than an adapted version of regexp_replace(). The overhead of creating the rows and then running the aggregate is simply too high.

The fastest way to do this is as follows...

SELECT
  (length(str) - length(replace(str, replacestr, '')) )::int
  / length(replacestr)
FROM ( VALUES
  ('foobarbaz', 'ba')
) AS t(str, replacestr);

Here we:

Take the length of the string, L1.
Subtract from L1 the length of the string with all occurrences of the substring removed, L2, to get L3, the difference in string length.
Divide L3 by the length of the substring to get the number of occurrences.
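Since the question asks for a function, the steps above could be wrapped roughly as follows. This is a minimal sketch, not part of the original answer: the name count_substr is hypothetical, and the NULLIF guard for an empty search string is an added assumption about the desired behaviour.

```sql
-- Hypothetical helper; count_substr is an illustrative name, not a built-in.
CREATE OR REPLACE FUNCTION count_substr(str text, sub text)
RETURNS integer
LANGUAGE sql
IMMUTABLE
AS $$
  -- NULLIF turns an empty search string into NULL, so the division
  -- yields NULL instead of raising a division-by-zero error.
  SELECT (length(str) - length(replace(str, sub, ''))) / NULLIF(length(sub), 0);
$$;

SELECT count_substr('foobarbaz', 'ba');  -- 2
```

Because replace() removes non-overlapping occurrences, this counts non-overlapping matches only: count_substr('aaa', 'aa') gives 1, not 2.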

For comparison, that's about five times faster than the method using regexp_matches(), which looks like this.

SELECT count(*)
FROM ( VALUES
  ('foobarbaz', 'ba')
) AS t(str, replacestr)
CROSS JOIN LATERAL regexp_matches(str, replacestr, 'g');

Answer 3

Answered by Andreas Dietrich

There is a

str_count( src, occurrence )

function based on

SELECT (length( str ) - length(replace( str, occurrence, '' ))) / length( occurrence )

and a

str_countm( src, regexp )

function based on the regexp_matches() approach mentioned by @MikeT

SELECT count(*) FROM regexp_matches( str, regexp, 'g')

Both are available here: postgres-utils

Answer 4

Answered by atiruz

Try:

SELECT array_length (string_to_array ('1524215121518546516323203210856879', '1'), 1) - 1

--RESULT: 7
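Applied to the example from the question, the same splitting idea might look like the sketch below. The coalesce() wrapper is an addition, not part of the original answer: it assumes you want 0 rather than NULL when the input text is empty, since string_to_array() then returns an empty array and array_length() returns NULL for it.

```sql
-- Splitting 'foobarbaz' on 'ba' gives {foo,r,z}: 3 pieces, so 2 separators.
SELECT array_length(string_to_array('foobarbaz', 'ba'), 1) - 1;  -- 2

-- Empty input: the array is empty and array_length() is NULL,
-- so coalesce() maps the result to 0.
SELECT coalesce(array_length(string_to_array('', 'ba'), 1) - 1, 0);  -- 0
```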
