Counting the number of occurrences of a substring within a string in PostgreSQL
Original source: http://stackoverflow.com/questions/36376410/
This content comes from Stack Overflow and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors.
Asked by Franck Dernoncourt
How can I count the number of occurrences of a substring within a string in PostgreSQL?
Example:
I have a table
CREATE TABLE test."user"
(
  uid integer NOT NULL,
  name text,
  result integer,
  CONSTRAINT pkey PRIMARY KEY (uid)
);
I want to write a query so that the column result contains the number of occurrences of the substring o in the column name. For instance, if in one row name is hello world, the column result should contain 2, since there are two o characters in the string hello world.
In other words, I'm trying to write a query that would take the table as input and update its result column.
I am aware of the function regexp_matches and its g option, which indicates that the full string (g = global) needs to be scanned for all occurrences of the substring.
Example:
SELECT * FROM regexp_matches('hello world', 'o', 'g');
returns
{o}
{o}
and
SELECT COUNT(*) FROM regexp_matches('hello world', 'o', 'g');
returns
2
But I don't see how to write an UPDATE query that would set the result column to the number of occurrences of the substring o in the column name.
Answer by dnoeth
A common solution is based on this logic: replace the search string with an empty string, then divide the difference between the old and new lengths by the length of the search string:
(CHAR_LENGTH(name) - CHAR_LENGTH(REPLACE(name, 'substring', '')))
/ CHAR_LENGTH('substring')
Hence:
UPDATE test."user"
SET result =
(CHAR_LENGTH(name) - CHAR_LENGTH(REPLACE(name, 'o', '')))
/ CHAR_LENGTH('o');
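As a quick check of the formula with a multi-character search string, the sketch below runs it against a few literal values. The alias t(s) and the sample strings are made up for illustration. Note that REPLACE removes non-overlapping matches, and that an empty search string would make CHAR_LENGTH return 0 and raise a division-by-zero error.

```sql
-- Hypothetical sample values; t(s) is an illustrative alias, not part of the question's schema.
SELECT s,
       (CHAR_LENGTH(s) - CHAR_LENGTH(REPLACE(s, 'ab', '')))
         / CHAR_LENGTH('ab') AS occurrences
FROM (VALUES ('abcabcab'), ('hello world'), ('')) AS t(s);
-- 'abcabcab'    -> 3   ((8 - 2) / 2)
-- 'hello world' -> 0
-- ''            -> 0
```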
Answer by Gordon Linoff
A Postgres-specific way of doing this converts the string to an array, using the substring as the delimiter, and takes the length of the array minus 1:
select array_length(string_to_array(name, 'o'), 1) - 1
Note that this works with longer substrings as well.
Hence:
update test."user"
set result = array_length(string_to_array(name, 'o'), 1) - 1;
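One edge case worth checking: for an empty string, string_to_array returns an empty array, and array_length of an empty array is NULL rather than 0, so the expression above yields NULL. A possible guard, assuming 0 is the desired result in that case, wraps the expression in COALESCE. The alias t(s) and the sample strings are illustrative only. Be aware that this guard also turns a NULL name into 0.

```sql
-- Hypothetical sample values; COALESCE maps the NULL from an empty array to 0.
SELECT s,
       COALESCE(array_length(string_to_array(s, 'o'), 1) - 1, 0) AS occurrences
FROM (VALUES ('hello world'), ('xyz'), ('')) AS t(s);
-- 'hello world' -> 2   (split into 3 pieces)
-- 'xyz'         -> 0   (1 piece)
-- ''            -> 0   (empty array, NULL replaced by 0)
```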
Answer by bnson
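Another way, which works for a single-character search string (since [^o] is a character class), is to strip every other character and measure what is left: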
UPDATE test."user" SET result = length(regexp_replace(name, '[^o]', '', 'g'));
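The regexp_matches call from the question can also drive the UPDATE directly through a correlated scalar subquery. This is a sketch rather than one of the original answers. The search string is treated as a regular expression here, so metacharacters such as . or * would need escaping.

```sql
-- Count the regex matches in each row's name with a correlated subquery.
-- count(*) returns bigint, which is cast to integer on assignment.
UPDATE test."user"
SET result = (SELECT count(*) FROM regexp_matches(name, 'o', 'g'));
```

On PostgreSQL 15 or later, regexp_count(name, 'o') should give the same count in a single call.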
Answer by Robert Bondy
Occurrence_Count = LENGTH(REPLACE(string_to_search, string_to_find, '~')) - LENGTH(REPLACE(string_to_search, string_to_find, ''))
This solution is a bit cleaner than many that I have seen, especially since it needs no divisor: each occurrence is replaced once by a single character and once by nothing, so the two lengths differ by exactly the number of occurrences.
You can turn this into a function or use it within a SELECT.
No variables are required.
I use a tilde as the replacement character, but any single character will work, since only the lengths are compared.
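As an illustration of the expression above, the following sketch applies it to two literal strings with a two-character search string. The alias t(s) and the sample values are assumptions made for the example.

```sql
-- Hypothetical sample values; each 'ab' becomes '~' in one copy and '' in the other.
SELECT s,
       LENGTH(REPLACE(s, 'ab', '~')) - LENGTH(REPLACE(s, 'ab', '')) AS occurrences
FROM (VALUES ('abcabcab'), ('hello world')) AS t(s);
-- 'abcabcab'    -> 3   (LENGTH('~c~c~') = 5, LENGTH('cc') = 2)
-- 'hello world' -> 0
```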
Answer by Guilherme Passos
Return the count of a character:
SELECT (LENGTH('1.1.1.1') - LENGTH(REPLACE('1.1.1.1','.',''))) AS count
--RETURN COUNT OF CHARACTER '.'