MySQL: SQL count number of words in a field

Notice: this page reproduces a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license and attribute it to the original authors (not me), citing the original StackOverflow address: http://stackoverflow.com/questions/27827376/

Date: 2020-08-31 13:15:32  Source: igfitidea

SQL count number of words in field

Tags: mysql, sql, sql-server-2008, postgresql

Asked by Bob

I'd like to make an SQL query where the condition is that column1 contains three or more words. Is there something to do that?

Answer by David Chan

Maybe try counting spaces:

SELECT * 
FROM table
WHERE (LENGTH(column1) - LENGTH(replace(column1, ' ', ''))) > 1

This assumes the number of words is the number of spaces + 1, so more than one space means three or more words.
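
To try the space-counting filter without a real server, here is a rough sketch in Python. It assumes an in-memory SQLite database as a stand-in for MySQL, and the table name docs and its sample rows are made up for illustration.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the real database;
# the table "docs" and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (column1 TEXT)")
conn.executemany(
    "INSERT INTO docs (column1) VALUES (?)",
    [("one",), ("one two",), ("one two three",), ("one two three four",)],
)

# More than one space means at least three single-space-separated words.
rows = conn.execute(
    "SELECT column1 FROM docs "
    "WHERE (LENGTH(column1) - LENGTH(REPLACE(column1, ' ', ''))) > 1 "
    "ORDER BY column1"
).fetchall()
matches = [row[0] for row in rows]
print(matches)
```

Only the rows with two or more spaces come back, which matches the "spaces + 1" assumption.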

Answer by Gordon Linoff

If you want a condition that a column contains three or more words, and you want it to work in a bunch of databases, and we assume that words are separated by single spaces, then you can use LIKE:

where column1 like '% % %'
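
A quick way to see what this pattern accepts is to evaluate it directly. The sketch below is only an illustration: it uses Python with in-memory SQLite rather than any particular production database, and the helper name matches_pattern is made up.

```python
import sqlite3

# Assumption: SQLite's LIKE is used here purely to illustrate the pattern.
conn = sqlite3.connect(":memory:")

def matches_pattern(text):
    """Return True if text matches LIKE '% % %' (contains at least two spaces)."""
    return bool(conn.execute("SELECT ? LIKE '% % %'", (text,)).fetchone()[0])

print(matches_pattern("one two three"))  # three words, two spaces
print(matches_pattern("one two"))        # two words, one space
print(matches_pattern("one  two"))       # two words, but a double space
```

The last call shows why the single-space assumption matters: two words separated by a double space also satisfy the pattern.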

Answer by a_horse_with_no_name

In Postgres you can use regexp_split_to_array() for this:

select *
from the_table
where array_length(regexp_split_to_array(the_column, '\s+'), 1) >= 3;

This will split the contents of the column the_column into array elements. One or more whitespace characters are used as the delimiter. It won't respect "quoted" spaces though. The value 'one "two three" four' will be counted as four words.
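
The same split-and-count idea can be sketched outside the database. This is a Python emulation using re.split, not Postgres itself, and the function name word_count is made up for the example.

```python
import re

def word_count(text):
    # Split on runs of whitespace and count the pieces, the same idea as
    # array_length(regexp_split_to_array(the_column, '\s+'), 1).
    return len(re.split(r"\s+", text))

print(word_count('one "two three" four'))  # quoted spaces are not respected
print(word_count("one   two\tthree"))      # runs of whitespace act as one delimiter
print(word_count(" one"))                  # a leading space yields an empty first piece
```

In this Python sketch, text that starts with whitespace produces an empty first element, so " one" counts as 2; trimming the value before splitting may be worth considering.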

Answer by Evan de la Cruz

The best way to do this is to NOT do this.

Instead, you should use the application layer to count the words during INSERT and save the word count into its own column.

While I like, and upvoted, some of the answers here, all of them will be very slow and not 100% accurate.

I know people want a simple answer to SELECT the word count, but it just is NOT POSSIBLE with accuracy and speed.

If you want it to be 100% accurate, and very fast, then use this solution.

Steps to solve:

  1. Add a column to your table and index it: ALTER TABLE tablename ADD COLUMN wordcount INT UNSIGNED NULL, ADD INDEX idxtablename_count (wordcount ASC);
  2. Before doing your INSERT, count the number of words using your application. For example, in PHP: $count = str_word_count($somevalue);
  3. During the INSERT, include the value of $count for the column wordcount, like: insert into tablename (col1, col2, col3, wordcount) values (val1, val2, val3, $count);

Then your select statement becomes super easy, clean, uber-fast, and 100% accurate.

select * from tablename where wordcount >= 3;

Also remember that whenever you update a row, you will need to recount the words for that column.
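
The whole workflow (count in the application, store on INSERT, recount on UPDATE, filter on the stored column) could look roughly like the sketch below. It is only an illustration under several assumptions: Python stands in for the PHP application layer, in-memory SQLite stands in for MySQL, a "word" is taken to be any whitespace-separated token, and the helper names and the id column are made up.

```python
import sqlite3

def count_words(text):
    # Assumption: a "word" is any whitespace-separated token.
    return len(text.split())

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (id INTEGER PRIMARY KEY, col1 TEXT, wordcount INTEGER)"
)
conn.execute("CREATE INDEX idxtablename_count ON tablename (wordcount)")

def insert_row(text):
    # Count in the application and store the result alongside the text.
    cur = conn.execute(
        "INSERT INTO tablename (col1, wordcount) VALUES (?, ?)",
        (text, count_words(text)),
    )
    return cur.lastrowid

def update_row(row_id, text):
    # Recount whenever the text changes so the stored value stays in sync.
    conn.execute(
        "UPDATE tablename SET col1 = ?, wordcount = ? WHERE id = ?",
        (text, count_words(text), row_id),
    )

def rows_with_at_least(n):
    cur = conn.execute(
        "SELECT col1 FROM tablename WHERE wordcount >= ? ORDER BY id", (n,)
    )
    return [row[0] for row in cur.fetchall()]

first = insert_row("just two")
insert_row("three   little words")
update_row(first, "now it has five words")
print(rows_with_at_least(3))
```

Passing the values as bound parameters instead of pasting $count into the SQL string is a choice made for this sketch, not something the answer specifies.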

Answer by Trideep Rath

For "n" or more words:

select *
from table
where (length(column) - length(replace(column, ' ', '')) + 1) >= n

PS: This would not work if words have multiple spaces between them.
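
To make that caveat concrete, the expression can be evaluated on a few strings. This is a sketch using Python with in-memory SQLite as a stand-in database; the helper name naive_word_count is made up.

```python
import sqlite3

# Assumption: SQLite is used only to evaluate the expression.
conn = sqlite3.connect(":memory:")

def naive_word_count(text):
    # Number of spaces + 1, exactly as in the WHERE clause above.
    sql = "SELECT LENGTH(?) - LENGTH(REPLACE(?, ' ', '')) + 1"
    return conn.execute(sql, (text, text)).fetchone()[0]

print(naive_word_count("one two three"))  # single spaces: counted correctly
print(naive_word_count("one  two"))       # double space: one word too many
```

Every extra space inflates the result by one, so values with repeated, leading, or trailing spaces are overcounted.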

Answer by Madhivanan

To handle multiple spaces too, use the method shown here:

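-- Sample value with leading, trailing, and repeated spaces
Declare @s varchar(100)
set @s='  See      how many                        words this      has  '
-- Strip leading and trailing spaces
set @s=ltrim(rtrim(@s))

-- Collapse runs of spaces until no double space remains
while charindex('  ',@s)>0
Begin
    set @s=replace(@s,'  ',' ')
end

-- Word count = number of remaining single spaces + 1
select len(@s)-len(replace(@s,' ',''))+1 as word_count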

https://exploresql.com/2018/07/31/how-to-count-number-of-words-in-a-sentence/
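
For readers without a SQL Server instance at hand, the trim-then-collapse loop can be mirrored in a few lines of Python. This is a port written for illustration, not the linked article's code, and the function name count_words_collapsed is made up.

```python
def count_words_collapsed(s):
    # Strip leading and trailing spaces, like ltrim(rtrim(@s)).
    s = s.strip(" ")
    # Keep replacing double spaces until none are left.
    while "  " in s:
        s = s.replace("  ", " ")
    # Remaining single spaces + 1 gives the word count.
    return len(s) - len(s.replace(" ", "")) + 1

sample = "  See      how many                        words this      has  "
print(count_words_collapsed(sample))
```

Note that this sketch returns 1 for an empty string, so an emptiness check may be needed before trusting the result.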

Answer by AmericanTourist

I think David nailed it above. However, as a more complete answer:

LENGTH(RTRIM(LTRIM(REPLACE(column1,'  ', ' ')))) - LENGTH(REPLACE(RTRIM(LTRIM(REPLACE(column1, '  ', ' '))), ' ', '')) + 1 AS number_of_words

This collapses double spaces into single ones and strips leading and trailing spaces from your string.

Of course, you may go further by adding replacements for more than 2 spaces in a row...
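
Why further replacements are needed shows up as soon as three spaces appear in a row. The sketch below evaluates the same expression in Python with in-memory SQLite as a stand-in database; the names EXPR and number_of_words are made up for the example.

```python
import sqlite3

# Assumption: SQLite is used only to evaluate the expression from this answer.
conn = sqlite3.connect(":memory:")

EXPR = (
    "LENGTH(RTRIM(LTRIM(REPLACE(:s, '  ', ' ')))) "
    "- LENGTH(REPLACE(RTRIM(LTRIM(REPLACE(:s, '  ', ' '))), ' ', '')) + 1"
)

def number_of_words(text):
    return conn.execute("SELECT " + EXPR, {"s": text}).fetchone()[0]

print(number_of_words(" one two "))  # leading and trailing spaces are handled
print(number_of_words("a  b"))       # a double space collapses to one
print(number_of_words("a   b"))      # three spaces still leave a double space
```

A single REPLACE pass turns three spaces into two, so the last value is still overcounted by one; each additional nested REPLACE only extends the run length that is handled.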

Answer by Shahar

This can work:

SUM(LENGTH(a) - LENGTH(REPLACE(a, ' ', '')) + 1)

Where a is the string column. For each row, the expression counts the number of spaces, which is 1 less than the number of words, and adds 1; SUM then totals the word counts over all rows.