Notice: this page is based on a popular Stack Overflow question and is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/656951/

Time: 2020-08-31 12:56:39  Source: igfitidea

Search for "whole word match" in MySQL

Tags: mysql, sql, full-text-search

Asked by Oddthinking

I would like to write an SQL query that searches for a keyword in a text field, but only if it is a "whole word match" (e.g. when I search for "rid", it should not match "arid", but it should match "a rid").

I am using MySQL.

Fortunately, performance is not critical in this application, and the database size and string size are both comfortably small, but I would prefer to do it in SQL rather than in the PHP driving it.

Answer by LukeH

You can use REGEXP and the [[:<:]] and [[:>:]] word-boundary markers:

SELECT *
FROM table 
WHERE keywords REGEXP '[[:<:]]rid[[:>:]]'

Update for 2020: (actually 2018+)

MySQL switched to a new regular-expression engine (ICU) in version 8.0.4, so from that version on you need to use the "standard" word-boundary marker \b:

SELECT *
FROM table 
WHERE keywords REGEXP '\\brid\\b'

Note that the backslash must itself be escaped with a second backslash inside the SQL string literal, which is why the marker is written as \\b.
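Because the right marker depends on the server version, the choice can be made in the application that builds the query. What follows is a minimal sketch in Python, not part of the original answer: the helper names (escape_regex, whole_word_pattern, as_sql_literal) are hypothetical, and the version tuple is assumed to be obtained elsewhere.

```python
# Characters treated as regex metacharacters for escaping purposes (assumed set).
REGEX_META = ".^$*+?()[]{}|\\"


def escape_regex(word):
    """Backslash-escape metacharacters so the keyword is matched literally."""
    return "".join("\\" + ch if ch in REGEX_META else ch for ch in word)


def whole_word_pattern(word, mysql_version):
    """Build a whole-word REGEXP pattern for the given (major, minor, patch)."""
    escaped = escape_regex(word)
    if tuple(mysql_version) >= (8, 0, 4):
        # Newer engine: "standard" \b marker.
        return "\\b" + escaped + "\\b"
    # Older engine: [[:<:]] and [[:>:]] markers.
    return "[[:<:]]" + escaped + "[[:>:]]"


def as_sql_literal(pattern):
    """Render the pattern as a hand-written SQL string literal.

    Backslashes and single quotes are doubled, which is the escaping
    step the answer warns about.
    """
    return "'" + pattern.replace("\\", "\\\\").replace("'", "''") + "'"


print(as_sql_literal(whole_word_pattern("rid", (8, 0, 21))))
print(as_sql_literal(whole_word_pattern("rid", (5, 7, 30))))
```

When the pattern is passed as a bound parameter instead of being pasted into the SQL text, the doubling in as_sql_literal should not be needed, since the driver is expected to take care of string-literal escaping.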

Answer by Ricky Boyce

Found an answer that prevents the classic word boundary [[:<:]] from clashing with special characters, e.g. .@#$%^&*

Replace this:

SELECT *
FROM table 
WHERE keywords REGEXP '[[:<:]]rid[[:>:]]'

With this:

SELECT *
FROM table 
WHERE keywords REGEXP '([[:blank:][:punct:]]|^)rid([[:blank:][:punct:]]|$)'

The latter matches (space, tab, etc.) or (comma, bracket, etc.) or the start/end of the line on either side of the word. A more 'finished' word-boundary match.

Answer by paxdiablo

You can use LIKE with the wildcard marker to catch the possibilities (at the start, at the end, in the middle, and alone); something like this should suffice:

SELECT *
FROM table
WHERE keywords LIKE 'rid %'
   OR keywords LIKE '% rid'
   OR keywords LIKE '% rid %'
   OR keywords = 'rid'
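If the keyword comes from user input, the four conditions can be generated rather than typed by hand. The following is only a sketch under assumptions: the function name like_whole_word is hypothetical, the %s placeholder style is the one used by common Python MySQL drivers, the backslash is assumed to be the LIKE escape character (the MySQL default), and the column name is assumed to be a trusted identifier.

```python
def like_whole_word(column, word):
    """Return (where_clause, params) for a space-delimited whole-word search."""
    # Escape the LIKE wildcards so that % and _ in the keyword match literally.
    safe = word.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    clause = (
        "({c} LIKE %s OR {c} LIKE %s OR {c} LIKE %s OR {c} = %s)"
    ).format(c=column)
    params = [
        safe + " %",         # word at the start
        "% " + safe,         # word at the end
        "% " + safe + " %",  # word in the middle
        word,                # word alone (plain equality, no escaping needed)
    ]
    return clause, params


clause, params = like_whole_word("keywords", "rid")
print(clause)
print(params)
```

Like the hand-written query, this treats only spaces as word separators, so a keyword followed by a comma or a full stop is not found.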

Answer by Marco Marsala

Use REGEXP with word boundaries, but if you also want an accent-insensitive search, note that REGEXP is a single-byte operator (in MySQL versions before 8.0.4), so having a utf8_general_ci collation is worth nothing: the match will not be accent insensitive.

To get both an accent-insensitive and a whole-word match, write the search word the way the (deprecated) PHP function sql_regcase() did, that is, with each letter expanded into the set of variants it should match.

In fact:

  • utf8_general_ci gives you a case- and accent-insensitive equality search (WHERE field = value), but it does not let you ask for a whole-word match (word-boundary markers are not recognized).

  • LIKE gives you a case- and accent-insensitive search, but you have to spell out every combination of possible word-boundary characters by hand (word-boundary markers are not recognized).

  • The word boundaries [[:<:]] and [[:>:]] are supported by REGEXP, which is a single-byte function and therefore does not perform an accent-insensitive search.

The solution is to use REGEXP with word boundaries, with the search word rewritten the way sql_regcase() would rewrite it.
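The answer gives no code for this rewriting step, so here is one possible sketch in Python. It is an illustration under assumptions, not the author's implementation: the function name and the ACCENTS table are hypothetical and would need extending for real data, and it emits alternation groups such as (i|ì|I|Ì) instead of the bracket expressions sql_regcase() produced, on the assumption that multi-byte characters inside brackets may not behave as intended in a byte-wise engine.

```python
import unicodedata

# Hypothetical variant table: accented forms accepted for each base vowel.
ACCENTS = {"a": "àáâãäå", "e": "èéêë", "i": "ìíîï", "o": "òóôõö", "u": "ùúûü"}
REGEX_META = ".^$*+?()[]{}|\\"


def accent_insensitive_pattern(word):
    """Expand each letter into a group of its case and accent variants."""
    groups = []
    for ch in word:
        # Reduce the character to its unaccented lowercase base letter.
        base = unicodedata.normalize("NFD", ch.lower())[0]
        if base.isalpha():
            lower_forms = base + ACCENTS.get(base, "")
            # dict.fromkeys removes duplicates while keeping the order.
            forms = list(dict.fromkeys(lower_forms + lower_forms.upper()))
            groups.append("(" + "|".join(forms) + ")")
        elif ch in REGEX_META:
            groups.append("\\" + ch)  # match metacharacters literally
        else:
            groups.append(ch)
    # Legacy word-boundary markers, as used in this answer.
    return "[[:<:]]" + "".join(groups) + "[[:>:]]"


print(accent_insensitive_pattern("rid"))
```

The resulting string would be used as the right-hand side of REGEXP in place of the plain keyword.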

Used on http://www.nonsolodiete.it

Answer by MrTelly

select * from table where Locate('rid ', FieldToSearch) > 0 
      or Locate(' rid', FieldToSearch) > 0

This finds rid where it is preceded or followed by a space. You could extend the approach to take account of .,?! and so on; not elegant, but easy.
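To see what the two LOCATE conditions do and do not catch, the check can be emulated outside the database. This is a rough sketch in Python with a hypothetical helper name, assuming str.find as a stand-in for LOCATE (str.find is case-sensitive, whereas LOCATE on a non-binary string is not).

```python
def locate_match(field, word="rid"):
    """Emulate: Locate('rid ', field) > 0 OR Locate(' rid', field) > 0."""
    # LOCATE returns 0 when the substring is absent; str.find returns -1.
    return field.find(word + " ") != -1 or field.find(" " + word) != -1


for text in ["get rid of it", "a rid", "rid", "arid zone", "a riddle"]:
    print(repr(text), locate_match(text))
```

Under these assumptions the last three inputs show the limits of the approach: a field containing only rid is missed, while arid zone and a riddle are accepted, because each condition checks just one side of the word.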

Answer by Oddthinking

This is the best answer I've come up with myself so far:

SELECT * FROM table 
WHERE keywords REGEXP '^rid[ $]' OR keywords REGEXP ' rid[ $]'

I would have simplified it to:

SELECT *
FROM table
WHERE keywords REGEXP '[^ ]rid[ $]'

but [^ ] means "NOT a space" rather than "line-beginning or space".
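The same bracket rule applies to the other end of these patterns: inside [ ] the $ is a literal dollar sign, not an end-of-line anchor. A small sketch in Python illustrates this, using Python's re module as a stand-in for MySQL's engine (an assumption, although this bracket behaviour is common to both as far as I know). The grouped pattern and the function names are suggestions, not something from the answer.

```python
import re

# The two patterns OR-ed together in the answer above.
ORIGINAL = (r"^rid[ $]", r" rid[ $]")
# Suggested alternative: anchors moved out of the brackets into groups.
GROUPED = r"(^| )rid( |$)"


def matches_original(text):
    return any(re.search(pattern, text) for pattern in ORIGINAL)


def matches_grouped(text):
    return re.search(GROUPED, text) is not None


for text in ["rid", "a rid", "get rid of it", "arid zone", "riddle", "rid$"]:
    print(repr(text), matches_original(text), matches_grouped(text))
```

With these inputs the original pair misses rid and a rid, where the keyword ends the field, and accepts rid$ because of the literal dollar sign, while the grouped pattern accepts the first three inputs and rejects the rest.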

How does REGEXP compare to multiple LIKE conditions? (Not that performance matters in this app.)
