Notice: this page reproduces a popular Stack Overflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). Stack Overflow original: http://stackoverflow.com/questions/478472/

Time: 2020-09-01 00:52:08  Source: igfitidea

SQL full text search vs "LIKE"

Tags: sql, database, full-text-search

Asked by Kevin Pang

Let's say I have a fairly simple app that lets users store information on DVDs they own (title, actors, year, description, etc.) and I want to allow users to search their collection by any of these fields (e.g. "Keanu Reeves" or "The Matrix" would be valid search queries).

What's the advantage of going with SQL full text search vs simply splitting the query up by spaces and doing a few "LIKE" clauses in the SQL statement? Does it simply perform better or will it actually return results that are more accurate?

Accepted answer by AnthonyWJones

Full text search is likely to be quicker, since it benefits from an index of words that it uses to look up the records, whereas LIKE (at least with a leading wildcard such as '%word%') needs a full table scan.
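
The difference between the two lookups can be sketched in a few lines of Python. This is only an illustration of the idea of a word index, not how any particular database implements it; the sample rows and the function names `scan`, `build_index`, and `lookup` are made up for the example.

```python
from collections import defaultdict

# Hypothetical sample data: row id -> title
rows = {
    1: "The Matrix",
    2: "The Matrix Reloaded",
    3: "Speed",
    4: "John Wick",
}

def scan(rows, word):
    """LIKE '%word%' style lookup: inspect every row on every query."""
    word = word.lower()
    return sorted(rid for rid, text in rows.items() if word in text.lower())

def build_index(rows):
    """Build a word -> row ids map once, up front."""
    index = defaultdict(set)
    for rid, text in rows.items():
        for token in text.lower().split():
            index[token].add(rid)
    return index

def lookup(index, word):
    """Indexed lookup: one dictionary access, no row-by-row scan."""
    return sorted(index.get(word.lower(), set()))

index = build_index(rows)
print(scan(rows, "matrix"))     # [1, 2]
print(lookup(index, "matrix"))  # [1, 2]
```

Both calls return the same rows here, but `scan` touches every row each time while `lookup` only reads the entry for one word. Note that `scan` matches substrings and `lookup` matches whole words, so they can disagree on other inputs.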

In some cases LIKE will be more accurate, since LIKE "%The%" AND LIKE "%Matrix" will pick out "The Matrix" but not "Matrix Reloaded", whereas full text search will ignore "The" and return both. That said, returning both would likely have been the better result.
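
The LIKE half of that example can be checked with a small script. It uses an in-memory SQLite database purely as a convenient stand-in (the question does not name a specific engine), and the `dvds` table and its rows are assumptions made for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dvds (title TEXT)")
conn.executemany(
    "INSERT INTO dvds (title) VALUES (?)",
    [("The Matrix",), ("Matrix Reloaded",), ("The Matrix Reloaded",)],
)

# Both patterns must match: the title contains "The" and ends with "Matrix".
like_hits = [
    row[0]
    for row in conn.execute(
        "SELECT title FROM dvds "
        "WHERE title LIKE '%The%' AND title LIKE '%Matrix' "
        "ORDER BY title"
    )
]
print(like_hits)  # ['The Matrix']
```

Only "The Matrix" satisfies both patterns, because '%Matrix' has no trailing wildcard and so requires the title to end with "Matrix".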

Answer by John Fisher

Full-text indexes (which are indexes) are much faster than using LIKE (which essentially examines each row every time). However, if you know the database will be small, there may not be a performance need to use full-text indexes. The only way to determine this is with some intelligent averaging and some testing based on that information.

Accuracy is a different question. Full-text indexing allows you to do several things (weighting, automatically matching eat/eats/eating, etc.) that you couldn't possibly implement in any sort of reasonable time-frame using LIKE. The real question is whether you need those features.
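
To give a feel for the eat/eats/eating case, here is a deliberately crude sketch of word-form matching. The `naive_stem` function and the sample documents are invented for illustration; real full-text engines use language-specific rules that are far more thorough than stripping two suffixes.

```python
def naive_stem(word):
    """Crude suffix stripping; real stemmers use language-specific rules."""
    word = word.lower()
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

# Hypothetical documents: id -> text
docs = {1: "I eat pizza", 2: "She eats pasta", 3: "We were eating", 4: "Great food"}

def search(docs, term):
    """Match documents containing any word whose stem equals the query's stem."""
    target = naive_stem(term)
    return sorted(
        doc_id
        for doc_id, text in docs.items()
        if any(naive_stem(word) == target for word in text.split())
    )

print(search(docs, "eating"))  # [1, 2, 3]
```

A plain LIKE '%eat%' would also return document 4, because "Great" contains "eat" as a substring, which is the kind of false match that word-based indexing avoids.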

Without reading the full-text documentation's description of these features, you're really not going to know how you should proceed. So, read up!

Also, some basic tests (insert a bunch of rows in a table, maybe with some sort of public dictionary as a source of words) will go a long way to helping you decide.
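
A basic test of that kind might look like the following sketch. It uses an in-memory SQLite database as a stand-in for whatever engine you actually run, a tiny hard-coded vocabulary in place of a public dictionary, and a hypothetical `dvds` table; the timing it prints is only meaningful relative to other runs on the same machine.

```python
import random
import sqlite3
import time

random.seed(0)  # reproducible data
# Stand-in vocabulary; a real test might load a public word list instead.
words = ["matrix", "speed", "point", "break", "john", "wick", "river", "edge"]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dvds (id INTEGER PRIMARY KEY, description TEXT)")
conn.executemany(
    "INSERT INTO dvds (description) VALUES (?)",
    ((" ".join(random.choices(words, k=20)),) for _ in range(20_000)),
)

# Time a single LIKE query with a leading wildcard.
start = time.perf_counter()
(hits,) = conn.execute(
    "SELECT COUNT(*) FROM dvds WHERE description LIKE ?", ("%matrix%",)
).fetchone()
elapsed = time.perf_counter() - start

(total,) = conn.execute("SELECT COUNT(*) FROM dvds").fetchone()
print(f"{hits} of {total} rows matched in {elapsed * 1000:.1f} ms")
```

Repeating the run with row counts close to what you expect in production, and then again with the full-text query of your own database, gives a first idea of whether the difference matters for your app.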

Answer by Chris

A full text search query is much faster, especially when working with lots of data in various columns.

Additionally, you will have language-specific search support. E.g. German umlauts like "ü" in "über" will also be found when stored as "ueber". You can also use synonyms to automatically expand search queries, or to replace specific phrases.

Quoting the accepted answer: "In some cases LIKE will be more accurate, since LIKE '%The%' AND LIKE '%Matrix' will pick out 'The Matrix' but not 'Matrix Reloaded', whereas full text search will ignore 'The' and return both. That said, returning both would likely have been the better result."

That is not correct. The full text search syntax lets you specify "how" you want to search. E.g. by using the CONTAINS predicate you can use exact term matching as well as fuzzy matching, weights, etc.

So if you have performance issues or would like to provide a more "Google-like" search experience, go for the full text search engine. It is also very easy to configure.

Answer by Rui Luz

Just a few notes:

  1. LIKE can use an Index Seek if you don't start your pattern with %. Example: LIKE 'Santa M%' is good. LIKE '%Maria' is bad and can cause a Table or Index Scan, because a leading wildcard can't be indexed in the standard way.

  2. This is very important: Full-Text Index updates are asynchronous. For instance, if you perform an INSERT on a table followed by a SELECT with Full-Text Search where you expect the new data to appear, you might not get the data immediately. Depending on your configuration, you may have to wait a few seconds or a day. Generally, Full-Text Indexes are populated when your system does not have many requests.
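
Why the position of the wildcard matters (point 1) can be sketched with a sorted list standing in for the ordered keys of a standard index. This is an analogy under that assumption, not a description of SQL Server internals, and the names and sample data are invented.

```python
from bisect import bisect_left

# A sorted list stands in for the ordered keys of a B-tree index.
names = sorted(["Santa Barbara", "Santa Maria", "Santa Monica", "Ana Maria", "Porto"])

def prefix_search(sorted_names, prefix):
    """LIKE 'prefix%': binary-search to the first candidate, then read a contiguous run."""
    matches = []
    for i in range(bisect_left(sorted_names, prefix), len(sorted_names)):
        if not sorted_names[i].startswith(prefix):
            break  # past the run of matching keys, nothing further can match
        matches.append(sorted_names[i])
    return matches

def suffix_search(sorted_names, suffix):
    """LIKE '%suffix': sort order does not help, so every entry is checked."""
    return [name for name in sorted_names if name.endswith(suffix)]

print(prefix_search(names, "Santa M"))  # ['Santa Maria', 'Santa Monica']
print(suffix_search(names, "Maria"))    # ['Ana Maria', 'Santa Maria']
```

The prefix search can jump straight to the matching range and stop as soon as it leaves it, which corresponds to a seek; the suffix search has no such starting point and must visit every key, which corresponds to a scan.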

Answer by jwanagel

It will perform better, but unless you have a lot of data you won't notice the difference. A SQL full text search index lets you use operators that are more advanced than a simple "LIKE" operation, but if all you do is the equivalent of a LIKE operation against your full text index, then your results will be the same.

Answer by waney

Imagine that you allow users to enter notes/descriptions for their DVDs. In that case it would be good to let them search by description, and full text search will do a better job of that.

Answer by Mohammad Dayyan

To make full-text search in SQL Server behave like LIKE:
First, you have to create a StopList and assign it to your table:

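-- Create an empty stoplist, so that no words are ignored as noise words
CREATE FULLTEXT STOPLIST [MyStopList];
GO
-- Attach the stoplist to the table's existing full-text index
ALTER FULLTEXT INDEX ON dbo.[MyTableName] SET STOPLIST [MyStopList];
GO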

Second, use the following T-SQL script:

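-- Note: CONTAINS supports prefix terms ("term*"); a leading * is not treated
-- as a wildcard, so this does not fully reproduce LIKE '%search text s%'.
SELECT * FROM dbo.[MyTableName] AS mt
WHERE CONTAINS((mt.ColumnName1, mt.ColumnName2, mt.ColumnName3), N'"*search text s*"');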

Answer by Qiulang

If you do not only search for English words, say you search for a Chinese word, then how your FTS tokenizes words will make a big difference to your search, as in the example I gave here: https://stackoverflow.com/a/31396975/301513. But I don't know how SQL Server tokenizes Chinese words. Does it do a good job of that?
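
The tokenization problem can be seen without any database at all. The sketch below contrasts whitespace splitting with overlapping character bigrams, which is one common fallback for text written without spaces; it is an illustration only and makes no claim about how SQL Server or any other engine actually tokenizes Chinese. Both function names are made up.

```python
def whitespace_tokens(text):
    """Split on whitespace, which is enough for many English queries."""
    return text.lower().split()

def bigram_tokens(text):
    """Overlapping two-character tokens, one common fallback for unsegmented text."""
    chars = [c for c in text if not c.isspace()]
    return ["".join(chars[i:i + 2]) for i in range(len(chars) - 1)]

print(whitespace_tokens("The Matrix Reloaded"))  # ['the', 'matrix', 'reloaded']
print(whitespace_tokens("我喜欢黑客帝国"))  # ['我喜欢黑客帝国']
print(bigram_tokens("我喜欢黑客帝国"))  # ['我喜', '喜欢', '欢黑', '黑客', '客帝', '帝国']
```

With whitespace splitting the whole Chinese sentence becomes a single token, so a search for a word inside it finds nothing, whereas the bigram tokens make two-character words such as "黑客" searchable at the cost of some meaningless tokens.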

Answer by benlumley

You may get slightly better results, or else at least have an easier implementation with full text indexing. But it depends on how you want it to work ...

What I have in mind is that if you are searching for two words, with LIKE you then have to manually implement (for example) a way to rank the rows that match both words higher in the list. A fulltext index should do this for you, and also lets you influence the weightings using the relevant syntax.
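
As a rough idea of what that manual work looks like, here is a sketch that ranks rows by how many of the query's words they match using only LIKE. It runs against an in-memory SQLite database as a stand-in, and the `dvds` table, its columns, and the `search` helper are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dvds (title TEXT, actors TEXT)")
conn.executemany(
    "INSERT INTO dvds VALUES (?, ?)",
    [
        ("The Matrix", "Keanu Reeves"),
        ("Speed", "Keanu Reeves"),
        ("Matrix Reloaded", "Keanu Reeves"),
        ("Animatrix", "Various"),
    ],
)

def search(conn, query):
    """Rank rows by how many of the query's words they contain."""
    terms = [f"%{word}%" for word in query.split()]
    if not terms:
        return []
    # One 0/1 LIKE result per term, summed into a hand-rolled relevance score.
    score = " + ".join("((title || ' ' || actors) LIKE ?)" for _ in terms)
    sql = (
        f"SELECT title, score FROM (SELECT title, {score} AS score FROM dvds) "
        "WHERE score > 0 ORDER BY score DESC, title"
    )
    return conn.execute(sql, terms).fetchall()

print(search(conn, "keanu matrix"))
# [('Matrix Reloaded', 2), ('The Matrix', 2), ('Animatrix', 1), ('Speed', 1)]
```

Even this small version has to build SQL dynamically and still treats every match as equally important; weighting title matches above actor matches, or handling word forms, would each need more hand-written logic.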