MySQL: Search engine Lucene vs Database search

Disclaimer: this page is a Chinese-English parallel translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/4638671/

Date: 2020-08-31 18:15:23  Source: igfitidea

Search engine Lucene vs Database search

Tags: mysql, lucene, search-engine

Asked by Santosh Linkha

I am using a MySQL database and have been using database-driven search. What are the advantages and disadvantages of database engines compared with the Lucene search engine? I would like suggestions on when and where to use each.

Accepted answer by Yuval F

I suggest you read "Full Text Search Engines vs. DBMS". A one-liner would be: if the bulk of your use case is full-text search, use Lucene. If the bulk of your use case is joins and other relational operations, use a database. You may use a hybrid solution for a more complicated use case.
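
To make the hybrid idea concrete, here is a minimal sketch, not a definitive design. It assumes a tiny in-memory word index as a stand-in for Lucene and the standard-library `sqlite3` module as a stand-in for MySQL. The table names, sample rows, and helper functions (`build_text_index`, `search_text`, `authors_for`) are all hypothetical. The text index answers the full-text part of the query, and the database then handles the join for the matching ids.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (article id, author id, body)
ARTICLES = [
    (1, 10, "Lucene builds an inverted index for full text search"),
    (2, 11, "MySQL joins tables using relational operations"),
    (3, 10, "Full text search in a relational database"),
]
AUTHORS = [(10, "alice"), (11, "bob")]


def build_text_index(rows):
    """Stand-in for the search engine: map each lowercased word to article ids."""
    index = defaultdict(set)
    for article_id, _author_id, body in rows:
        for word in body.lower().split():
            index[word].add(article_id)
    return index


def search_text(index, query):
    """Return the ids of articles that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


def authors_for(conn, article_ids):
    """Relational step: join the matched article ids against the authors table."""
    if not article_ids:
        return []
    placeholders = ",".join("?" * len(article_ids))
    sql = (
        "SELECT a.id, u.name FROM articles a "
        "JOIN authors u ON u.id = a.author_id "
        f"WHERE a.id IN ({placeholders}) ORDER BY a.id"
    )
    return conn.execute(sql, sorted(article_ids)).fetchall()


# In-memory SQLite database used here only as a stand-in for MySQL
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, author_id INTEGER, body TEXT)"
)
conn.executemany("INSERT INTO authors VALUES (?, ?)", AUTHORS)
conn.executemany("INSERT INTO articles VALUES (?, ?, ?)", ARTICLES)

text_index = build_text_index(ARTICLES)
matches = search_text(text_index, "full text search")
hybrid_rows = authors_for(conn, matches)
print(hybrid_rows)
```

In a real deployment the text index would live in Lucene (or a server built on it) and would need to be kept in sync with the database; this sketch leaves that out.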

Answer by Joel

Use Lucene when you want to index textual documents (of any length) and search for text within those documents, returning a ranked list of the documents that matched the search query. The classic example is search engines, like Google, that use text indexers like Lucene to index and query the content of web pages.

The advantages of using Lucene over a database like MySQL for indexing and searching text are:

  • For the developer: tools to analyse, parse and index textual information (e.g. stemming, plurals, synonyms, tokenisation) in multiple languages. Lucene also scales very well for text search.
  • For the user: high-quality search results. Lucene uses a very good similarity function (to compare the search query against each document), at the heart of which are cosine similarity and term frequency/inverse document frequency (TF-IDF) weighting. This gives good search results with very little upfront tweaking.
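
The similarity function mentioned in the list above can be illustrated with a small sketch. This is not Lucene's actual scoring code, which is more elaborate and has changed across versions. It is a plain TF-IDF plus cosine similarity ranking, assuming a crude lowercase tokenizer and a smoothed idf formula chosen for this example; the sample documents and function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical sample documents
DOCS = [
    "Lucene indexes text documents",
    "MySQL stores relational rows",
    "Searching text with Lucene is fast",
]


def tokenize(text):
    """Crude analysis step: lowercase, then split on non-alphanumeric characters."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return cleaned.split()


def tfidf_vectors(docs):
    """Return one {term: weight} vector per document, plus the idf table."""
    tokenized = [tokenize(doc) for doc in docs]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    # Smoothed idf (an assumption of this sketch) so no term gets weight zero
    idf = {
        term: math.log((1 + n_docs) / (1 + df)) + 1.0
        for term, df in doc_freq.items()
    }
    vectors = [
        {term: count * idf[term] for term, count in Counter(tokens).items()}
        for tokens in tokenized
    ]
    return vectors, idf


def cosine(vec_a, vec_b):
    """Cosine of the angle between two sparse term-weight vectors."""
    dot = sum(weight * vec_b.get(term, 0.0) for term, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query, docs):
    """Return (doc index, score) pairs for matching documents, best first."""
    vectors, idf = tfidf_vectors(docs)
    query_counts = Counter(tokenize(query))
    query_vec = {t: c * idf[t] for t, c in query_counts.items() if t in idf}
    scored = [(i, cosine(query_vec, vec)) for i, vec in enumerate(vectors)]
    hits = [(i, s) for i, s in scored if s > 0.0]
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


RESULTS = rank("lucene text", DOCS)
for doc_id, score in RESULTS:
    print(doc_id, round(score, 3), DOCS[doc_id])
```

For the query "lucene text", both documents that mention Lucene match, and the shorter one ranks first because the query terms make up a larger share of its vector. The document about MySQL does not match and is left out.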

Lots of useful info on Lucene here.

Answer by Eugeniu Torica

We used SQL Server at work for some queries that relied on full-text search. With large amounts of data, SQL Server performs an inner join between the result set returned by the full-text search and the rest of the query, which can be slow if the database runs on a low-powered machine (2 GB of RAM for 20 GB of data). Switching the same query to Lucene improved speed considerably.

Answer by Harry Joy

Lucene search has the advantage of indexing. This post can help you understand Lucene.
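
As a rough illustration of what the indexing advantage means, the sketch below compares scanning every document with looking a term up in an inverted index built ahead of time. It is a simplified model, not Lucene's implementation; the sample documents and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sample documents keyed by id
DOCS = {
    1: "database driven search scans rows",
    2: "lucene builds an index before searching",
    3: "an index maps each term to matching documents",
}


def scan_search(docs, term):
    """No index: inspect every document on every query."""
    return sorted(
        doc_id for doc_id, text in docs.items() if term in text.split()
    )


def build_inverted_index(docs):
    """One-time cost: map each term to the sorted ids of documents containing it."""
    index = defaultdict(list)
    for doc_id in sorted(docs):
        for term in set(docs[doc_id].split()):
            index[term].append(doc_id)
    return index


def index_search(index, term):
    """With an index: a single dictionary lookup, independent of corpus size."""
    return index.get(term, [])


INDEX = build_inverted_index(DOCS)
print(scan_search(DOCS, "index"))
print(index_search(INDEX, "index"))
```

Both calls return the same document ids. The difference is that the scan touches every document per query, while the lookup only reads the list stored for that term, at the price of building and maintaining the index.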
