Note: this page is a Chinese-English parallel translation of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license and attribute it to the original authors (not me). Original Stack Overflow question: http://stackoverflow.com/questions/1284083/

Time: 2020-08-31 13:53:34  Source: igfitidea

Choosing a stand-alone full-text search server: Sphinx or SOLR?

mysql  full-text-search  lucene  solr  sphinx

Asked by knorv

I'm looking for a stand-alone full-text search server with the following properties:

  • Must operate as a stand-alone server that can serve search requests from multiple clients
  • Must be able to do "bulk indexing" by indexing the result of an SQL query: say "SELECT id, text_to_index FROM documents;"
  • Must be free software and must run on Linux with MySQL as the database
  • Must be fast (rules out MySQL's internal full-text search)
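The "bulk indexing" requirement above can be sketched roughly as follows. This is only an illustration, not how Solr or Sphinx implement it: `sqlite3` stands in for MySQL so the example runs anywhere, and the helper names (`fetch_batches`, `demo_connection`), the sample rows, and the `{"id", "text"}` document shape are all assumptions made for the sketch.

```python
import sqlite3
from typing import Dict, Iterator, List

# The query from the question; the table and its contents below are hypothetical.
QUERY = "SELECT id, text_to_index FROM documents;"


def fetch_batches(conn, query: str = QUERY,
                  batch_size: int = 1000) -> Iterator[List[Dict[str, object]]]:
    """Run the query and yield its rows as batches of {"id", "text"} documents."""
    cursor = conn.cursor()
    cursor.execute(query)
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield [{"id": row[0], "text": row[1]} for row in rows]


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database that stands in for the MySQL server."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (id INTEGER PRIMARY KEY, text_to_index TEXT)"
    )
    conn.executemany(
        "INSERT INTO documents (id, text_to_index) VALUES (?, ?)",
        [(1, "first document"), (2, "second document"), (3, "third document")],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    # Each batch would be handed to whatever indexer is in use.
    for batch in fetch_batches(demo_connection(), batch_size=2):
        print(len(batch), "documents in this batch")
```

Batching keeps memory bounded when the table is large; a real setup would pass each batch to the search server's own indexing mechanism instead of printing it.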

The alternatives I've found that have these properties are:

  • Solr (based on Lucene)
  • ElasticSearch (also based on Lucene)
  • Sphinx

My questions:

  • How do they compare?
  • Have I missed any alternatives?
  • I know that each use case is different, but are there certain cases where I would definitely not want to use a certain package?

Answered by Mauricio Scheffer

I've been using Solr successfully for almost 2 years now, and have never used Sphinx, so I'm obviously biased. However, I'll try to keep it objective by quoting the docs or other people. I'll also take patches to my answer :-)

Similarities:

  • Both Solr and Sphinx satisfy all of your requirements. They're fast and designed to index and search large bodies of data efficiently.
  • Both have a long list of high-traffic sites using them (Solr, Sphinx)
  • Both offer commercial support. (Solr, Sphinx)
  • Both offer client API bindings for several platforms/languages (Sphinx, Solr)
  • Both can be distributed to increase speed and capacity (Sphinx, Solr)

Here are some differences:

Related questions:

Answered by larf311

Unless you need to extend the search functionality in any proprietary way, Sphinx is your best bet.

Sphinx advantages:

  1. Development and setup are faster
  2. Much better (and faster) aggregation. This was the killer feature for us.
  3. Not XML. This is what ultimately ruled out Solr for us. We had to return rather large result sets (think hundreds of results) and then aggregate them ourselves since Solr aggregation was lacking. The amount of time spent serializing to and from XML just absolutely killed performance. For small result sets though, it was perfectly fine.
  4. Best documentation I've seen in an open source app
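To make point 3 concrete, here is a minimal sketch of what "return the results as XML, then aggregate them ourselves" can look like on the client. The sample document only mimics the general shape of a Solr XML response; the `category` field, the sample values, and the `count_by_field` helper are assumptions for illustration, not larf311's actual code.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical response body: three documents, each with an id and a category.
SAMPLE_RESPONSE = """<response>
  <result name="response" numFound="3" start="0">
    <doc><str name="id">1</str><str name="category">books</str></doc>
    <doc><str name="id">2</str><str name="category">music</str></doc>
    <doc><str name="id">3</str><str name="category">books</str></doc>
  </result>
</response>"""


def count_by_field(xml_text: str, field: str) -> Counter:
    """Parse the XML and count documents per value of the given string field."""
    root = ET.fromstring(xml_text)
    counts: Counter = Counter()
    for doc in root.iter("doc"):
        for node in doc.findall("str"):
            if node.get("name") == field:
                counts[node.text] += 1
    return counts


if __name__ == "__main__":
    print(count_by_field(SAMPLE_RESPONSE, "category"))
```

Every document has to be serialized by the server, transferred, and parsed by the client before any counting can start, so the cost grows with the size of the result set. That is consistent with the observation above that small result sets were fine.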

Solr advantages:

  1. Can be extended.
  2. Can hit it directly from a web app, i.e., you can have autocomplete-like searches hit the Solr server directly via AJAX.
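Solr is queried over HTTP, so the request an autocomplete widget sends is just a URL. The sketch below builds such a URL; it assumes a core named `docs`, a field named `title`, and a prefix made of a plain word with no query-syntax characters. The `q`, `rows`, and `wt` parameters are standard Solr query parameters, but the host, path layout, and the `autocomplete_url` helper are hypothetical and depend on the Solr version and setup.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def autocomplete_url(base_url: str, core: str, field: str,
                     prefix: str, rows: int = 10) -> str:
    """Build a select URL for a prefix query such as title:sph*."""
    params = {
        "q": f"{field}:{prefix}*",  # prefix (wildcard) query on one field
        "rows": rows,               # maximum number of suggestions to return
        "wt": "json",               # ask for a JSON response instead of XML
    }
    return f"{base_url}/solr/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    url = autocomplete_url("http://localhost:8983", "docs", "title", "sph", rows=5)
    print(url)
    # Decode the query string again to show what the server would receive.
    print(parse_qs(urlsplit(url).query))
```

A browser-side AJAX call would request the same URL. Exposing a search server directly to browsers has security implications, so in practice it is often placed behind a proxy that restricts which requests get through.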

Answered by Augiwan

Note: There are many users with the same question in mind.

So, to answer to the point:

Which and why?

  • Use Solr if you intend to use it in your web app (for example, a site search engine). It will definitely turn out to be great, thanks to its API. You will definitely need that power for a web app.

  • Use Sphinx if you want to search through tons of documents/files really quickly. It indexes really fast too. I would recommend not using it in an app that involves JSON or parsing XML to get the search results. Use it for direct DB searches. It works great with MySQL.

Alternatives

Although these are the giants, there are plenty more. Also, there are those that use these to power their custom frameworks. So, I would say that you really haven't missed any, although there is one, Elasticsearch, that has a good user base.

Answered by lo_fye

I have been using Sphinx for almost a year now, and it has been amazing. I can index 1.5 million documents in about a minute on my MacBook, and even quicker on the server. I am also using Sphinx to limit searches to places within specific latitudes and longitudes, and it is very fast. Also, how results are ranked is very tweakable. It is easy to install and set up if you read a tutorial or two. It is almost at 1.0 status, but their release candidates have been rock solid.
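As a rough illustration of the kind of restriction described above (keeping only places inside a latitude/longitude range), here is the filter logic in plain Python. This is not Sphinx's API and says nothing about how Sphinx evaluates such filters internally; the `Place` type, the `within_box` helper, and the sample coordinates are assumptions for the sketch.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Place:
    name: str
    lat: float  # degrees, -90 to 90
    lon: float  # degrees, -180 to 180


def within_box(places: Iterable[Place], min_lat: float, max_lat: float,
               min_lon: float, max_lon: float) -> List[Place]:
    """Keep places whose coordinates fall inside the bounding box (inclusive).

    Boxes that cross the 180-degree meridian are not handled here.
    """
    return [
        p for p in places
        if min_lat <= p.lat <= max_lat and min_lon <= p.lon <= max_lon
    ]


if __name__ == "__main__":
    places = [
        Place("Toronto", 43.65, -79.38),
        Place("Montreal", 45.50, -73.57),
        Place("Vancouver", 49.28, -123.12),
    ]
    for place in within_box(places, 43.0, 46.0, -80.0, -73.0):
        print(place.name)
```

A search server applies the same kind of range condition to stored numeric attributes at query time, alongside the full-text match, rather than scanning a list in application code.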

Answered by Angsuman Chakraborty

Lucene/Solr appears to be more fully featured, has been around for more years, and has a much stronger user community. IMHO, if you can get past the initial setup issues that some seem to have faced (we did not), then I would say Lucene/Solr is your best bet.