Java: How to configure a Solr server for spell check functionality
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/18611778/
Asked by MSA
I want to implement the spellcheck functionality offered by Solr on top of a MySQL database, but I don't understand how.
Here is the basic flow of what I want to do.
I have a simple inputText (in JSF), and if I type the word "shwo", the response in the OutputLabel should be "show".
First of all I'm using the following tools and frameworks:
JBoss application server 6.1.
Eclipse
JPA
JSF(Primefaces)
Steps I've completed so far:
Step 1: Download the Solr server from http://lucene.apache.org/solr/downloads.html and extract its contents.
Step 2: Add an environment variable that points to the Solr home directory (where you have the Solr server):
solr.solr.home=D:\JBOSS\solr-4.4.0\solr-4.4.0\example\solr
Step 3: Open the Solr WAR and add an env-entry to solr.war\WEB-INF\web.xml (the easy way):
<env-entry>
<env-entry-name>solr/home</env-entry-name>
<env-entry-value>D:\JBOSS\solr-4.4.0\solr-4.4.0\example\solr</env-entry-value>
<env-entry-type>java.lang.String</env-entry-type>
</env-entry>
Or import the project, make the change, and build the WAR.
Step 4: Open localhost:8080/solr/ in a browser.
The Solr console appears.
So far everything works well.
I have found some code that looks useful to me; running it produces this log line:
[collection1] webapp=/solr path=/spell params={spellcheck=on&q=whatever&wt=javabin&qt=/spell&version=2&spellcheck.build=true} hits=0 status=0 QTime=16
Here is the code that gives the result from above:
SolrServer solr;
try {
    solr = new CommonsHttpSolrServer("http://localhost:8080/solr");
    ModifiableSolrParams params = new ModifiableSolrParams();
    params.set("qt", "/spell");
    params.set("q", "whatever");
    params.set("spellcheck", "on");
    params.set("spellcheck.build", "true");
    QueryResponse response = solr.query(params);
    SpellCheckResponse spellCheckResponse = response.getSpellCheckResponse();
    if (!spellCheckResponse.isCorrectlySpelled()) {
        for (Suggestion suggestion : spellCheckResponse.getSuggestions()) {
            System.out.println("original token: " + suggestion.getToken() + " - alternatives: " + suggestion.getAlternatives());
        }
    }
} catch (Exception e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
I also added the following to data-config.xml:
<?xml version="1.0" encoding="UTF-8" ?>
<dataConfig>
<dataSource type="JdbcDataSource" name="altadict"
driver="com.mysql.jdbc.Driver"
url="jdbc:mysql://localhost:3306/myproject"
user="root"
password=""
/>
<document name="myproject">
<entity name="myproject" query="SELECT * FROM words">
<field column="Id" name="Id" />
<field column="Cuvint" name="Cuvint" />
<field column="TradDiac" name="TradDiac" />
<field column="Explicatie" name="Explicatie" />
<field column="TipCuvint" name="TipCuvint" />
<field column="ItalicParant" name="ItalicParant" />
</entity>
</document>
</dataConfig>
schema.xml
<field name="Id" type="tlong" indexed="true" stored="true" required="true"/>
<field name="Cuvint" type="string" indexed="true" stored="true" required="true"/>
<field name="TradDiac" type="string" indexed="true" stored="true" required="true"/>
<field name="Explicatie" type="string" indexed="true" stored="true"/>
<field name="TipCuvint" type="string" indexed="true" stored="true" required="true"/>
<field name="ItalicParant" type="string" indexed="true" stored="true"/>
solrconfig.xml
<!-- altadict Request Handler -->
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">data-config.xml</str>
</lst>
</requestHandler>
<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<str name="df">Cuvint</str>
<str name="spellcheck.dictionary">default</str>
<str name="spellcheck">on</str>
<str name="spellcheck.extendedResults">true</str>
<str name="spellcheck.count">10</str>
<str name="spellcheck.maxResultsForSuggest">5</str>
<str name="spellcheck.collate">true</str>
<str name="spellcheck.collateExtendedResults">true</str>
<str name="spellcheck.maxCollationTries">10</str>
<str name="spellcheck.maxCollations">5</str>
</lst>
<arr name="last-components">
<str>spellcheck</str>
</arr>
</requestHandler>
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<str name="queryAnalyzerFieldType">string</str> <!-- Replace with Field Type of your schema -->
<lst name="spellchecker">
<str name="name">default</str>
<str name="field">examplew</str> <!-- Replace with field name as per your scheme -->
<str name="spellcheckIndexDir">./spellchecker</str>
<str name="buildOnOptimize">true</str>
<str name="buildOnCommit">true</str>
</lst>
<!-- a spellchecker that uses a different distance measure -->
<lst name="spellchecker">
<str name="name">jarowinkler</str>
<str name="field">spell</str>
<str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str>
<str name="spellcheckIndexDir">./spellchecker2</str>
</lst>
</searchComponent>
and libs
Questions:
1. How do I connect to my database and search its content to see whether any words could match?
2. How do I do the configuration (solrconfig.xml, schema.xml, etc.)?
3. How do I send a string from my view (xhtml) so that the Solr server knows what to look for?
4. How do I get the correct word from the Cuvint database column? For example, for "wodr" I want Solr to return "word".
I have read all the information I could find about Solr, but it's still unclear:
Links:
Main page: http://lucene.apache.org/solr/
Main page tutorial: http://lucene.apache.org/solr/4_4_0/tutorial.html
Solr Wiki:
http://wiki.apache.org/solr/Solrj (official SolrJ documentation)
http://wiki.apache.org/solr/SpellCheckComponent
Solr config:
http://wiki.apache.org/solr/SolrConfigXml
http://www.installationpage.com/solr/solr-configuration-tutorial-schema-solrconfig-xml/
http://wiki.apache.org/solr/SchemaXml
StackOverflow proof: Solr Did you mean (Spell check component)
Solr database integration:
http://www.slideshare.net/th0masr/integrating-the-solr-search-engine
http://www.cabotsolutions.com/2009/05/using-solr-lucene-for-full-text-search-with-mysql-db/
Solr spell check:
http://docs.lucidworks.com/display/solr/Spell+Checking
http://searchhub.org/2010/08/31/getting-started-spell-checking-with-apache-lucene-and-solr/
http://techiesinsight.blogspot.ro/2012/06/using-solr-spellchecker-from-java.html
http://blog.websolr.com/post/2748574298/spellcheck-with-solr-spellcheckcomponent
How to use SpellingResult class in SolrJ
I really need your help. Regards.
Accepted answer by Jayendra
1. How do I connect to my database and search its content to see whether any words could match?
You need to index the data from MySQL into Solr.
This can be done by building an app that reads the records from MySQL and feeds the data to Solr.
Or, as already answered, you can use the Data Import Handler (DIH), which lets you:
connect to MySQL, load the data, and index it into Solr;
perform incremental updates.
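As a minimal sketch of the DIH route: an import is started by requesting the /dataimport handler (registered in the question's solrconfig.xml) with a command parameter. The helper below only builds those URLs with the Java standard library. The class name DihUrls is hypothetical, and the base URL and the core name collection1 are assumptions taken from the question's setup. A delta import also needs a deltaQuery in data-config.xml, which the question's config does not define.

```java
/** Builds request URLs for the /dataimport handler (hypothetical helper). */
final class DihUrls {

    private DihUrls() {}

    /**
     * @param solrBase base URL of the Solr webapp, assumed "http://localhost:8080/solr"
     * @param core     core name, assumed "collection1" as in the question's log line
     * @param delta    true for an incremental (delta) import, false for a full import
     */
    static String importUrl(String solrBase, String core, boolean delta) {
        String command = delta ? "delta-import" : "full-import";
        // A full import clears the index first (clean=true);
        // a delta import keeps the existing documents (clean=false).
        return solrBase + "/" + core + "/dataimport"
                + "?command=" + command
                + "&clean=" + !delta
                + "&commit=true";
    }

    public static void main(String[] args) {
        System.out.println(importUrl("http://localhost:8080/solr", "collection1", false));
        System.out.println(importUrl("http://localhost:8080/solr", "collection1", true));
    }
}
```

Requesting the full-import URL (from a browser or from code) should start the import; the same handler with command=status reports progress.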
2. How do I do the configuration (solrconfig.xml, schema.xml, etc.)?
The field used for the spell checker should have a field type that applies text analysis.
Because your field is defined as string, it is not tokenized.
schema.xml:
<field name="Cuvint" type="text" indexed="true" stored="true" required="true"/>
Also, in solrconfig.xml, set the field you want to be considered for spelling suggestions:
<str name="field">examplew</str> <!-- Replace with field name as per your scheme -->
Check the example.
3. How do I send a string from my view (xhtml) so that the Solr server knows what to look for?
Usually, we implement this feature by combining search and spelling suggestions in a single Solr request.
When we don't get any results from Solr, we check whether spell check suggestions are available and display them as a "Did you mean" suggestion.
Also, instead of waiting for the spelling suggestion, we provide type-ahead suggestions to the user, which prevents a round trip to the server.
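The combined search-and-suggest flow described above can be sketched with the Java standard library alone. The class name DidYouMean and both method names are hypothetical; the /spell handler path comes from the question's solrconfig.xml, and the core URL is an assumption. In SolrJ, the hit count and the collation would typically come from the query response (for example via getSpellCheckResponse().getCollatedResult()); here they are passed in as plain values so the decision logic stays self-contained.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Sketch of a "Did you mean" flow on top of a /spell request handler. */
final class DidYouMean {

    private DidYouMean() {}

    /** Builds one request that asks for search results and spelling suggestions together. */
    static String spellUrl(String coreUrl, String userInput) {
        try {
            return coreUrl + "/spell?q=" + URLEncoder.encode(userInput, "UTF-8")
                    + "&spellcheck=true&spellcheck.collate=true&wt=json";
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required to be available on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Returns the text for the output label: a suggestion only when the search
     * found nothing and Solr offered a collation, otherwise an empty string.
     */
    static String message(long hits, String collation) {
        if (hits > 0 || collation == null || collation.isEmpty()) {
            return "";
        }
        return "Did you mean: " + collation + "?";
    }

    public static void main(String[] args) {
        System.out.println(spellUrl("http://localhost:8080/solr/collection1", "wodr"));
        System.out.println(message(0, "word"));
    }
}
```

In a JSF view, the inputText value would be passed as userInput from a backing bean, and the string returned by message would be bound to the OutputLabel.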
4. How do I get the correct word from the Cuvint database column? For example, for "wodr" I want Solr to return "word".
Check the example to configure spell check; that should provide the suggestion.
Answer by Artem Lukanin
Import your database into Solr using DataImportHandler to be able to search spellings in Solr.