Java: How do you read the index in Lucene to do a search?
Original source: http://stackoverflow.com/questions/16847857/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
How do you read the index in Lucene to do a search?
Asked by SaB
Lucene 4.3 Newbie
How do I get a simple search working in Lucene 4.3?
I adapted the outline at http://lucene.apache.org/core/4_3_0/core/overview-summary.html#overview_description into a simple Java test case.
The example starts with:
DirectoryReader ireader = DirectoryReader.open(directory);
IndexSearcher isearcher = new IndexSearcher(ireader);
But DirectoryReader is not visible (protected) according to the docs. So it doesn't seem as if you can use the DirectoryReader.
So I did some digging and tried various permutations to avoid using DirectoryReader directly, including:
File indexdir = new File("D:\\lucenetest\\"); // location of my index (backslashes must be escaped in a Java string)
Directory directory = FSDirectory.open(indexdir);
IndexReader ireader = IndexReader.open(FSDirectory.open(indexdir)); //ERROR NoSuchMethodError
//IndexReader ireader = IndexReader.open(directory); //variation ERROR NoSuchMethodError
IndexSearcher isearcher = new IndexSearcher(ireader);
And so on (including trying AtomicReaders). Nothing seems to work. (I verified that Lucene Core is properly imported.) The indexing works fine.
I looked at the Lucene Sample Search Code for more clues. http://lucene.apache.org/core/4_2_1/demo/src-html/org/apache/lucene/demo/SearchFiles.html
IndexReader reader = DirectoryReader.open(FSDirectory.open(new File(index))); //DirectoryReader not visible error
IndexSearcher searcher = new IndexSearcher(reader);
This also does not work when used in a simple example file.
I have been able to get simple indexing to work and previously was able to get the Lucene demo working (index and search). But, I cannot seem to get a simple search to work.
Any clues?
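A NoSuchMethodError at runtime usually indicates that the class found on the runtime classpath is a different version from the one the code was compiled against, for example two Lucene jars of different versions being present. Note also that in Lucene 4.x DirectoryReader.open(...) is a public static method; only the constructor is protected. As a rough sketch for ruling out a classpath mix-up, the following prints where a class was loaded from. ClassOrigin and locate are hypothetical helper names, not part of Lucene, and the two class names in main are an assumption based on the snippets above.

```java
import java.security.CodeSource;

final class ClassOrigin {

    /** Returns the location (jar or directory) a class was loaded from, or a short note if unknown. */
    static String locate(String className) {
        try {
            // initialize=false: only load the class, do not run its static initializers
            Class<?> cls = Class.forName(className, false, ClassOrigin.class.getClassLoader());
            CodeSource source = cls.getProtectionDomain().getCodeSource();
            if (source == null || source.getLocation() == null) {
                // Typical for classes that ship with the JDK itself
                return "(no code source available)";
            }
            return source.getLocation().toString();
        } catch (ClassNotFoundException e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Assumption: these are the classes the failing calls resolve against
        System.out.println(locate("org.apache.lucene.index.IndexReader"));
        System.out.println(locate("org.apache.lucene.index.DirectoryReader"));
    }
}
```

If the two lines point at different jars, or at a jar whose version is not 4.3, that mismatch would be the first thing to fix.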
Accepted answer by futuretelematics
I usually use the following class, which encapsulates all operations against a Lucene index (v4).
It uses near-real-time (NRT) access to the index, so the index reader sees updates almost as soon as they are made.
NOTE: It also uses Lombok (the @Slf4j annotation provides the log field). LuceneConstants, CollectionUtils and LucenePageResults appear to be the author's own helper classes and are not shown here.
@Slf4j
public class LuceneIndex {
/////////////////////////////////////////////////////////////////////////////////////////
// STATE (see http://blog.mikemccandless.com/2011/11/near-real-time-readers-with-lucenes.html)
/////////////////////////////////////////////////////////////////////////////////////////
private final IndexWriter _indexWriter;
private final TrackingIndexWriter _trackingIndexWriter;
private final NRTManager _searchManager;
LuceneNRTReopenThread _reopenThread = null;
private long _reopenToken; // token returned by the index update/delete methods
/////////////////////////////////////////////////////////////////////////////////////////
// CONSTRUCTOR
/////////////////////////////////////////////////////////////////////////////////////////
/**
* Constructor based on an instance of the type responsible for persisting the Lucene index
*/
public LuceneIndex(final Directory luceneDirectory,
final Analyzer analyzer) {
try {
// Create the indexWriter
_indexWriter = new IndexWriter(luceneDirectory,
new IndexWriterConfig(LuceneConstants.VERSION,
analyzer));
_trackingIndexWriter = new NRTManager.TrackingIndexWriter(_indexWriter);
// Create the SearchManager to exec the search
_searchManager = new NRTManager(_trackingIndexWriter,
new SearcherFactory(),
true);
// Start the thread in charge of re-opening the index so that searches see near-real-time changes
// The index is refreshed every 60 seconds when nobody is waiting
// and every 100 milliseconds whenever someone is waiting (see the search method)
// (see http://lucene.apache.org/core/4_3_0/core/org/apache/lucene/search/NRTManagerReopenThread.html)
_reopenThread = new LuceneNRTReopenThread(_searchManager,
60.0, // when there is nobody waiting
0.1); // when there is someone waiting
_reopenThread.startReopening();
} catch (IOException ioEx) {
// if (luceneDirectory instanceof JdbcDirectory) {
// throw new IllegalStateException("The BBDD table for the lucene index could not be created: " + ioEx.getMessage(),ioEx);
// } else {
throw new IllegalStateException("Lucene index could not be created: " + ioEx.getMessage(),ioEx); // keep the original exception as the cause
// }
}
}
/////////////////////////////////////////////////////////////////////////////////////////
// FINALIZER
/////////////////////////////////////////////////////////////////////////////////////////
@Override
protected void finalize() throws Throwable {
this.close();
super.finalize();
}
/**
* Closes every index
*/
public void close() {
try {
// stop the index reader re-open thread
_reopenThread.stopReopening();
_reopenThread.interrupt();
// Close the search manager
_searchManager.close();
// Close the indexWriter, committing everything that's pending
_indexWriter.commit();
_indexWriter.close();
} catch(IOException ioEx) {
log.error("Error while closing lucene index: {}",ioEx.getMessage(),
ioEx);
}
}
/////////////////////////////////////////////////////////////////////////////////////////
// REOPEN-THREAD: Thread in charge of re-opening the IndexReader so that it has access to the
// latest IndexWriter changes
/////////////////////////////////////////////////////////////////////////////////////////
private class LuceneNRTReopenThread
extends NRTManagerReopenThread {
volatile boolean _finished = false;
public LuceneNRTReopenThread(final NRTManager manager,
final double targetMaxStaleSec,final double targetMinStaleSec) {
super(manager, targetMaxStaleSec, targetMinStaleSec);
this.setName("NRT Reopen Thread");
this.setPriority(Math.min(Thread.currentThread().getPriority()+2,
Thread.MAX_PRIORITY));
this.setDaemon(true);
}
public synchronized void startReopening() {
_finished = false;
this.start();
}
public synchronized void stopReopening() {
_finished = true;
}
@Override
public void run() {
while (!_finished) {
super.run();
}
}
}
/////////////////////////////////////////////////////////////////////////////////////////
//
/////////////////////////////////////////////////////////////////////////////////////////
/**
* Index a Lucene document
* @param doc the document to be indexed
*/
public void index(final Document doc) {
// Index in Lucene
try {
_reopenToken = _trackingIndexWriter.addDocument(doc);
log.debug("document indexed in lucene");
} catch(IOException ioEx) {
log.error("Error while in Lucene index operation: {}",ioEx.getMessage(),
ioEx);
} finally {
try {
_indexWriter.commit();
} catch (IOException ioEx) {
log.error("Error while committing changes to Lucene index: {}",ioEx.getMessage(),
ioEx);
}
}
}
/**
* Updates the index info for a Lucene document
* @param recordIdTerm term identifying the document to be replaced
* @param doc the document to be indexed
*/
public void reIndex(final Term recordIdTerm,
final Document doc) {
// Re-index in Lucene
try {
_reopenToken = _trackingIndexWriter.updateDocument(recordIdTerm,
doc);
log.debug("{} document re-indexed in lucene",recordIdTerm.text());
} catch(IOException ioEx) {
log.error("Error in lucene re-indexing operation: {}",ioEx.getMessage(),
ioEx);
} finally {
try {
_indexWriter.commit();
} catch (IOException ioEx) {
log.error("Error while committing changes to Lucene index: {}",ioEx.getMessage(),
ioEx);
}
}
}
/**
* Unindex a lucene document
* @param idTerm term used to locate the document to be unindexed
* IMPORTANT! the term must filter only the document and only the document
* otherwise all matching docs will be unindexed
*/
public void unIndex(final Term idTerm) {
try {
_reopenToken = _trackingIndexWriter.deleteDocuments(idTerm);
log.debug("{}={} term matching records un-indexed from lucene",idTerm.field(),
idTerm.text());
} catch(IOException ioEx) {
log.error("Error in un-index lucene operation: {}",ioEx.getMessage(),
ioEx);
} finally {
try {
_indexWriter.commit();
} catch (IOException ioEx) {
log.error("Error while committing changes to Lucene index: {}",ioEx.getMessage(),
ioEx);
}
}
}
/**
* Delete all lucene index docs
*/
public void truncate() {
try {
_reopenToken = _trackingIndexWriter.deleteAll();
log.warn("lucene index truncated");
} catch(IOException ioEx) {
log.error("Error truncating lucene index: {}",ioEx.getMessage(),
ioEx);
} finally {
try {
_indexWriter.commit();
} catch (IOException ioEx) {
log.error("Error truncating lucene index: {}",ioEx.getMessage(),
ioEx);
}
}
}
/////////////////////////////////////////////////////////////////////////////////////////
// COUNT-SEARCH
/////////////////////////////////////////////////////////////////////////////////////////
/**
* Count the number of results returned by a search against the lucene index
* @param qry the query
* @return the total number of documents matching the query
*/
public long count(final Query qry) {
long outCount = 0;
try {
_searchManager.waitForGeneration(_reopenToken); // wait until the index is re-opened
IndexSearcher searcher = _searchManager.acquire();
try {
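// Lucene 4.x rejects a hit count of 0 in search(qry,0); TotalHitCountCollector only counts the hits
TotalHitCountCollector counter = new TotalHitCountCollector();
searcher.search(qry,counter);
outCount = counter.getTotalHits();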
log.debug("count-search executed against lucene index returning {}",outCount);
} finally {
_searchManager.release(searcher);
}
} catch (IOException ioEx) {
log.error("Error re-opening the index {}",ioEx.getMessage(),
ioEx);
}
return outCount;
}
/**
* Executes a search query
* @param qry the query to be executed
* @param sortFields the sort criteria (may be null or empty)
* @param firstResultItemOrder the order number of the first element to be returned
* @param numberOfResults number of results to be returned
* @return a page of search results
*/
public LucenePageResults search(final Query qry,Set<SortField> sortFields,
final int firstResultItemOrder,final int numberOfResults) {
LucenePageResults outDocs = null;
try {
_searchManager.waitForGeneration(_reopenToken); // wait until the index is re-opened for the last update
IndexSearcher searcher = _searchManager.acquire();
try {
// sort criteria
SortField[] theSortFields = null;
if (CollectionUtils.hasData(sortFields)) theSortFields = CollectionUtils.toArray(sortFields,SortField.class);
Sort theSort = CollectionUtils.hasData(theSortFields) ? new Sort(theSortFields)
: null;
// number of results to be returned
int theNumberOfResults = firstResultItemOrder + numberOfResults;
// Execute the search (the sort criteria are only applied when they are not null)
TopDocs scoredDocs = theSort != null ? searcher.search(qry,
theNumberOfResults,
theSort)
: searcher.search(qry,
theNumberOfResults);
log.debug("query {} {} executed against lucene index: returned {} total items, {} in this page",qry.toString(),
(theSort != null ? theSort.toString() : ""),
scoredDocs != null ? scoredDocs.totalHits : 0,
scoredDocs != null ? scoredDocs.scoreDocs.length : 0);
outDocs = LucenePageResults.create(searcher,
scoredDocs,
firstResultItemOrder,numberOfResults);
} finally {
_searchManager.release(searcher);
}
} catch (IOException ioEx) {
log.error("Error freeing the searcher {}",ioEx.getMessage(),
ioEx);
}
return outDocs;
}
/////////////////////////////////////////////////////////////////////////////////////////
// INDEX MAINTENANCE
/////////////////////////////////////////////////////////////////////////////////////////
/**
* Merges the Lucene index segments into one
* (this should NOT be used routinely, only rarely for index maintenance)
*/
public void optimize() {
try {
_indexWriter.forceMerge(1);
log.debug("Lucene index merged into one segment");
} catch (IOException ioEx) {
log.error("Error optimizing lucene index {}",ioEx.getMessage(),
ioEx);
}
}
}
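The search method above asks Lucene for the top firstResultItemOrder + numberOfResults hits and hands them to LucenePageResults.create, which is not shown. A minimal sketch of the paging step that helper presumably performs follows, using only the JDK. PageSlice and page are hypothetical names, plain integers stand in for the ScoreDoc entries, and firstResultItemOrder is assumed to be a zero-based offset.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class PageSlice {

    /**
     * Returns the requested page out of the already-fetched top hits.
     * Assumption: firstResultItemOrder is a zero-based offset into topHits.
     */
    static <T> List<T> page(List<T> topHits, int firstResultItemOrder, int numberOfResults) {
        if (firstResultItemOrder < 0 || numberOfResults < 0) {
            throw new IllegalArgumentException("offset and page size must not be negative");
        }
        // Clamp both ends so that a page past the end is simply empty or shorter
        int from = Math.min(firstResultItemOrder, topHits.size());
        int to = (int) Math.min((long) from + numberOfResults, topHits.size());
        return new ArrayList<>(topHits.subList(from, to));
    }

    public static void main(String[] args) {
        // Stand-ins for the doc ids of the top 5 hits returned by a search
        List<Integer> topHits = Arrays.asList(10, 11, 12, 13, 14);
        System.out.println(page(topHits, 3, 2)); // prints [13, 14]
        System.out.println(page(topHits, 4, 3)); // prints [14] (page is cut short at the end)
    }
}
```

Fetching offset plus page size and discarding the leading hits is simple, but the cost grows with the page number; for deep paging Lucene 4.x also offers IndexSearcher.searchAfter.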
Answer by lizzie
A very simple search can be performed using this sample code:
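// Imports needed by this snippet (Lucene 4.x)
import java.io.File;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.ScoreDoc;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
// directory where your index is stored
File path = new File(" ... /solr/solr/Collection1/data/index");
Directory index = FSDirectory.open(path);
IndexReader reader = DirectoryReader.open(index);
IndexSearcher searcher = new IndexSearcher(reader);
Term t = new Term("myfield", "myvalue");
// Get the top 10 docs
Query query = new TermQuery(t);
TopDocs tops = searcher.search(query, 10);
ScoreDoc[] scoreDoc = tops.scoreDocs;
System.out.println(scoreDoc.length);
for (ScoreDoc score : scoreDoc) {
    System.out.println("DOC " + score.doc + " SCORE " + score.score);
}
// Get the number of documents that contain the term
int freq = reader.docFreq(t);
System.out.println("FREQ " + freq);
// Release the index files once searching is done
reader.close();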