Java: adding documents to an existing index in Lucene
Original source: http://stackoverflow.com/questions/4033774/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): StackOverflow
Adding documents to an existing index in Lucene
Asked by jacobian
I would like to ask how to add new documents to an existing Lucene index. In the source code below, I just changed the boolean parameter of IndexWriter to false.
IndexWriter indexWriter = new IndexWriter(
FSDirectory.open(indexDir),
new SimpleAnalyzer(),
false,
IndexWriter.MaxFieldLength.LIMITED);
because false means that the index will still be open and not closed. Also, to add a new document I should use
indexWriter.addDocument(doc)
But my question is how exactly I can add new documents to an existing Lucene index. I am a bit lost as to where in the Lucene class I should put the path of a new directory containing new documents, so that Lucene can index those documents and add them to the existing index. Any help would be appreciated. Thanks.
import org.apache.lucene.analysis.SimpleAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.FSDirectory;

import java.io.File;
import java.io.FileReader;
import java.io.IOException;

public class testlucene1 {

    public static void main(String[] args) throws Exception {
        File indexDir = new File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
        File dataDir = new File("C:/Users/Raden/Documents/lucene/LuceneHibernate/adi");
        String suffix = "txt";
        testlucene1 indexer = new testlucene1();
        int numIndex = indexer.index(indexDir, dataDir, suffix);
        System.out.println("Total files indexed " + numIndex);
    }

    private int index(File indexDir, File dataDir, String suffix) throws Exception {
        IndexWriter indexWriter = new IndexWriter(
                FSDirectory.open(indexDir),
                new SimpleAnalyzer(),
                false,
                IndexWriter.MaxFieldLength.LIMITED);
        indexWriter.setUseCompoundFile(false);
        indexDirectory(indexWriter, dataDir, suffix);
        int numIndexed = indexWriter.maxDoc();
        indexWriter.optimize();
        indexWriter.close();
        return numIndexed;
    }

    private void indexDirectory(IndexWriter indexWriter, File dataDir, String suffix) throws IOException {
        File[] files = dataDir.listFiles();
        for (int i = 0; i < files.length; i++) {
            File f = files[i];
            if (f.isDirectory()) {
                indexDirectory(indexWriter, f, suffix);
            } else {
                indexFileWithIndexWriter(indexWriter, f, suffix);
            }
        }
    }

    private void indexFileWithIndexWriter(IndexWriter indexWriter, File f, String suffix) throws IOException {
        if (f.isHidden() || f.isDirectory() || !f.canRead() || !f.exists()) {
            return;
        }
        if (suffix != null && !f.getName().endsWith(suffix)) {
            return;
        }
        System.out.println("Indexing file " + f.getCanonicalPath());
        Document doc = new Document();
        doc.add(new Field("contents", new FileReader(f)));
        doc.add(new Field("filename", f.getCanonicalPath(), Field.Store.YES, Field.Index.ANALYZED));
        indexWriter.addDocument(doc);
    }
}
Accepted answer by recursive9
also to add new document I should use .... but my question is how exactly can I add new documents to an existing lucene index
Can you please clarify what you mean? You know how to add documents to an index, as you stated, but then you ask how to... add new documents?
Answer by XL Zheng
According to the Lucene API, when you construct the IndexWriter, the constructor lets you specify an IndexWriterConfig.
IndexWriter(Directory d, IndexWriterConfig conf)
IndexWriterConfig allows you to specify the open mode:
IndexWriterConfig conf = new IndexWriterConfig(analyzer);
conf.setOpenMode(IndexWriterConfig.OpenMode.APPEND);
And you have 3 options:
- IndexWriterConfig.OpenMode.APPEND
- IndexWriterConfig.OpenMode.CREATE
- IndexWriterConfig.OpenMode.CREATE_OR_APPEND
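The three modes differ in what happens to documents that are already in the index, and in whether an index has to exist beforehand. The following is a toy model of that behaviour using only the Java standard library. It is not Lucene code: OpenModeSketch, Mode and open are hypothetical names, and a plain list of file names stands in for the index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of the three open modes; a stand-in, not Lucene's IndexWriterConfig.OpenMode. */
final class OpenModeSketch {

    /** Hypothetical mirror of the three mode names listed above. */
    enum Mode { APPEND, CREATE, CREATE_OR_APPEND }

    /**
     * Returns the documents a writer would start with.
     * Pass null for existing when no index exists yet.
     */
    static List<String> open(List<String> existing, Mode mode) {
        switch (mode) {
            case APPEND:
                // Only opens an index that already exists.
                if (existing == null) {
                    throw new IllegalStateException("no index to append to");
                }
                return new ArrayList<>(existing);
            case CREATE:
                // Always starts empty; anything already indexed is replaced.
                return new ArrayList<>();
            case CREATE_OR_APPEND:
                // Keeps existing documents if there are any, otherwise starts empty.
                return existing == null ? new ArrayList<>() : new ArrayList<>(existing);
            default:
                throw new IllegalArgumentException("unknown mode: " + mode);
        }
    }

    public static void main(String[] args) {
        List<String> existing = Arrays.asList("old.txt");
        for (Mode mode : Mode.values()) {
            List<String> docs = open(existing, mode);
            docs.add("new.txt"); // stands in for indexWriter.addDocument(doc)
            System.out.println(mode + " -> " + docs);
        }
    }
}
```

Run against an index that already holds old.txt, the model keeps old.txt for APPEND and CREATE_OR_APPEND and drops it for CREATE. For adding documents to an existing index, that points to APPEND, or to CREATE_OR_APPEND if the index may not exist yet.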
Answer by Xodarap
When you instantiate a new IndexWriter, you will not create a new index (unless you explicitly tell Lucene to force a new one). So your code will work, regardless of whether the index already exists.
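On the "where do I put the new directory" part of the question: in the code from the question, indexDir and dataDir are separate arguments that happen to hold the same path. So one option is to leave indexDir on the existing index and point dataDir at the folder with the new files. A minimal stdlib-only sketch of collecting those files is below. NewDocumentFinder and findFiles are hypothetical names, no Lucene calls are made, and each returned path would still have to be turned into a Document and passed to indexWriter.addDocument(doc).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper that lists the files a new data directory would contribute. */
final class NewDocumentFinder {

    /** Recursively lists readable regular files under dataDir whose name ends with suffix. */
    static List<Path> findFiles(Path dataDir, String suffix) throws IOException {
        try (Stream<Path> paths = Files.walk(dataDir)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(Files::isReadable)
                    .filter(p -> p.getFileName().toString().endsWith(suffix))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // First argument: the directory holding the new files (defaults to the current directory).
        Path newDataDir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path file : findFiles(newDataDir, "txt")) {
            System.out.println("Would index " + file.toAbsolutePath());
        }
    }
}
```

Keeping the data directory separate from the index directory also means the walk never touches Lucene's own index files.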