Java: no segments* file found

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original: http://stackoverflow.com/questions/3802021/

Date: 2020-08-14 05:10:29  Source: igfitidea

no segments* file found

java lucene nutch

Asked by crazyaboutliv

I need to access a Lucene index (created by crawling several web pages with Nutch), but opening it fails with the error shown above:

java.io.FileNotFoundException: no segments* file found in org.apache.lucene.store.FSDirectory@/home/<path>: files:
    at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:516)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:185)
    at org.apache.lucene.index.IndexReader.open(IndexReader.java:148)
    at DictionaryGenerator.generateDict(DictionaryGenerator.java:24)
    at DictionaryGenerator.main(DictionaryGenerator.java:56)

I searched online, but the causes I found did not match my situation. The fact that the message shows the files (the path) probably means that the directory is not empty.
Thanks

Accepted answer by Yuval F

Basically, the error message says that Lucene did not find the proper files in the index directory. I suggest checking the following:

  1. Verify that the path of the index directory is what you think it is.
  2. Do the Nutch and Lucene versions you use match? The error may stem from a version difference.
  3. Is there a permissions issue? Can you read the files in the directory?
  4. Try looking at the index using Luke. If you cannot, the index is probably corrupted.
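
Checks 1 and 3 above can be sketched with the plain JDK, without touching Lucene at all. The class name `IndexDirCheck` and its result strings are made up for illustration; the only thing taken from the error message is that the reader wants a file whose name starts with `segments`. Treat this as a rough diagnostic, not as what Lucene does internally.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Plain-JDK sanity checks for an index directory (hypothetical helper, no Lucene calls). */
final class IndexDirCheck {

    /** Returns a one-line diagnosis for the given directory. */
    static String diagnose(Path dir) {
        if (!Files.exists(dir)) {
            return "MISSING: path does not exist";
        }
        if (!Files.isDirectory(dir)) {
            return "NOT_A_DIRECTORY: path is not a directory";
        }
        if (!Files.isReadable(dir)) {
            return "UNREADABLE: check permissions";
        }
        List<String> segments = segmentsFiles(dir);
        if (segments.isEmpty()) {
            return "NO_SEGMENTS: no segments* file here";
        }
        return "OK: found " + segments;
    }

    /** Lists the names of entries in dir that match the glob segments*, sorted. */
    static List<String> segmentsFiles(Path dir) {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, "segments*")) {
            for (Path p : stream) {
                names.add(p.getFileName().toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(names);
        return names;
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java IndexDirCheck <index-dir>");
            return;
        }
        // Print the absolute path so a wrong relative path is easy to spot.
        Path dir = Paths.get(args[0]).toAbsolutePath().normalize();
        System.out.println(dir + " -> " + diagnose(dir));
    }
}
```

Run it with the same path string that the failing code passes to `IndexReader.open`. If the result is `NO_SEGMENTS`, the path most likely points at the wrong directory or the index was never committed.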

If none of these help, please post the indexing part of the code.

Answer by nir

Another hint: I had the same error and found that I had not closed the IndexWriter after creating the index, which proved very unforgiving. My index directory contained some .lock files but no segments or segments.gen files, which are what the reader looks for. See here#3 for details

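
The symptom described in this answer (lock files present, no segments files) can be checked for mechanically. The sketch below is only a heuristic and uses no Lucene calls; the class name `UncommittedIndexCheck` is hypothetical, and it assumes the lock files end in `.lock`, as they did in the answer above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Heuristic check for an index whose writer was never closed (hypothetical helper). */
final class UncommittedIndexCheck {

    /** True if at least one entry in dir matches the glob pattern. */
    static boolean hasMatch(Path dir, String glob) {
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            return stream.iterator().hasNext();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Lock file present but no segments* file: the writer probably never committed or closed. */
    static boolean looksUncommitted(Path dir) {
        return hasMatch(dir, "*.lock") && !hasMatch(dir, "segments*");
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java UncommittedIndexCheck <index-dir>");
            return;
        }
        Path dir = Paths.get(args[0]);
        System.out.println(looksUncommitted(dir)
                ? "Lock files but no segments* file: was the IndexWriter closed?"
                : "No sign of an unclosed writer");
    }
}
```

If the check fires, the fix is on the indexing side: make sure the code that builds the index closes its IndexWriter, for example in a `finally` block, and then rebuild the index.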