Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/22206095/

Retrieved: 2020-08-13 14:18:36. Source: igfitidea

Error in creating the StanfordCoreNLP object

Tags: java, maven, jar, nlp, stanford-nlp

Asked by Lohath Unique

I have downloaded and installed the required jar files from http://nlp.stanford.edu/software/corenlp.shtml#Download.

I have included the following five jar files:

stanford-postagger.jar

stanford-postagger-3.3.1.jar

stanford-postagger-3.3.1.jar-javadoc.jar

stanford-postagger-3.3.1.jar-src.jar

stanford-corenlp-3.3.1.jar

The code is:

import java.util.Properties;

import edu.stanford.nlp.pipeline.StanfordCoreNLP;

public class lemmafirst {

    protected StanfordCoreNLP pipeline;

    public lemmafirst() {
        // Create StanfordCoreNLP object properties, with POS tagging
        // (required for lemmatization), and lemmatization
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma");

        /*
         * This is a pipeline that takes in a string and returns various analyzed linguistic forms.
         * The String is tokenized via a tokenizer (such as PTBTokenizerAnnotator),
         * and then other sequence model style annotation can be used to add things like lemmas,
         * POS tags, and named entities. These are returned as a list of CoreLabels.
         * Other analysis components build and store parse trees, dependency graphs, etc.
         *
         * This class is designed to apply multiple Annotators to an Annotation.
         * The idea is that you first build up the pipeline by adding Annotators,
         * and then you take the objects you wish to annotate and pass them in and
         * get in return a fully annotated object.
         *
         * StanfordCoreNLP loads a lot of models, so you probably
         * only want to do this once per execution
         */
        // The exception shown below is thrown on this line.
        this.pipeline = new StanfordCoreNLP(props);
    }
}

My problem is in creating the pipeline.

The error that I got is:

Exception in thread "main" java.lang.RuntimeException: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.create(StanfordCoreNLP.java:563)
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:81)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:262)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:129)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:125)
    at lemmafirst.<init>(lemmafirst.java:39)
    at lemmafirst.main(lemmafirst.java:83)
Caused by: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:758)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:289)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:253)
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.loadModel(POSTaggerAnnotator.java:88)
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.<init>(POSTaggerAnnotator.java:76)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.create(StanfordCoreNLP.java:561)
    ... 6 more
Caused by: java.io.IOException: Unable to resolve "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger" as either class path, filename or URL
    at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:434)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:753)
    ... 11 more

Can anyone please correct the errors? Thank you

Answered by Christopher Schröder

The exception is thrown because the POS tagger model is missing: the root cause in the stack trace is that english-left3words-distsim.tagger cannot be resolved. This happens because there are downloadable versions with and without the model files.

Either add stanford-postagger-full-3.3.1.jar, which can be found on the following page (stanford-postagger-full-2014-01-04.zip): http://nlp.stanford.edu/software/tagger.shtml.

Or do the same for the whole CoreNLP package (stanford-corenlp-full....jar): http://nlp.stanford.edu/software/corenlp.shtml (then you can drop all the postagger dependencies too, since they are included in CoreNLP).

In case you only want to add the model files, look at Maven Central and download "stanford-corenlp-3.3.1-models.jar".
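Before changing jars, it can help to confirm whether the model named in the stack trace is actually visible to the JVM. The following is a minimal sketch that uses only the standard library; the class name ModelCheck and the method isOnClasspath are illustrative names, not part of CoreNLP, and the resource path is copied from the "Unable to resolve" line of the stack trace.

```java
import java.net.URL;

public class ModelCheck {

    // Resource path copied from the "Unable to resolve" line of the stack trace.
    static final String TAGGER_MODEL =
        "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger";

    // Hypothetical helper: returns true if the resource can be found on the classpath.
    static boolean isOnClasspath(String resourcePath) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        URL url = loader.getResource(resourcePath);
        return url != null;
    }

    public static void main(String[] args) {
        if (isOnClasspath(TAGGER_MODEL)) {
            System.out.println("Tagger model found on the classpath.");
        } else {
            System.out.println("Tagger model missing: add a models jar to the classpath.");
        }
    }
}
```

If this is run with the same classpath as the failing program and reports the model as missing, adding one of the jars that contains the model files should resolve the lookup.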

Answered by Sruthi Poddutur

An easier way to add those model files is to add the following dependencies to your pom.xml and let Maven manage them for you:

<dependency>
  <groupId>edu.stanford.nlp</groupId>
  <artifactId>stanford-corenlp</artifactId>
  <version>3.6.0</version>
</dependency>
<dependency>
  <groupId>edu.stanford.nlp</groupId>
  <artifactId>stanford-corenlp</artifactId>
  <version>3.6.0</version>
  <classifier>models</classifier> <!--  will get the dependent model jars -->
</dependency>

Answered by raviraja

If anyone is looking for Gradle dependencies, add the following under dependencies:

 compile group: 'edu.stanford.nlp', name: 'stanford-corenlp', version: '3.9.1'
 compile group: 'edu.stanford.nlp', name: 'stanford-corenlp', version: '3.9.1', classifier: 'models'