Java: executing and testing the Stanford CoreNLP example
Original source: http://stackoverflow.com/questions/20359346/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me):
StackOverFlow
Executing and testing stanford core nlp example
Asked by user123
I downloaded the Stanford CoreNLP package and tried to test it on my machine.
Using command: java -cp "*" -mx1g edu.stanford.nlp.sentiment.SentimentPipeline -file input.txt
I got the sentiment result in the form of positive or negative. input.txt contains the sentence to be tested.
One more command: java -cp stanford-corenlp-3.3.0.jar;stanford-corenlp-3.3.0-models.jar;xom.jar;joda-time.jar -Xmx600m edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,parse -file input.txt
When executed, it gives the following lines:
H:\Drive E\Stanford\stanfor-corenlp-full-2013~>java -cp stanford-corenlp-3.3.0.jar;stanford-corenlp-3.3.0-models.jar;xom.jar;joda-time.jar -Xmx600m edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,parse -file input.txt
Adding annotator tokenize
Adding annotator ssplit
Adding annotator pos
Reading POS tagger model from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [36.6 sec].
Adding annotator lemma
Adding annotator parse
Loading parser from serialized file edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz ... done [13.7 sec].
Ready to process: 1 files, skipped 0, total 1
Processing file H:\Drive E\Stanford\stanfor-corenlp-full-2013~\input.txt ... writing to H:\Drive E\Stanford\stanfor-corenlp-full-2013~\input.txt.xml {
Annotating file H:\Drive E\Stanford\stanfor-corenlp-full-2013~\input.txt [13.681 seconds]
} [20.280 seconds]
Processed 1 documents
Skipped 0 documents, error annotating 0 documents
Annotation pipeline timing information:
PTBTokenizerAnnotator: 0.4 sec.
WordsToSentencesAnnotator: 0.0 sec.
POSTaggerAnnotator: 1.8 sec.
MorphaAnnotator: 2.2 sec.
ParserAnnotator: 9.1 sec.
TOTAL: 13.6 sec. for 10 tokens at 0.7 tokens/sec.
Pipeline setup: 58.2 sec.
Total time for StanfordCoreNLP pipeline: 79.6 sec.
H:\Drive E\Stanford\stanfor-corenlp-full-2013~>
I could not understand this output; it contains no informative result.
I found one example at: stanford core nlp java output
import java.io.*;
import java.util.*;

import edu.stanford.nlp.io.*;
import edu.stanford.nlp.ling.*;
import edu.stanford.nlp.pipeline.*;
import edu.stanford.nlp.trees.*;
import edu.stanford.nlp.util.*;

public class StanfordCoreNlpDemo {

    public static void main(String[] args) throws IOException {
        // args[0]: input file, args[1]: text output file, args[2]: XML output file (all optional)
        PrintWriter out;
        if (args.length > 1) {
            out = new PrintWriter(args[1]);
        } else {
            out = new PrintWriter(System.out);
        }
        PrintWriter xmlOut = null;
        if (args.length > 2) {
            xmlOut = new PrintWriter(args[2]);
        }

        StanfordCoreNLP pipeline = new StanfordCoreNLP();
        Annotation annotation;
        if (args.length > 0) {
            annotation = new Annotation(IOUtils.slurpFileNoExceptions(args[0]));
        } else {
            annotation = new Annotation("Kosgi Santosh sent an email to Stanford University. He didn't get a reply.");
        }

        pipeline.annotate(annotation);
        pipeline.prettyPrint(annotation, out);
        if (xmlOut != null) {
            pipeline.xmlPrint(annotation, xmlOut);
        }

        // An Annotation is a Map and you can get and use the various analyses individually.
        // For instance, this gets the parse tree of the first sentence in the text.
        List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);
        if (sentences != null && sentences.size() > 0) {
            CoreMap sentence = sentences.get(0);
            Tree tree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
            out.println();
            out.println("The first sentence parsed is:");
            tree.pennPrint(out);
        }

        // These PrintWriters do not auto-flush, so flush them before main returns.
        out.flush();
        if (xmlOut != null) {
            xmlOut.flush();
        }
    }
}
I tried to execute it in NetBeans with the necessary libraries included, but it always either hangs partway through or throws Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
This happens even though I set the memory to be allocated in the property/run/VM box.
Any idea how I can run the above Java example from the command line?
I want to get the sentiment score of the example.
UPDATE
Output of: java -cp "*" -mx1g edu.stanford.nlp.sentiment.SentimentPipeline -file input.txt
Output of: java -cp stanford-corenlp-3.3.0.jar;stanford-corenlp-3.3.0-models.jar;xom.jar;joda-time.jar -Xmx600m edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,parse -file input.txt
Accepted answer by lababidi
You need to add the "sentiment" annotator to the list of annotators:
-annotators tokenize,ssplit,pos,lemma,parse,sentiment
This will add a "sentiment" property to each sentence node in your XML.
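To pull that value back out of the generated file, something along these lines might work. It is only a sketch using the JDK's DOM parser: it assumes the attribute is literally named "sentiment" and sits on each sentence element, and both the class name SentimentXmlReader and the trimmed sample XML are made up for illustration, not copied from real CoreNLP output.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class SentimentXmlReader {

    // Collects the "sentiment" attribute of every <sentence> element.
    // Assumption: the attribute name and element name match the XML you get.
    public static List<String> readSentiments(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList sentences = doc.getElementsByTagName("sentence");
        List<String> result = new ArrayList<>();
        for (int i = 0; i < sentences.getLength(); i++) {
            Element sentence = (Element) sentences.item(i);
            // Sentences without the attribute are skipped rather than reported as "".
            if (sentence.hasAttribute("sentiment")) {
                result.add(sentence.getAttribute("sentiment"));
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical, heavily trimmed stand-in for input.txt.xml.
        String xml = "<root><document><sentences>"
                + "<sentence id=\"1\" sentiment=\"Negative\"><tokens/></sentence>"
                + "<sentence id=\"2\" sentiment=\"Positive\"><tokens/></sentence>"
                + "</sentences></document></root>";
        for (String sentiment : readSentiments(xml)) {
            System.out.println(sentiment);
        }
    }
}
```

For a real run you would read input.txt.xml from disk instead of the inline string, for example by passing a java.io.File to builder.parse.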
Answer by Elliott Frisch
Per the example here, you need to run the Sentiment Analysis:
java -cp "*" -mx5g edu.stanford.nlp.sentiment.SentimentPipeline -file input.txt
Apparently this is a memory-intensive operation, so it may not complete with only 1 gigabyte of heap. Then you can use the "Evaluation Tool":
java -cp "*" edu.stanford.nlp.sentiment.Evaluate edu/stanford/nlp/models/sentiment/sentiment.ser.gz input.txt
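Since the question mentions an OutOfMemoryError even after setting memory in the IDE, it may be worth confirming how much heap the JVM actually received. A minimal check, using only the standard Runtime API (the class name HeapCheck is made up for this sketch):

```java
public class HeapCheck {

    // Maximum heap the JVM will attempt to use, in megabytes.
    public static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

Run it with the same memory flag you give CoreNLP, e.g. java -Xmx5g HeapCheck. If the number printed is far below what you asked for, the setting is probably not reaching the JVM that runs your program. The reported value can be somewhat lower than the requested one, so treat it as a rough check.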
Answer by saganas
You can do the following in your code:
String text = "I am feeling very sad and frustrated.";
Properties props = new Properties();
props.setProperty("annotators", "tokenize, ssplit, pos, lemma, parse, sentiment");
StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
<...>
Annotation annotation = pipeline.process(text);
List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);
for (CoreMap sentence : sentences) {
    String sentiment = sentence.get(SentimentCoreAnnotations.SentimentClass.class);
    System.out.println(sentiment + "\t" + sentence);
}
It prints the sentiment of each sentence followed by the sentence itself. For "I am feeling very sad and frustrated." the output is:
Negative I am feeling very sad and frustrated.
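The question asks for a sentiment score rather than a label. One way to get a number from the label string is a simple lookup, sketched below. The five labels and their 0 to 4 ordering are an assumption based on the usual five-class sentiment scheme (only "Negative" and "Very positive" actually appear in this thread), and the class SentimentScore is a hypothetical helper, not part of CoreNLP.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class SentimentScore {

    // Assumed label order, from most negative (0) to most positive (4).
    private static final List<String> LABELS = Arrays.asList(
            "very negative", "negative", "neutral", "positive", "very positive");

    // Returns 0..4 for a known label, or -1 if the label is null or not recognised.
    public static int toScore(String label) {
        if (label == null) {
            return -1;
        }
        return LABELS.indexOf(label.trim().toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(toScore("Negative"));      // prints 1
        System.out.println(toScore("Very positive")); // prints 4
    }
}
```

In the loop above you would call SentimentScore.toScore(sentiment) on each sentence's label. The lookup is case-insensitive so small differences in capitalisation do not matter.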
Answer by 53by97
This works fine for me:
Maven dependencies:
<dependency>
    <groupId>edu.stanford.nlp</groupId>
    <artifactId>stanford-corenlp</artifactId>
    <version>3.5.2</version>
    <classifier>models</classifier>
</dependency>
<dependency>
    <groupId>edu.stanford.nlp</groupId>
    <artifactId>stanford-corenlp</artifactId>
    <version>3.5.2</version>
</dependency>
<dependency>
    <groupId>edu.stanford.nlp</groupId>
    <artifactId>stanford-parser</artifactId>
    <version>3.5.2</version>
</dependency>
Java code:
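import java.util.List;
import java.util.Properties;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.sentiment.SentimentCoreAnnotations;
import edu.stanford.nlp.util.CoreMap;
// The class name SentimentDemo is a placeholder; the original showed only the main method.
public class SentimentDemo {
    public static void main(String[] args) {
        String text = "This World is an amazing place";
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, parse, sentiment");
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        Annotation annotation = pipeline.process(text);
        List<CoreMap> sentences = annotation.get(CoreAnnotations.SentencesAnnotation.class);
        for (CoreMap sentence : sentences) {
            String sentiment = sentence.get(SentimentCoreAnnotations.SentimentClass.class);
            System.out.println(sentiment + "\t" + sentence);
        }
    }
}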
Results:
Very positive This World is an amazing place