如何在我的 Java 程序中集成 stanford 解析器软件?

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/19429106/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-08-12 17:04:26  来源:igfitidea点击:

How can I integrate stanford parser software in my java program?

javaparsingjungjgraph

提问by Vishnu Kumar

I have to develop a project in java that uses a Stanford parser to separate the sentences and has to generate a graph that shows the relation between the words in a sentence. for example: Ohio is located in America. Output:

我必须在 java 中开发一个项目,该项目使用斯坦福解析器来分隔句子,并且必须生成一个图表来显示句子中单词之间的关系。例如:俄亥俄州位于美国。输出:

output

输出

the image shows the graph. But the output need not be the same but it has to show relation between the words in graph form. The graph can be generated using Jgraph, Jung. But initially I have to integrate the parser software into my program. So how can I integrate a parser??

图像显示了图表。但输出不必相同,但必须以图形形式显示单词之间的关系。该图可以使用 Jgraph, Jung 生成。但最初我必须将解析器软件集成到我的程序中。那么如何集成解析器呢?

采纳答案by user278064

  • Download the Stanford Parser zip:
  • Add jarsto the build pathof your project (include the model files)
  • Use the following snippet to parse sentences and return the constituency trees:(dependency trees can be built by inspecting the structure of the tree)

    import java.io.StringReader;
    import java.util.List;
    
    import edu.stanford.nlp.ling.CoreLabel;
    import edu.stanford.nlp.process.TokenizerFactory;
    import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
    import edu.stanford.nlp.process.CoreLabelTokenFactory;
    import edu.stanford.nlp.process.PTBTokenizer;
    import edu.stanford.nlp.process.Tokenizer;
    import edu.stanford.nlp.trees.Tree;
    
    class Parser {
    
        private final static String PCG_MODEL = "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz";        
    
        private final TokenizerFactory<CoreLabel> tokenizerFactory = PTBTokenizer.factory(new CoreLabelTokenFactory(), "invertible=true");
    
        private final LexicalizedParser parser = LexicalizedParser.loadModel(PCG_MODEL);
    
        public Tree parse(String str) {                
            List<CoreLabel> tokens = tokenize(str);
            Tree tree = parser.apply(tokens);
            return tree;
        }
    
        private List<CoreLabel> tokenize(String str) {
            Tokenizer<CoreLabel> tokenizer =
                tokenizerFactory.getTokenizer(
                    new StringReader(str));    
            return tokenizer.tokenize();
        }
    
        public static void main(String[] args) { 
            String str = "My dog also likes eating sausage.";
            Parser parser = new Parser(); 
            Tree tree = parser.parse(str);  
    
            List<Tree> leaves = tree.getLeaves();
            // Print words and Pos Tags
            for (Tree leaf : leaves) { 
                Tree parent = leaf.parent(tree);
                System.out.print(leaf.label().value() + "-" + parent.label().value() + " ");
            }
            System.out.println();               
        }
    }
    
  • 下载斯坦福解析器 zip
  • jars添加到项目的构建路径中(包括模型文件)
  • 使用以下片段解析句子并返回选区树:(可以通过检查树的结构来构建依赖树)

    import java.io.StringReader;
    import java.util.List;
    
    import edu.stanford.nlp.ling.CoreLabel;
    import edu.stanford.nlp.process.TokenizerFactory;
    import edu.stanford.nlp.parser.lexparser.LexicalizedParser;
    import edu.stanford.nlp.process.CoreLabelTokenFactory;
    import edu.stanford.nlp.process.PTBTokenizer;
    import edu.stanford.nlp.process.Tokenizer;
    import edu.stanford.nlp.trees.Tree;
    
    class Parser {
    
        private final static String PCG_MODEL = "edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz";        
    
        private final TokenizerFactory<CoreLabel> tokenizerFactory = PTBTokenizer.factory(new CoreLabelTokenFactory(), "invertible=true");
    
        private final LexicalizedParser parser = LexicalizedParser.loadModel(PCG_MODEL);
    
        public Tree parse(String str) {                
            List<CoreLabel> tokens = tokenize(str);
            Tree tree = parser.apply(tokens);
            return tree;
        }
    
        private List<CoreLabel> tokenize(String str) {
            Tokenizer<CoreLabel> tokenizer =
                tokenizerFactory.getTokenizer(
                    new StringReader(str));    
            return tokenizer.tokenize();
        }
    
        public static void main(String[] args) { 
            String str = "My dog also likes eating sausage.";
            Parser parser = new Parser(); 
            Tree tree = parser.parse(str);  
    
            List<Tree> leaves = tree.getLeaves();
            // Print words and Pos Tags
            for (Tree leaf : leaves) { 
                Tree parent = leaf.parent(tree);
                System.out.print(leaf.label().value() + "-" + parent.label().value() + " ");
            }
            System.out.println();               
        }
    }