How can I extract triples (subject, predicate, and object) from simple and complex sentences, or from a document of sentences, using Stanford CoreNLP in Java?

Problem description

    import edu.stanford.nlp.ie.util.RelationTriple;
    import edu.stanford.nlp.simple.*;

    public class OpenIEDemo {
        public static void main(String[] args) throws Exception {

            // Create a CoreNLP document
            Document doc = new Document("He received his academic diploma from the Swiss federal polytechnic school (later the Eidgenössische Technische Hochschule, ETH) in Zürich in 1900.");

            // Iterate over the sentences in the document
            for (Sentence sent : doc.sentences()) {
                // Iterate over the OpenIE triples extracted from the sentence
                for (RelationTriple triple : sent.openieTriples()) {
                    // Print confidence, subject, predicate, and object (lemmatized), tab-separated
                    System.out.println(triple.confidence + "\t" + "[Subject]" +
                        triple.subjectLemmaGloss() + "\t" + "[Predicate]" +
                        triple.relationLemmaGloss() + "\t" + "[Object]" +
                        triple.objectLemmaGloss());
                }
            }
        }
    }

Output:
1.0 [Subject]he [Predicate]receive [Object]he Diploma
1.0 [Subject]he [Predicate]receive [Object]he Academic Diploma
1.0 [Subject]swiss Federal Polytechnic School [Predicate]in [Object]Zürich

But the expected output is:
1.0 [Subject]He [Predicate]received from the Swiss federal polytechnic school in Zürich in 1900 [Object]academic diploma
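
The first two triples in the actual output share a subject and predicate and differ only in how much of the object they keep. One possible post-processing step, sketched below, collapses such triples down to the one with the longest object. `Triple`, `keepLongestObject`, and `TripleDedupSketch` are hypothetical names used for illustration and are not part of CoreNLP; `Triple` is a plain stand-in for `RelationTriple` so that the sketch runs without the library. Note that this only filters triples OpenIE has already returned; it does not by itself produce the single expected triple.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TripleDedupSketch {

    // Hypothetical stand-in for CoreNLP's RelationTriple; it holds only the
    // four values that the question's println call uses.
    public static final class Triple {
        public final double confidence;
        public final String subject;
        public final String relation;
        public final String object;

        public Triple(double confidence, String subject, String relation, String object) {
            this.confidence = confidence;
            this.subject = subject;
            this.relation = relation;
            this.object = object;
        }

        // Number of whitespace-separated tokens in the object.
        int objectTokens() {
            return object.trim().split("\\s+").length;
        }

        // Same field order and labels as the question's output, tab-separated.
        public String format() {
            return confidence + "\t[Subject]" + subject
                + "\t[Predicate]" + relation
                + "\t[Object]" + object;
        }
    }

    // For each (subject, relation) pair, keep the triple whose object has the
    // most tokens. Pairs stay in first-seen order; ties keep the earlier triple.
    public static List<Triple> keepLongestObject(List<Triple> triples) {
        Map<String, Triple> best = new LinkedHashMap<>();
        for (Triple t : triples) {
            String key = t.subject + "\t" + t.relation;
            Triple current = best.get(key);
            if (current == null || t.objectTokens() > current.objectTokens()) {
                best.put(key, t);
            }
        }
        return new ArrayList<>(best.values());
    }

    public static void main(String[] args) {
        // The three triples from the output above, entered by hand.
        // \u00fc is the letter u with umlaut, escaped so the file compiles
        // under any source encoding.
        List<Triple> triples = Arrays.asList(
            new Triple(1.0, "he", "receive", "he Diploma"),
            new Triple(1.0, "he", "receive", "he Academic Diploma"),
            new Triple(1.0, "swiss Federal Polytechnic School", "in", "Z\u00fcrich"));

        for (Triple t : keepLongestObject(triples)) {
            System.out.println(t.format());
        }
    }
}
```

With real extractions, the same grouping could presumably be applied to the values returned by `subjectLemmaGloss()`, `relationLemmaGloss()`, and `objectLemmaGloss()` on each `RelationTriple`.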

Tags: java, stanford-nlp, triples

Solution

