How to find the grammatical relations of a noun phrase using Stanford Parser or Stanford CoreNLP



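I am using Stanford CoreNLP to find the grammatical relations of noun phrases.

Here is an example:

Consider the sentence "The fitness room was dirty."

I have managed to identify "The fitness room" as my target noun phrase. I am now looking for a way to find that the adjective "dirty" has a relationship to "The fitness room" and not only to "room".

Sample code: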

import java.util.Collection;
import java.util.List;
import java.util.Properties;

import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.trees.*;
import edu.stanford.nlp.trees.tregex.TregexMatcher;
import edu.stanford.nlp.trees.tregex.TregexPattern;
import edu.stanford.nlp.util.CoreMap;

private static void doSentenceTest() {
    Properties props = new Properties();
    props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
    StanfordCoreNLP stanford = new StanfordCoreNLP(props);
    TregexPattern npPattern = TregexPattern.compile("@NP");
    String text = "The fitness room was dirty.";

    // create an empty Annotation just with the given text
    Annotation document = new Annotation(text);
    // run all Annotators on this text
    stanford.annotate(document);
    List<CoreMap> sentences = document.get(CoreAnnotations.SentencesAnnotation.class);
    for (CoreMap sentence : sentences) {
        Tree sentenceTree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
        TregexMatcher matcher = npPattern.matcher(sentenceTree);
        while (matcher.find()) {
            // this tree should contain "The fitness room"
            Tree nounPhraseTree = matcher.getMatch();
            // Question: how do I find that "dirty" has a relationship to the nounPhraseTree?
        }
        // Output dependency tree
        TreebankLanguagePack tlp = new PennTreebankLanguagePack();
        GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
        GrammaticalStructure gs = gsf.newGrammaticalStructure(sentenceTree);
        Collection<TypedDependency> tdl = gs.typedDependenciesCollapsed();
        System.out.println("typedDependencies: " + tdl);
    }
}

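I ran Stanford CoreNLP on the sentence and extracted its root Tree object. On that tree I used a TregexPattern and a TregexMatcher to extract the noun phrases, which gives me a subtree containing the actual noun phrase. What I want now is to find the modifiers of that noun phrase in the original sentence.

The typedDependencies output gives me the following:

typedDependencies: [det(room-3, The-1), nn(room-3, fitness-2), nsubj(dirty-5, room-3), cop(dirty-5, was-4), root(ROOT-0, dirty-5)]

I can see nsubj(dirty-5, room-3), but I do not have the complete noun phrase as the governor.

I hope I was clear enough. Thanks for your help.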

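The typed dependencies do show that the adjective "dirty" applies to "the fitness room":

det(room-3, The-1)
nn(room-3, fitness-2)
nsubj(dirty-5, room-3)
cop(dirty-5, was-4)
root(ROOT-0, dirty-5)

The "nn" tag is the noun compound modifier relation, indicating that "fitness" is a modifier of "room". Dependencies are stated between head words, so nsubj(dirty-5, room-3) names only "room", the head of the noun phrase; the rest of the phrase is attached to that head through det and nn.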

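To make that concrete, here is a minimal sketch that rebuilds the full subject phrase from the printed dependencies by starting at the nsubj dependent and collecting everything it transitively governs. It deliberately uses only the Java standard library and works on the printed strings; the class NounPhraseFromDependencies, the Dep holder, and the method names are all hypothetical and not part of CoreNLP. Note that collecting every dependent is an assumption that fits this sentence; for longer phrases it would also pull in things like relative clauses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NounPhraseFromDependencies {

    /** Hypothetical holder for one typed dependency, e.g. nn(room-3, fitness-2). */
    static final class Dep {
        final String rel;
        final String govWord;
        final int govIdx;
        final String depWord;
        final int depIdx;

        Dep(String rel, String govWord, int govIdx, String depWord, int depIdx) {
            this.rel = rel;
            this.govWord = govWord;
            this.govIdx = govIdx;
            this.depWord = depWord;
            this.depIdx = depIdx;
        }
    }

    // Matches the printed form rel(governor-i, dependent-j).
    private static final Pattern DEP_PATTERN =
            Pattern.compile("(\\w+)\\((.+)-(\\d+), (.+)-(\\d+)\\)");

    /** Parses one printed dependency string into a Dep. */
    static Dep parse(String line) {
        Matcher m = DEP_PATTERN.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a typed dependency: " + line);
        }
        return new Dep(m.group(1), m.group(2), Integer.parseInt(m.group(3)),
                m.group(4), Integer.parseInt(m.group(5)));
    }

    /** Returns the head word plus everything it transitively governs, in sentence order. */
    static String phraseHeadedBy(int headIdx, String headWord, List<Dep> deps) {
        TreeMap<Integer, String> tokens = new TreeMap<>();
        tokens.put(headIdx, headWord);
        collect(headIdx, deps, tokens);
        return String.join(" ", tokens.values());
    }

    private static void collect(int govIdx, List<Dep> deps, TreeMap<Integer, String> tokens) {
        for (Dep d : deps) {
            if (d.govIdx == govIdx && !tokens.containsKey(d.depIdx)) {
                tokens.put(d.depIdx, d.depWord);
                collect(d.depIdx, deps, tokens);
            }
        }
    }

    /** For every nsubj relation, pairs the predicate with its expanded subject phrase. */
    static List<String> subjectPhrases(List<Dep> deps) {
        List<String> out = new ArrayList<>();
        for (Dep d : deps) {
            if (d.rel.equals("nsubj")) {
                out.add(d.govWord + " <- " + phraseHeadedBy(d.depIdx, d.depWord, deps));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Dep> deps = new ArrayList<>();
        for (String line : Arrays.asList(
                "det(room-3, The-1)", "nn(room-3, fitness-2)", "nsubj(dirty-5, room-3)",
                "cop(dirty-5, was-4)", "root(ROOT-0, dirty-5)")) {
            deps.add(parse(line));
        }
        System.out.println(subjectPhrases(deps)); // prints [dirty <- The fitness room]
    }
}
```

In real code you would probably read the governor and dependent from the TypedDependency objects rather than parse their printed form; the string parsing here only keeps the sketch self-contained.
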
You can find detailed information about the dependency labels in the Stanford Dependencies manual.
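
Going the other way, from a noun phrase you already matched to the dependencies that concern it, one possible approach is to compare token positions: any dependency with exactly one end inside the phrase's token span links the phrase to the rest of the sentence. The sketch below assumes you already know the 1-based span of the matched phrase (tokens 1 to 3 for "The fitness room"); how you obtain it from the subtree is left out. NounPhraseRelations, Dep, and externalRelations are hypothetical names, not CoreNLP API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NounPhraseRelations {

    /** Hypothetical holder for one typed dependency, e.g. nsubj(dirty-5, room-3). */
    static final class Dep {
        final String rel;
        final String govWord;
        final int govIdx;
        final String depWord;
        final int depIdx;

        Dep(String rel, String govWord, int govIdx, String depWord, int depIdx) {
            this.rel = rel;
            this.govWord = govWord;
            this.govIdx = govIdx;
            this.depWord = depWord;
            this.depIdx = depIdx;
        }
    }

    /**
     * Returns the dependencies that cross the boundary of the token span
     * [start, end] (1-based, inclusive), with the inside end replaced by "NP".
     */
    static List<String> externalRelations(int start, int end, List<Dep> deps) {
        List<String> out = new ArrayList<>();
        for (Dep d : deps) {
            boolean govInside = d.govIdx >= start && d.govIdx <= end;
            boolean depInside = d.depIdx >= start && d.depIdx <= end;
            if (depInside && !govInside) {
                // A word outside the phrase governs a word inside it.
                out.add(d.rel + "(" + d.govWord + "-" + d.govIdx + ", NP)");
            } else if (govInside && !depInside) {
                // A word inside the phrase governs a word outside it.
                out.add(d.rel + "(NP, " + d.depWord + "-" + d.depIdx + ")");
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Dep> deps = Arrays.asList(
                new Dep("det", "room", 3, "The", 1),
                new Dep("nn", "room", 3, "fitness", 2),
                new Dep("nsubj", "dirty", 5, "room", 3),
                new Dep("cop", "dirty", 5, "was", 4),
                new Dep("root", "ROOT", 0, "dirty", 5));
        // Span 1..3 covers "The fitness room".
        System.out.println(externalRelations(1, 3, deps)); // prints [nsubj(dirty-5, NP)]
    }
}
```

The det and nn relations stay inside the span and are filtered out, so what remains is the relation between the whole phrase and "dirty".
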

Change the method call. Replace

Collection<TypedDependency> tdl = gs.typedDependenciesCollapsed();

with

Collection<TypedDependency> tdl = gs.typedDependenciesCCprocessed();

or

Collection<TypedDependency> tdl = gs.allDependencies();
