Configuring a separate models JAR in Stanford NLP



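I have implemented logic that uses Stanford NLP to extract locations from a given English sentence. I am using the following JARs: stanford-corenlp-3.2.0.jar and stanford-corenlp-3.2.0-models.jar.

The logic I wrote is as follows: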

    public static edu.stanford.nlp.pipeline.StanfordCoreNLP snlp;

    /**
     * @see ServletContextListener#contextInitialized(ServletContextEvent)
     */
    public void contextInitialized(ServletContextEvent arg0) {
        Properties props = new Properties();
        props.put("annotators", "tokenize,ssplit,pos,lemma,parse,ner,dcoref");
        // Assign to the static field; a local variable named snlp would shadow it and leave the field null.
        snlp = new StanfordCoreNLP(props);
    }

However, because of a case-sensitivity issue, I was told to use stanford-corenlp-caseless-2015-04-20-models.jar instead of stanford-corenlp-3.2.0.jar. With the code above, the JAR loaded by default is stanford-corenlp-3.2.0-models.jar.

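Now I want to configure the pipeline to use the following models JAR instead: stanford-corenlp-caseless-2015-04-20-models.jar. Please guide me on how to configure this in Java code.

I tried Gabor's solution, but I get the following exception: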

SEVERE: Exception sending context initialized event to listener instance of class servlets.NLP_initializer
java.lang.RuntimeException: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
    at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:493)
    at edu.stanford.nlp.pipeline.AnnotatorPool.get(AnnotatorPool.java:81)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.construct(StanfordCoreNLP.java:260)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:127)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP.<init>(StanfordCoreNLP.java:123)
    at servlets.NLP_initializer.contextInitialized(NLP_initializer.java:34)
    at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4887)
    at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5381)
    at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150)
    at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1559)
    at org.apache.catalina.core.ContainerBase$StartChild.call(ContainerBase.java:1549)
    at java.util.concurrent.FutureTask.run(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
Caused by: edu.stanford.nlp.io.RuntimeIOException: Unrecoverable error while loading a tagger model
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:749)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:283)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.<init>(MaxentTagger.java:247)
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.loadModel(POSTaggerAnnotator.java:78)
    at edu.stanford.nlp.pipeline.POSTaggerAnnotator.<init>(POSTaggerAnnotator.java:62)
    at edu.stanford.nlp.pipeline.StanfordCoreNLP$4.create(StanfordCoreNLP.java:491)
    ... 14 more
Caused by: java.io.IOException: Unable to resolve "edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger" as either class path, filename or URL
    at edu.stanford.nlp.io.IOUtils.getInputStreamFromURLOrClasspathOrFileSystem(IOUtils.java:419)
    at edu.stanford.nlp.tagger.maxent.MaxentTagger.readModelAndInit(MaxentTagger.java:744)
    ... 19 more

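See http://nlp.stanford.edu/software/corenlp.shtml#caseless

Copied from the documentation:

It is possible to run StanfordCoreNLP with tagger, parser, and NER models that ignore capitalization. To do this, download the caseless models package. Be sure to also include the path to the caseless models JAR in the -cp classpath flag. Then set the properties that point to these models as follows: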

-pos.model edu/stanford/nlp/models/pos-tagger/english-caseless-left3words-distsim.tagger

-parse.model edu/stanford/nlp/models/lexparser/englishPCFG.caseless.ser.gz

-ner.model edu/stanford/nlp/models/ner/english.all.3class.caseless.distsim.crf.ser.gz edu/stanford/nlp/models/ner/english.muc.7class.caseless.distsim.crf.ser.gz edu/stanford/nlp/models/ner/english.conll.4class.caseless.distsim.crf.ser.gz
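The exception in the question reports a model path that could not be resolved "as either class path, filename or URL", so it can help to confirm that the caseless models JAR is really visible to the class loader before building the pipeline. The following is a minimal sketch using only the JDK. The class name `ModelClasspathCheck` and the method `missingResources` are hypothetical names introduced for illustration, and the check covers the classpath only, not file-system paths or URLs.

```java
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ModelClasspathCheck {

    /** Returns the resource paths that the class loader cannot find. */
    public static List<String> missingResources(List<String> paths) {
        // In a servlet container the context class loader is normally the web application's loader.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ModelClasspathCheck.class.getClassLoader();
        }
        List<String> missing = new ArrayList<>();
        for (String path : paths) {
            URL url = loader.getResource(path); // null when the resource is not on the classpath
            if (url == null) {
                missing.add(path);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Model paths taken from the caseless documentation quoted above.
        List<String> models = Arrays.asList(
            "edu/stanford/nlp/models/pos-tagger/english-caseless-left3words-distsim.tagger",
            "edu/stanford/nlp/models/lexparser/englishPCFG.caseless.ser.gz",
            "edu/stanford/nlp/models/ner/english.all.3class.caseless.distsim.crf.ser.gz");
        for (String path : missingResources(models)) {
            System.out.println("Not on classpath: " + path);
        }
    }
}
```

If a path is reported as missing, the JAR that contains it is probably not in WEB-INF/lib (or wherever the application's classpath is assembled), and no property setting will make the model load.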

In your code, these paths can be set like this:

    props.put("pos.model", "edu/stanford/nlp/models/pos-tagger/english-caseless-left3words-distsim.tagger");
    props.put("parse.model", "edu/stanford/nlp/models/lexparser/englishPCFG.caseless.ser.gz");
    props.put("ner.model", "edu/stanford/nlp/models/ner/english.all.3class.caseless.distsim.crf.ser.gz edu/stanford/nlp/models/ner/english.muc.7class.caseless.distsim.crf.ser.gz edu/stanford/nlp/models/ner/english.conll.4class.caseless.distsim.crf.ser.gz");
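Putting the pieces together, the properties from the question and the three model overrides can be collected in one place. This is a sketch under stated assumptions rather than a definitive setup: the class name `CaselessProps` and the method `caselessProperties` are hypothetical, the block uses only `java.util.Properties` so it compiles without CoreNLP, and the space-separated `ner.model` value is copied from the answer above (check the documentation for your CoreNLP version if the NER models fail to load).

```java
import java.util.Properties;

public class CaselessProps {

    private static final String MODELS = "edu/stanford/nlp/models/";

    /** Builds pipeline properties that point every model-backed annotator at a caseless model. */
    public static Properties caselessProperties() {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos,lemma,parse,ner,dcoref");
        props.setProperty("pos.model",
            MODELS + "pos-tagger/english-caseless-left3words-distsim.tagger");
        props.setProperty("parse.model",
            MODELS + "lexparser/englishPCFG.caseless.ser.gz");
        // Separator copied from the answer above (space-separated list of NER models).
        props.setProperty("ner.model", String.join(" ",
            MODELS + "ner/english.all.3class.caseless.distsim.crf.ser.gz",
            MODELS + "ner/english.muc.7class.caseless.distsim.crf.ser.gz",
            MODELS + "ner/english.conll.4class.caseless.distsim.crf.ser.gz"));
        return props;
    }

    public static void main(String[] args) {
        Properties props = caselessProperties();
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + " = " + props.getProperty(key));
        }
        // With the CoreNLP JAR and the caseless models JAR on the classpath,
        // the pipeline would then be created as in the question:
        // StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
    }
}
```

Note that the stack trace in the question mentions english-left3words-distsim.tagger, which is the default cased tagger. That suggests `pos.model` was not overridden in the properties actually passed to the constructor, so it is worth printing the properties as above before creating the pipeline.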
