Reading a file of concatenated protocol buffers: Stanford CoreNLP



I am trying to replicate the results of a paper that uses Stanford CoreNLP. Its documentation states:

the fully annotated sentences are provided in a file of concatenated
  protocol buffers:
delimitedSentences.proto.bz
This file should be read with the Java function
  `CoreNLPProtos.Sentence.parseDelimitedFrom(<input stream>)`,
  or in other languages taking into consideration that every protocol buffer is
  prepended with the size of the buffer, as a VarInt.
Each proto contains all the annotations for the MIML-RE featurizer, in addition to
  some useful additions (e.g., antecedent for every token).

I have searched the code for the function `CoreNLPProtos.Sentence.parseDelimitedFrom(<input stream>)`, but I cannot find it.

I am not very familiar with protocol buffers (protos).

What should I do?

Hopefully these will be included in the next release of CoreNLP. In the meantime, the file is available on the public GitHub repository: https://github.com/stanfordnlp/CoreNLP/blob/master/src/edu/stanford/nlp/pipeline/CoreNLPProtos.java

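If you run into any other problems using the data, please let me know! I can fix errors as they come up, so hopefully the process will be smoother for future users.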
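For anyone who wants to see what the "size as a VarInt, then the buffer" framing from the quoted documentation looks like, here is a minimal sketch in plain Java that splits such a stream into raw messages without needing any protobuf classes. The class name `DelimitedReader` and its method names are made up for illustration. It also assumes the stream has already been decompressed: the `.bz` extension suggests bzip2, which the Java standard library does not read, so something like Apache Commons Compress would be needed for that step. Each returned `byte[]` could then presumably be handed to `CoreNLPProtos.Sentence.parseFrom(byte[])` from the file linked above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of CoreNLP): splits a stream of
// varint-length-prefixed records into raw byte arrays.
public class DelimitedReader {

    /** Reads one base-128 varint length; returns -1 on a clean end of stream. */
    public static int readVarint32(InputStream in) throws IOException {
        int result = 0;
        for (int shift = 0; shift < 35; shift += 7) {
            int b = in.read();
            if (b == -1) {
                if (shift == 0) {
                    return -1; // no more records
                }
                throw new EOFException("stream ended inside a varint");
            }
            result |= (b & 0x7F) << shift;   // low 7 bits carry the payload
            if ((b & 0x80) == 0) {           // high bit clear: last byte
                if (result < 0) {
                    throw new IOException("negative record length");
                }
                return result;
            }
        }
        throw new IOException("malformed varint (more than 5 bytes)");
    }

    /** Reads every length-prefixed record until the stream ends. */
    public static List<byte[]> readAll(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        List<byte[]> messages = new ArrayList<>();
        int size;
        while ((size = readVarint32(data)) != -1) {
            byte[] buf = new byte[size];
            data.readFully(buf); // throws EOFException if the record is truncated
            messages.add(buf);
        }
        return messages;
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny in-memory example: one 3-byte record and one 300-byte record.
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(3);                    // varint for 3
        out.write(new byte[] {1, 2, 3});
        out.write(0xAC);                 // varint for 300 is the two bytes AC 02
        out.write(0x02);
        out.write(new byte[300]);

        List<byte[]> messages = readAll(new ByteArrayInputStream(out.toByteArray()));
        for (byte[] m : messages) {
            System.out.println("record of " + m.length + " bytes");
        }
    }
}
```

With the CoreNLP classes on the classpath, calling `CoreNLPProtos.Sentence.parseDelimitedFrom(stream)` in a loop until it returns `null` should do the same framing for you; the sketch above is mainly useful for understanding the format or porting the reader to another language.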
