ValueError: Could not find stanford-postagger.jar file for the hazm library (Python NLP)



I want to run some code that requires stanford-postagger.jar, but I get this error:

File "/usr/lib/python2.7/site-packages/nltk/internals.py", line 562, in find_jar
(name, path_to_jar))
ValueError: Could not find stanford-postagger.jar jar file at resources/stanford-postagger.jar

How can I fix this error?

Edit: I am using it through the hazm module:

from hazm import POSTagger, word_tokenize
tagger = POSTagger()
tagger.tag(word_tokenize('ما بسیار کتاب می‌خوانیم'))

And the full output:

Traceback (most recent call last):
File "pyt.py", line 8, in <module>
tagger = POSTagger()
File "/home/vahid/dev/hazm/hazm/POSTagger.py", line 14, in __init__
super(stanford.POSTagger, self).__init__(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/nltk/tag/stanford.py", line 42, in __init__
verbose=verbose)
File "/usr/lib/python2.7/site-packages/nltk/internals.py", line 562, in find_jar
(name, path_to_jar))
ValueError: Could not find stanford-postagger.jar jar file at resources/stanford-postagger.jar

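You first need the postagger jar file from Stanford, and you would also have to train your own tagger model. However, the hazm developers have kindly uploaded the resources directory you need: http://dl.dropboxusercontent.com/u/90405495/resources.zip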

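Unzip the folder and save it in the directory you run your script from. The path in the error message, resources/stanford-postagger.jar, is relative, so it is resolved against the current working directory.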

For example:

$ mkdir testdir
$ wget https://github.com/sobhe/hazm/archive/master.zip
$ unzip master.zip -d testdir
$ cd testdir
$ mv hazm-master/hazm/ .
$ wget http://dl.dropboxusercontent.com/u/90405495/resources.zip
$ unzip resources.zip -d .
$ python
Python 2.7.5+ (default, Sep 19 2013, 13:48:49) 
[GCC 4.8.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import hazm
>>> tagger = hazm.POSTagger()
>>> tagger.tag(hazm.word_tokenize(u'ما بسیار کتاب می‌خوانیم'))
[(u'\u0645\u0627', u'PR'), (u'\u0628\u0633\u06cc\u0627\u0631', u'ADV'), (u'\u06a9\u062a\u0627\u0628', u'N'), (u'\u0645\u06cc\u200c\u062e\u0648\u0627\u0646\u06cc\u0645', u'V')]
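
If the error persists after unzipping, it can help to confirm that the jar really sits where the relative path points. The snippet below is only a minimal sketch: `find_postagger_jar` is a hypothetical helper written for this answer, not part of hazm or NLTK, and the relative path is copied from the error message above.

```python
import os

# Relative path copied from the error message. It is resolved against the
# current working directory, not against the location of the hazm package.
JAR_RELATIVE_PATH = os.path.join('resources', 'stanford-postagger.jar')


def find_postagger_jar(base_dir=None):
    """Return the absolute jar path if it exists under base_dir, else None.

    Hypothetical helper for illustration; not part of hazm or NLTK.
    """
    if base_dir is None:
        base_dir = os.getcwd()
    candidate = os.path.abspath(os.path.join(base_dir, JAR_RELATIVE_PATH))
    return candidate if os.path.isfile(candidate) else None


if __name__ == '__main__':
    jar = find_postagger_jar()
    if jar is None:
        print('Missing %s under %s' % (JAR_RELATIVE_PATH, os.getcwd()))
    else:
        print('Found %s' % jar)
```

Run it from the same directory you start your script in; if it reports the file as missing, the resources folder was unzipped somewhere else.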

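You just need to make sure that:
1- Java is installed
2- the JDK is installed
3- the Java path is added to the PATH environment variable
4- the JDK path is added to the environment as well
5- the JAVA_HOME variable is set in the environment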
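
The checklist above can be verified from Python before creating the tagger. This is a rough sketch under a few assumptions: `which` and `check_java_setup` are hypothetical helpers written by hand so that they run on Python 2.7 as well as Python 3, and they only test that a `java` executable is reachable through PATH and that JAVA_HOME names an existing directory. They do not check the Java version or whether a full JDK is present.

```python
import os


def which(program, path=None):
    """Return the first executable named `program` found on path, else None.

    Hypothetical helper for illustration; works on Python 2.7 and 3.
    """
    if path is None:
        path = os.environ.get('PATH', '')
    # On Windows, executables carry an extension such as .exe
    extensions = ['']
    if os.name == 'nt':
        extensions += os.environ.get('PATHEXT', '.EXE').split(os.pathsep)
    for directory in path.split(os.pathsep):
        if not directory:
            continue
        for ext in extensions:
            candidate = os.path.join(directory, program + ext)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                return candidate
    return None


def check_java_setup(environ=None):
    """Return a list of problems found in the given environment mapping."""
    if environ is None:
        environ = os.environ
    problems = []
    if which('java', environ.get('PATH', '')) is None:
        problems.append('java not found on PATH')
    java_home = environ.get('JAVA_HOME')
    if not java_home:
        problems.append('JAVA_HOME is not set')
    elif not os.path.isdir(java_home):
        problems.append('JAVA_HOME does not point to a directory')
    return problems


if __name__ == '__main__':
    for problem in check_java_setup():
        print(problem)
```

An empty list means the basic environment looks right; any message printed points at the item in the checklist to fix.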
