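I want to extract data from a bunch of tables that are stored as images.
Importing tesseract prompts me to install Qhull (I was following the docs at http://pytesseract.readthedocs.io/en/latest/tutorials.html).
Code: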
> import Image
> from tesseract import image_to_string
> print image_to_string(Image.open('test.png'))
> print image_to_string(Image.open('test-english.jpg'), lang='eng')
I get the following prompt, but I cannot enter a directory that it accepts:
Please enter the path to an existing directory where qhull should be installed:
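I tried giving the directory in quotes, and also through a variable, but it keeps giving me an invalid directory error.
This should be very straightforward, but I just can't figure it out.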
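Use pytesseract instead of tesseract. The tesseract package you imported appears to be a different project (it is the one asking for Qhull), not the OCR wrapper. Install it with: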
pip install pytesseract
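
Once pytesseract is installed, the two calls from the question can be written roughly as below. This is a minimal sketch in Python 3 syntax, not a definitive recipe: `ocr_image` and `split_table_rows` are hypothetical helper names, and splitting columns on runs of two or more spaces is only an assumption about how the OCR output of a table might look. `pytesseract.image_to_string` and `PIL.Image.open` are the real calls.

```python
import re

def ocr_image(path, lang='eng'):
    """Run Tesseract OCR on an image file and return the recognized text."""
    # Imported here so the parsing helper below works without the OCR dependencies.
    from PIL import Image
    import pytesseract
    return pytesseract.image_to_string(Image.open(path), lang=lang)

def split_table_rows(text):
    """Split OCR text into rows of cells (hypothetical helper).

    Assumes columns are separated by a tab or by two or more whitespace characters.
    """
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if line:  # skip blank lines between table rows
            rows.append(re.split(r'\s{2,}|\t', line))
    return rows

# Stand-in for OCR output, so the parsing step can be tried without an image.
sample = "Name   Qty\nBolt   12\n\nNut    7\n"
rows = split_table_rows(sample)
```

With a real image you would call `split_table_rows(ocr_image('test.png'))`. Note that pytesseract is only a wrapper, so the Tesseract OCR program itself also has to be installed and reachable on your PATH.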
If you want to keep the tesseract package anyway, you need to change a few lines in the
C:\Python27\Lib\site-packages\tesseract\voro.py file,
in the lines after
# Qhull installation
if config_parser.has_option('qhull','install-dir'):
    _qhulldir = config_parser.get('qhull','install-dir').strip()
else:
    # Ask user for qhull directory
    ## qstr = 'Please enter the path to an existing directory where qhull should be installed:'
    qstr = 'C:/Python27/Lib/site-packages/tesseract'
    ## _qhulldir = os.path.expanduser(raw_input(qstr).strip())
    _qhulldir = os.path.expanduser(qstr)
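
To see what the patched branch does, here is the same decision in isolation. This is only a sketch in Python 3 syntax: `resolve_qhull_dir` is a hypothetical name, and the config text is made up for the example, since the snippet above does not show which file the package actually reads its settings from.

```python
import configparser
import os

def resolve_qhull_dir(config_text, fallback_dir):
    """Return the Qhull directory from config text, or fallback_dir if it is unset."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    # Same check as in voro.py: prefer an explicit [qhull] install-dir entry.
    if parser.has_option('qhull', 'install-dir'):
        return parser.get('qhull', 'install-dir').strip()
    # No entry found: use the hard-coded directory instead of prompting the user.
    return os.path.expanduser(fallback_dir)

# Hypothetical config contents, for illustration only.
configured = resolve_qhull_dir("[qhull]\ninstall-dir = C:/qhull\n", "C:/fallback")
defaulted = resolve_qhull_dir("", "C:/Python27/Lib/site-packages/tesseract")
```

The point of the edit is the second case: with no `install-dir` entry, the code no longer stops at the `raw_input` prompt and simply uses the fixed path.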