OpenCV and pytesseract for OCR

How can I use OpenCV and pytesseract to extract text from an image?

import cv2
import pytesseract
from PIL import Image
import numpy as np
from matplotlib import pyplot as plt

img = Image.open('test.jpg').convert('L')
img.show()
img.save('test','png')
img = cv2.imread('test.png',0)
edges = cv2.Canny(img,100,200)
#contour = cv2.findContours(edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
#print pytesseract.image_to_string(Image.open(edges))
print pytesseract.image_to_string(edges)

But this gives an error:

Traceback (most recent call last):
  File "open.py", line 14, in <module>
    print pytesseract.image_to_string(edges)
  File "/home/sroy8091/.local/lib/python2.7/site-packages/pytesseract/pytesseract.py", line 143, in image_to_string
    if len(image.split()) == 4:
AttributeError: 'NoneType' object has no attribute 'split'

If you want to do some preprocessing with OpenCV (such as the edge detection you already have) and then extract the text, you can use this:

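# Imports needed by this snippet
import cv2
import pytesseract
from PIL import Image

img = cv2.imread('test.png', 0)        # 0 = load as grayscale
edges = cv2.Canny(img, 100, 200)       # edge detection with thresholds 100 and 200
img_new = Image.fromarray(edges)       # wrap the NumPy array in a PIL image
text = pytesseract.image_to_string(img_new, lang='eng')
print(text)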

You cannot pass an OpenCV object (a NumPy array) directly to the tesseract methods; convert it to a PIL image first.
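The conversion step can be wrapped in a small helper. The sketch below is only an illustration: `array_to_pil` and `ocr_array` are hypothetical names, and it assumes 8-bit images that are either grayscale (like the output of `cv2.Canny`) or three-channel BGR (what `cv2.imread` returns for colour files). Newer pytesseract releases are documented as accepting NumPy arrays directly, in which case the conversion may not be needed.

```python
import numpy as np
from PIL import Image


def array_to_pil(array):
    """Convert an OpenCV-style NumPy image to a PIL image (hypothetical helper).

    Assumes uint8 data, either 2-D grayscale or 3-D with BGR channel order.
    """
    if array is None:
        # cv2.imread returns None when the file cannot be read.
        raise ValueError("image is None; check the path passed to cv2.imread")
    array = np.asarray(array)
    if array.dtype != np.uint8:
        raise ValueError("expected an 8-bit (uint8) image")
    if array.ndim == 2:
        # Single channel: PIL infers mode "L" (8-bit grayscale).
        return Image.fromarray(array)
    if array.ndim == 3 and array.shape[2] == 3:
        # Reverse the channel axis (BGR -> RGB) and make the result contiguous.
        return Image.fromarray(np.ascontiguousarray(array[:, :, ::-1]))
    raise ValueError("expected a 2-D or 3-channel image array")


def ocr_array(array, lang="eng"):
    """Run Tesseract on a NumPy image by way of a PIL image."""
    # Imported here so the converter above can be used without pytesseract.
    import pytesseract
    return pytesseract.image_to_string(array_to_pil(array), lang=lang)
```

With this in place, `ocr_array(edges)` would stand in for the explicit `Image.fromarray` call, and an unreadable input file fails with a clear message instead of an `AttributeError` deep inside pytesseract.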

Try this:

from PIL import Image
import pytesseract

image_file = 'test.png'
# Open the file with PIL and pass the image straight to Tesseract
print(pytesseract.image_to_string(Image.open(image_file)))
