Extracting text from OpenCV contours

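I am trying to run OCR with tesseract on each individual contour, but I am not getting the correct text from them. The contours themselves are detected correctly with OpenCV. Please suggest a fix.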

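You are not getting correct text from the OCR because the image preprocessing is poor. Try different image processing techniques until you find an approach that works for your images. Since you asked about Python:

If you have a color image, convert it to a single-channel grayscale image to remove color noise.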

img = cv2.imread('name_of_the_colored_input_image', 0)  # flag 0 loads the image as grayscale
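Blur the image with one of OpenCV's smoothing techniques (averaging, Gaussian blur, median blur, or bilateral filtering) to reduce the noise in the image. Refer to this link and try the various techniques.
Then apply thresholding (simple, adaptive, or Otsu), which maps every pixel to black or white depending on whether it falls below a threshold and so removes faint background noise. Refer to this link and try the various techniques.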

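Now find the contours and run tesseract on each contour region; the results should be better.

Note: for tesseract to work well, the image should have black text on a white background.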

Please check the function below and let me know if anything is missing.

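import cv2
import numpy as np

# Placeholder name: the original post did not include the def line.
def find_text_contours(image):
    # Convert the BGR image to grayscale
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    cv2.imshow('gray', gray)
    cv2.waitKey(0)
    # Blur the image (note: a (1, 1) kernel leaves the image unchanged)
    blur = cv2.blur(gray, (1, 1))
    cv2.imshow('Blur', blur)
    cv2.waitKey(0)
    # Threshold and invert: pixels at or below 127 become 255, the rest 0
    ret, thresh = cv2.threshold(blur, 127, 255, cv2.THRESH_BINARY_INV)
    thresh_copy = thresh.copy()
    cv2.imshow("Threshold", thresh_copy)
    cv2.waitKey(0)
    # Erosion (note: a 1x1 kernel leaves the image unchanged)
    kernel1 = np.ones((1, 1), np.uint8)
    img_erosion = cv2.erode(thresh, kernel1, iterations=1)
    cv2.imshow("Erosion", img_erosion.copy())
    cv2.waitKey(0)
    # Dilation with a 6-row by 10-column kernel merges nearby characters into blobs
    kernel = np.ones((6, 10), np.uint8)
    img_dilation = cv2.dilate(img_erosion.copy(), kernel, iterations=1)
    cv2.imshow("Dilation", img_dilation)
    cv2.waitKey(0)
    # Find contours; [-2] selects the contour list on both OpenCV 3.x and 4.x
    ctrs = cv2.findContours(img_dilation.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]
    return ctrs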
