OpenCV MSER 检测文本区域 - Python
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/40078625/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
OpenCV MSER detect text areas - Python
提问by Amit Madan
I have an invoice image, and I want to detect the text on it. So I plan to use 2 steps: first is to identify the text areas, and then using OCR to recognize the text.
我有一张发票图片,我想检测上面的文字。所以我打算使用2个步骤:首先是识别文本区域,然后使用OCR识别文本。
I am using OpenCV 3.0 in python for that. I am able to identify the text(including some non text areas) but I further want to identify text boxes from the image(also excluding the non-text areas).
为此,我在 python 中使用 OpenCV 3.0。我能够识别文本(包括一些非文本区域),但我还想从图像中识别文本框(也不包括非文本区域)。
My input image is: and the output is:
and I am using the below code for this:
img = cv2.imread('/home/mis/Text_Recognition/bill.jpg')
mser = cv2.MSER_create()
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY) #Converting to GrayScale
gray_img = img.copy()
regions = mser.detectRegions(gray, None)
hulls = [cv2.convexHull(p.reshape(-1, 1, 2)) for p in regions]
cv2.polylines(gray_img, hulls, 1, (0, 0, 255), 2)
cv2.imwrite('/home/mis/Text_Recognition/amit.jpg', gray_img) #Saving
Now, I want to identify the text boxes, and remove/unidentify any non-text areas on the invoice. I am new to OpenCV and am a beginner in Python. I am able to find some examples in MATAB exampleand C++ example, but If I convert them to python, it will take a lot of time for me.
现在,我想识别文本框,并删除/取消识别发票上的任何非文本区域。我是 OpenCV 的新手,也是 Python 的初学者。我可以在MATAB 示例和C++示例中找到一些示例,但是如果我将它们转换为 python,我将花费很多时间。
Is there any example with python using OpenCV, or can anyone help me with this?
有没有使用 OpenCV 的 python 示例,或者任何人都可以帮我解决这个问题?
回答by RAFI AFRIDI
Below is the code
下面是代码
# Import packages
import cv2
import numpy as np
#Create MSER object
mser = cv2.MSER_create()
#Your image path i-e receipt path
img = cv2.imread('/home/rafiullah/PycharmProjects/python-ocr-master/receipts/73.jpg')
#Convert to gray scale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
vis = img.copy()
#detect regions in gray scale image
regions, _ = mser.detectRegions(gray)
hulls = [cv2.convexHull(p.reshape(-1, 1, 2)) for p in regions]
cv2.polylines(vis, hulls, 1, (0, 255, 0))
cv2.imshow('img', vis)
cv2.waitKey(0)
mask = np.zeros((img.shape[0], img.shape[1], 1), dtype=np.uint8)
for contour in hulls:
cv2.drawContours(mask, [contour], -1, (255, 255, 255), -1)
#this is used to find only text regions, remaining are ignored
text_only = cv2.bitwise_and(img, img, mask=mask)
cv2.imshow("text only", text_only)
cv2.waitKey(0)
回答by Shreyash Sharma
This is an old post, yet I'd like to contribute that if you are trying to extract all the texts out of an image, here is the code to get that text in an array.
这是一篇旧帖子,但我想贡献一下,如果您试图从图像中提取所有文本,这里是在数组中获取该文本的代码。
import cv2
import numpy as np
import re
import pytesseract
from pytesseract import image_to_string
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
from PIL import Image
image_obj = Image.open("screenshot.png")
rgb = cv2.imread('screenshot.png')
small = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)
#threshold the image
_, bw = cv2.threshold(small, 0.0, 255.0, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)
# get horizontal mask of large size since text are horizontal components
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (20, 1))
connected = cv2.morphologyEx(bw, cv2.MORPH_CLOSE, kernel)
# find all the contours
contours, hierarchy,=cv2.findContours(connected.copy(),cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
#Segment the text lines
counter=0
array_of_texts=[]
for idx in range(len(contours)):
x, y, w, h = cv2.boundingRect(contours[idx])
cropped_image = image_obj.crop((x-10, y, x+w+10, y+h ))
str_store = re.sub(r'([^\s\w]|_)+', '', image_to_string(cropped_image))
array_of_texts.append(str_store)
counter+=1
print(array_of_texts)