OpenCV MSER detect text areas - Python

Note: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me). StackOverflow original: http://stackoverflow.com/questions/40078625/

Time: 2020-08-19 23:09:05  Source: igfitidea

python opencv image-processing ocr

Asked by Amit Madan

I have an invoice image, and I want to detect the text on it. I plan to do this in two steps: first identify the text areas, then use OCR to recognize the text.

I am using OpenCV 3.0 in Python for this. I am able to identify the text (along with some non-text areas), but I also want to identify the text boxes in the image while excluding the non-text areas.

My input image is: Original, and the output is: Processed. I am using the code below for this:

import cv2

img = cv2.imread('/home/mis/Text_Recognition/bill.jpg')
mser = cv2.MSER_create()
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # Converting to grayscale
gray_img = img.copy()  # Copy of the original image to draw on

regions = mser.detectRegions(gray, None)
hulls = [cv2.convexHull(p.reshape(-1, 1, 2)) for p in regions]
cv2.polylines(gray_img, hulls, 1, (0, 0, 255), 2)  # Outline each region in red
cv2.imwrite('/home/mis/Text_Recognition/amit.jpg', gray_img)  # Saving

Now I want to identify the text boxes and remove or ignore any non-text areas on the invoice. I am new to OpenCV and a beginner in Python. I found some examples (a MATLAB example and a C++ example), but converting them to Python would take me a lot of time.

Is there a Python example using OpenCV, or can anyone help me with this?

Answered by RAFI AFRIDI

Below is the code:

# Import packages
import cv2
import numpy as np

# Create MSER object
mser = cv2.MSER_create()

# Your image path, i.e. the receipt path
img = cv2.imread('/home/rafiullah/PycharmProjects/python-ocr-master/receipts/73.jpg')

# Convert to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

vis = img.copy()

# Detect regions in the grayscale image
regions, _ = mser.detectRegions(gray)

hulls = [cv2.convexHull(p.reshape(-1, 1, 2)) for p in regions]

cv2.polylines(vis, hulls, 1, (0, 255, 0))

cv2.imshow('img', vis)
cv2.waitKey(0)

mask = np.zeros((img.shape[0], img.shape[1], 1), dtype=np.uint8)

for contour in hulls:
    cv2.drawContours(mask, [contour], -1, (255, 255, 255), -1)

# This is used to keep only the text regions; the rest is ignored
text_only = cv2.bitwise_and(img, img, mask=mask)

cv2.imshow("text only", text_only)
cv2.waitKey(0)
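
The mask above keeps every region that MSER returns, so non-text blobs can survive along with the characters. One common heuristic, which is not part of the answer above, is to discard regions whose bounding box is too small, too large, or too elongated before drawing the hulls. A minimal sketch, assuming the boxes come as (x, y, w, h) tuples such as those returned by cv2.boundingRect; the function name filter_text_like_boxes and every threshold are hypothetical and would need tuning for a given invoice:

```python
def filter_text_like_boxes(boxes, image_area,
                           min_area=30, max_area_frac=0.05,
                           min_aspect=0.1, max_aspect=10.0):
    """Keep boxes whose size and aspect ratio are plausible for characters or words.

    boxes: iterable of (x, y, w, h) tuples.
    image_area: width * height of the source image, in pixels.
    All thresholds are illustrative guesses, not tuned values.
    """
    kept = []
    for x, y, w, h in boxes:
        if w <= 0 or h <= 0:
            continue  # degenerate box
        area = w * h
        aspect = w / h
        if area < min_area:
            continue  # speckle noise
        if area > max_area_frac * image_area:
            continue  # far too large to be a character or word
        if not (min_aspect <= aspect <= max_aspect):
            continue  # long thin shapes such as rules or table borders
        kept.append((x, y, w, h))
    return kept


# Hypothetical boxes on a 1000x800 image
boxes = [(10, 10, 12, 18),   # character-sized
         (0, 0, 900, 700),   # nearly the whole page
         (50, 300, 600, 2),  # horizontal rule
         (5, 5, 2, 2)]       # speckle
print(filter_text_like_boxes(boxes, image_area=1000 * 800))
```

With the answer's code, the boxes could come from cv2.boundingRect(hull) for each hull, and only hulls whose box passes the filter would be drawn into the mask.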

Answered by Shreyash Sharma

This is an old post, but I'd like to add that if you are trying to extract all the text from an image, here is code that collects that text in an array.

import re

import cv2
import pytesseract
from PIL import Image

# Path to the Tesseract executable
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

image_obj = Image.open("screenshot.png")

rgb = cv2.imread('screenshot.png')
small = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)

# threshold the image (inverted Otsu, so darker pixels become white)
_, bw = cv2.threshold(small, 0.0, 255.0, cv2.THRESH_BINARY_INV | cv2.THRESH_OTSU)

# get horizontal mask of large size since text lines are horizontal components
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (20, 1))
connected = cv2.morphologyEx(bw, cv2.MORPH_CLOSE, kernel)

# find all the contours; [-2] picks the contour list in both OpenCV 3.x and 4.x
contours = cv2.findContours(connected.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]

# segment the text lines and run OCR on each one
array_of_texts = []
for contour in contours:
    x, y, w, h = cv2.boundingRect(contour)
    cropped_image = image_obj.crop((x - 10, y, x + w + 10, y + h))
    # strip punctuation and underscores from the OCR result
    str_store = re.sub(r'([^\s\w]|_)+', '', pytesseract.image_to_string(cropped_image))
    array_of_texts.append(str_store)

print(array_of_texts)
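
Two steps in the loop above are easy to check in isolation: the padded crop box, which can reach past the image edge when a contour touches the left or right border (x - 10 becomes negative), and the regular expression that strips punctuation. A minimal sketch of both as pure functions; the names padded_box and clean_ocr_text are hypothetical helpers, not part of the answer or of pytesseract, and clamping the box is an assumption about what is wanted rather than something the answer does:

```python
import re


def padded_box(x, y, w, h, img_w, img_h, pad=10):
    """Return a (left, upper, right, lower) crop box, padded horizontally and clamped to the image."""
    left = max(x - pad, 0)
    right = min(x + w + pad, img_w)
    lower = min(y + h, img_h)
    return (left, y, right, lower)


def clean_ocr_text(text):
    """Remove runs of punctuation and underscores, using the same regex as the answer."""
    return re.sub(r'([^\s\w]|_)+', '', text)


print(padded_box(3, 40, 100, 20, img_w=105, img_h=200))  # (0, 40, 105, 60)
print(clean_ocr_text("Total: $1,250.00 _due_"))          # Total 125000 due
```

Note that the regex also removes decimal points and currency symbols, so amounts on an invoice lose their formatting; whether that is acceptable depends on how the extracted text is used.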