Extracting Text with OpenCV (C++)

Disclaimer: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. If you reuse or share it, you must do so under the same license and attribute it to the original authors (not me), citing the original source: http://stackoverflow.com/questions/23506105/


Extracting text OpenCV

Tags: c++, opencv, image-processing, text, bounding-box

Asked by Clip

I am trying to find the bounding boxes of text in an image and am currently using this approach:

// calculate the local variances of the grayscale image
Mat t_mean, t_mean_2;
Mat grayF;
outImg_gray.convertTo(grayF, CV_32F);
int winSize = 35;
blur(grayF, t_mean, cv::Size(winSize,winSize));
blur(grayF.mul(grayF), t_mean_2, cv::Size(winSize,winSize));
Mat varMat = t_mean_2 - t_mean.mul(t_mean);
varMat.convertTo(varMat, CV_8U);

// threshold the high variance regions
Mat varMatRegions = varMat > 100;

When given an image like this:

[image]

Then when I show varMatRegions I get this image:

[image]

As you can see, it somewhat combines the left block of text with the header of the card. For most cards this method works great, but on busier cards it can cause problems.

The reason it is bad for those contours to connect is that the bounding box of the combined contour then takes up nearly the entire card.
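
For reference, the contour/bounding-box step that follows it is roughly this (a sketch only; outImg_color stands in for whatever copy of the input image you draw on):

// find contours in the thresholded variance map and box them
std::vector<std::vector<cv::Point> > contours;
cv::findContours(varMatRegions.clone(), contours, CV_RETR_EXTERNAL, CV_CHAIN_APPROX_SIMPLE);
for (size_t i = 0; i < contours.size(); i++)
{
    cv::Rect box = cv::boundingRect(contours[i]);
    // when neighbouring text regions merge, this box grows to cover most of the card
    cv::rectangle(outImg_color, box, cv::Scalar(0, 255, 0), 2);
}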

Can anyone suggest a different way I can find the text, to ensure proper detection?

200 points to whoever can find the text in the card above these two.

[image] [image]

Accepted answer by LovaBill

You can detect text by finding close edge elements (inspired by an LPD approach):

#include "opencv2/opencv.hpp"

std::vector<cv::Rect> detectLetters(cv::Mat img)
{
    std::vector<cv::Rect> boundRect;
    cv::Mat img_gray, img_sobel, img_threshold, element;
    cvtColor(img, img_gray, CV_BGR2GRAY);
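    // x-gradient (Sobel dx=1, dy=0) responds strongly to the vertical strokes of characters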
    cv::Sobel(img_gray, img_sobel, CV_8U, 1, 0, 3, 1, 0, cv::BORDER_DEFAULT);
    cv::threshold(img_sobel, img_threshold, 0, 255, CV_THRESH_OTSU+CV_THRESH_BINARY);
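    // closing with a wide, short kernel merges characters on the same line into word/line blobs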
    element = getStructuringElement(cv::MORPH_RECT, cv::Size(17, 3) );
    cv::morphologyEx(img_threshold, img_threshold, CV_MOP_CLOSE, element); //Does the trick
    std::vector< std::vector< cv::Point> > contours;
    cv::findContours(img_threshold, contours, 0, 1); 
    std::vector<std::vector<cv::Point> > contours_poly( contours.size() );
    for( int i = 0; i < contours.size(); i++ )
        if (contours[i].size()>100)
        { 
            cv::approxPolyDP( cv::Mat(contours[i]), contours_poly[i], 3, true );
            cv::Rect appRect( boundingRect( cv::Mat(contours_poly[i]) ));
            if (appRect.width>appRect.height) 
                boundRect.push_back(appRect);
        }
    return boundRect;
}

Usage:

int main(int argc,char** argv)
{
    //Read
    cv::Mat img1=cv::imread("side_1.jpg");
    cv::Mat img2=cv::imread("side_2.jpg");
    //Detect
    std::vector<cv::Rect> letterBBoxes1=detectLetters(img1);
    std::vector<cv::Rect> letterBBoxes2=detectLetters(img2);
    //Display
    for(int i=0; i< letterBBoxes1.size(); i++)
        cv::rectangle(img1,letterBBoxes1[i],cv::Scalar(0,255,0),3,8,0);
    cv::imwrite( "imgOut1.jpg", img1);  
    for(int i=0; i< letterBBoxes2.size(); i++)
        cv::rectangle(img2,letterBBoxes2[i],cv::Scalar(0,255,0),3,8,0);
    cv::imwrite( "imgOut2.jpg", img2);  
    return 0;
}

Results:

a. element = getStructuringElement(cv::MORPH_RECT, cv::Size(17, 3)); [images: imgOut1, imgOut2]

b. element = getStructuringElement(cv::MORPH_RECT, cv::Size(30, 30)); [images: imgOut1, imgOut2]

Results are similar for the other image mentioned.

Answered by dhanushka

I used a gradient-based method in the program below and have added the resulting images. Please note that I'm using a scaled-down version of the image for processing.

C++ version

The MIT License (MIT)

Copyright (c) 2014 Dhanushka Dangampola

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

#include "stdafx.h"

#include <opencv2/core/core.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <iostream>

using namespace cv;
using namespace std;

#define INPUT_FILE              "1.jpg"
#define OUTPUT_FOLDER_PATH      string("")

int _tmain(int argc, _TCHAR* argv[])
{
    Mat large = imread(INPUT_FILE);
    Mat rgb;
    // downsample and use it for processing
    pyrDown(large, rgb);
    Mat small;
    cvtColor(rgb, small, CV_BGR2GRAY);
    // morphological gradient
    Mat grad;
    Mat morphKernel = getStructuringElement(MORPH_ELLIPSE, Size(3, 3));
    morphologyEx(small, grad, MORPH_GRADIENT, morphKernel);
    // binarize
    Mat bw;
    threshold(grad, bw, 0.0, 255.0, THRESH_BINARY | THRESH_OTSU);
    // connect horizontally oriented regions
    Mat connected;
    morphKernel = getStructuringElement(MORPH_RECT, Size(9, 1));
    morphologyEx(bw, connected, MORPH_CLOSE, morphKernel);
    // find contours
    Mat mask = Mat::zeros(bw.size(), CV_8UC1);
    vector<vector<Point>> contours;
    vector<Vec4i> hierarchy;
    findContours(connected, contours, hierarchy, CV_RETR_CCOMP, CV_CHAIN_APPROX_SIMPLE, Point(0, 0));
    // filter contours
    for(int idx = 0; idx >= 0; idx = hierarchy[idx][0])
    {
        Rect rect = boundingRect(contours[idx]);
        Mat maskROI(mask, rect);
        maskROI = Scalar(0, 0, 0);
        // fill the contour
        drawContours(mask, contours, idx, Scalar(255, 255, 255), CV_FILLED);
        // ratio of non-zero pixels in the filled region
        double r = (double)countNonZero(maskROI)/(rect.width*rect.height);

        if (r > .45 /* assume at least 45% of the area is filled if it contains text */
            && 
            (rect.height > 8 && rect.width > 8) /* constraints on region size */
            /* these two conditions alone are not very robust. better to use something 
            like the number of significant peaks in a horizontal projection as a third condition */
            )
        {
            rectangle(rgb, rect, Scalar(0, 255, 0), 2);
        }
    }
    imwrite(OUTPUT_FOLDER_PATH + string("rgb.jpg"), rgb);

    return 0;
}
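
A rough sketch of what such a "third condition" could look like (illustrative only; the countRowPeaks helper and its relThresh value are made up, not part of the program above): count the significant row peaks of the horizontal projection of bw inside a candidate rect. A text block usually shows one peak per text line, while a thin edge or a filled blob does not.

// illustrative sketch: count "significant" row peaks of the horizontal projection
// of the binary image bw inside a candidate rect
static int countRowPeaks(const Mat& bw, const Rect& rect, double relThresh = 0.5)
{
    Mat rowSum;
    // sum each row of the region (collapse columns) into a 32-bit column vector
    reduce(bw(rect), rowSum, 1, CV_REDUCE_SUM, CV_32S);
    double minVal = 0, maxVal = 0;
    minMaxLoc(rowSum, &minVal, &maxVal);
    int peaks = 0;
    bool inPeak = false;
    for (int y = 0; y < rowSum.rows; y++)
    {
        bool high = rowSum.at<int>(y) > relThresh * maxVal;
        if (high && !inPeak)
            peaks++;        // entering a new run of "bright" rows
        inPeak = high;
    }
    return peaks;
}

It could then be plugged into the filtering condition, e.g. requiring at least two peaks for multi-line blocks.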

Python version

The MIT License (MIT)

Copyright (c) 2017 Dhanushka Dangampola

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

import cv2
import numpy as np

large = cv2.imread('1.jpg')
rgb = cv2.pyrDown(large)
small = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)

kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
grad = cv2.morphologyEx(small, cv2.MORPH_GRADIENT, kernel)

_, bw = cv2.threshold(grad, 0.0, 255.0, cv2.THRESH_BINARY | cv2.THRESH_OTSU)

kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 1))
connected = cv2.morphologyEx(bw, cv2.MORPH_CLOSE, kernel)
# using RETR_EXTERNAL instead of RETR_CCOMP
contours, hierarchy = cv2.findContours(connected.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
#For opencv 3+ comment the previous line and uncomment the following line
#_, contours, hierarchy = cv2.findContours(connected.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)

mask = np.zeros(bw.shape, dtype=np.uint8)

for idx in range(len(contours)):
    x, y, w, h = cv2.boundingRect(contours[idx])
    mask[y:y+h, x:x+w] = 0
    cv2.drawContours(mask, contours, idx, (255, 255, 255), -1)
    r = float(cv2.countNonZero(mask[y:y+h, x:x+w])) / (w * h)

    if r > 0.45 and w > 8 and h > 8:
        cv2.rectangle(rgb, (x, y), (x+w-1, y+h-1), (0, 255, 0), 2)

cv2.imshow('rects', rgb)
cv2.waitKey(0)

[image] [image] [image]

Answered by anana

Here is an alternative approach that I used to detect the text blocks:

  1. Converted the image to grayscale.
  2. Applied a threshold (a simple binary threshold, with a hand-picked value of 150 as the threshold value).
  3. Applied dilation to thicken the lines in the image, leading to more compact objects and fewer white-space fragments. Used a high value for the number of iterations, so the dilation is very heavy (13 iterations, also hand-picked for optimal results).
  4. Identified the contours of the objects in the resulting image using OpenCV's findContours function.
  5. Drew a bounding box (rectangle) circumscribing each contoured object; each of them frames a block of text.
  6. Optionally discarded areas that are unlikely to be the objects you are searching for (e.g. text blocks) given their size, as the algorithm above can also find intersecting or nested objects (like the entire top area of the first card), some of which could be uninteresting for your purposes.

Below is the code, written in Python with pyopencv; it should be easy to port to C++.

import cv2

image = cv2.imread("card.png")
gray = cv2.cvtColor(image,cv2.COLOR_BGR2GRAY) # grayscale
_,thresh = cv2.threshold(gray,150,255,cv2.THRESH_BINARY_INV) # threshold
kernel = cv2.getStructuringElement(cv2.MORPH_CROSS,(3,3))
dilated = cv2.dilate(thresh,kernel,iterations = 13) # dilate
_, contours, hierarchy = cv2.findContours(dilated,cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_NONE) # get contours
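# note: the line above unpacks three return values (OpenCV 3.x); in OpenCV 2.x and 4.x use: contours, hierarchy = cv2.findContours(...)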

# for each contour found, draw a rectangle around it on original image
for contour in contours:
    # get rectangle bounding contour
    [x,y,w,h] = cv2.boundingRect(contour)

    # discard areas that are too large
    if h>300 and w>300:
        continue

    # discard areas that are too small
    if h<40 or w<40:
        continue

    # draw rectangle around contour on original image
    cv2.rectangle(image,(x,y),(x+w,y+h),(255,0,255),2)

# write original image with added contours to disk  
cv2.imwrite("contoured.jpg", image) 

The original image is the first image in your post.

After preprocessing (grayscale, threshold and dilate - so after step 3) the image looked like this:

[image: dilated image]

Below is the resulting image ("contoured.jpg" in the last line); the final bounding boxes for the objects in the image look like this:

[image]

You can see the text block on the left is detected as a separate block, delimited from its surroundings.

Using the same script with the same parameters (except for the thresholding type, which was changed for the second image as described below), here are the results for the other two cards:

[image]

[image]

Tuning the parameters

The parameters (threshold value, dilation parameters) were optimized for this image and this task (finding text blocks) and can be adjusted, if needed, for other card images or other types of objects to be found.

For thresholding (step 2), I used a black threshold. For images where the text is lighter than the background, such as the second image in your post, a white threshold should be used, so replace the thresholding type with cv2.THRESH_BINARY. For the second image I also used a slightly higher threshold value (180). Varying the threshold value and the number of dilation iterations will result in different degrees of sensitivity in delimiting objects in the image.

Finding other object types:

For example, decreasing the dilation to 5 iterations in the first image gives us a finer delimitation of objects in the image, roughly finding all words in the image (rather than text blocks):

[image]

Knowing the rough size of a word, I discarded areas that were too small (below 20 pixels in width or height) or too large (above 100 pixels in width or height), so as to ignore objects that are unlikely to be words, and got the results in the image above.

Answered by rtkaleta

@dhanushka's approach showed the most promise, but I wanted to play around in Python, so I went ahead and translated it for fun:

import cv2
import numpy as np
from cv2 import boundingRect, countNonZero, cvtColor, drawContours, findContours, getStructuringElement, imread, morphologyEx, pyrDown, rectangle, threshold

large = imread(image_path)
# downsample and use it for processing
rgb = pyrDown(large)
# apply grayscale
small = cvtColor(rgb, cv2.COLOR_BGR2GRAY)
# morphological gradient
morph_kernel = getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
grad = morphologyEx(small, cv2.MORPH_GRADIENT, morph_kernel)
# binarize
_, bw = threshold(src=grad, thresh=0, maxval=255, type=cv2.THRESH_BINARY+cv2.THRESH_OTSU)
morph_kernel = getStructuringElement(cv2.MORPH_RECT, (9, 1))
# connect horizontally oriented regions
connected = morphologyEx(bw, cv2.MORPH_CLOSE, morph_kernel)
mask = np.zeros(bw.shape, np.uint8)
# find contours
im2, contours, hierarchy = findContours(connected, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
# filter contours
for idx in range(0, len(hierarchy[0])):
    rect = x, y, rect_width, rect_height = boundingRect(contours[idx])
    # reset the region, then fill the contour
    mask[y:y+rect_height, x:x+rect_width] = 0
    mask = drawContours(mask, contours, idx, (255, 255, 255), cv2.FILLED)
    # ratio of non-zero pixels in the filled region (measured over the bounding rect, as in the original C++)
    r = float(countNonZero(mask[y:y+rect_height, x:x+rect_width])) / (rect_width * rect_height)
    if r > 0.45 and rect_height > 8 and rect_width > 8:
        rgb = rectangle(rgb, (x, y+rect_height), (x+rect_width, y), (0,255,0),3)

Now to display the image:

from PIL import Image
Image.fromarray(rgb).show()

Not the most Pythonic of scripts, but I tried to stay as close to the original C++ code as possible for readers to follow.

It works almost as well as the original. I'll be happy to read suggestions on how it could be improved or fixed to match the original results fully.

[image]

[image]

[image]

Answered by herohuyongtao

You can try this method, developed by Chucai Yi and Yingli Tian.

They also share a piece of software that you can use (it is based on OpenCV 1.0 and should run on the Windows platform), though no source code is available. It will generate all the text bounding boxes (shown as colored shadows) in the image. Applying it to your sample images, you will get the following results:

Note: to make the result more robust, you can further merge adjacent boxes together, as sketched below.
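
One simple way to do that merging (a sketch only; the mergeBoxes helper and the margin value are made up, not part of the software above): grow each box by a small margin and union any two boxes whose grown versions overlap, repeating until nothing changes.

std::vector<cv::Rect> mergeBoxes(std::vector<cv::Rect> boxes, int margin = 10)
{
    bool merged = true;
    while (merged)
    {
        merged = false;
        for (size_t i = 0; i < boxes.size() && !merged; i++)
            for (size_t j = i + 1; j < boxes.size() && !merged; j++)
            {
                // expand both boxes a little and test for overlap
                cv::Rect a = boxes[i] + cv::Size(margin, margin);
                cv::Rect b = boxes[j] + cv::Size(margin, margin);
                if ((a & b).area() > 0)
                {
                    boxes[i] = boxes[i] | boxes[j]; // replace with the union of the two rects
                    boxes.erase(boxes.begin() + j);
                    merged = true;                  // restart the scan
                }
            }
    }
    return boxes;
}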



Update: If your ultimate goal is to recognize the text in the image, you can further check out gttext, a free OCR and ground-truthing tool for color images with text. Source code is also available.

With this, you can get recognized text like:

Answered by Fariz Agayev

Java version of the above code. Thanks @William:

public static List<Rect> detectLetters(Mat img){    
    List<Rect> boundRect=new ArrayList<>();

    Mat img_gray =new Mat(), img_sobel=new Mat(), img_threshold=new Mat(), element=new Mat();
    Imgproc.cvtColor(img, img_gray, Imgproc.COLOR_RGB2GRAY);
    Imgproc.Sobel(img_gray, img_sobel, CvType.CV_8U, 1, 0, 3, 1, 0, Core.BORDER_DEFAULT);
    //threshold(Mat src, Mat dst, double thresh, double maxval, int type)
    Imgproc.threshold(img_sobel, img_threshold, 0, 255, 8); // 8 = Imgproc.THRESH_OTSU
    element=Imgproc.getStructuringElement(Imgproc.MORPH_RECT, new Size(15,5));
    Imgproc.morphologyEx(img_threshold, img_threshold, Imgproc.MORPH_CLOSE, element);
    List<MatOfPoint> contours = new ArrayList<MatOfPoint>();
    Mat hierarchy = new Mat();
    Imgproc.findContours(img_threshold, contours,hierarchy, 0, 1);

    List<MatOfPoint> contours_poly = new ArrayList<MatOfPoint>(contours.size());

     for( int i = 0; i < contours.size(); i++ ){             

         MatOfPoint2f  mMOP2f1=new MatOfPoint2f();
         MatOfPoint2f  mMOP2f2=new MatOfPoint2f();

         contours.get(i).convertTo(mMOP2f1, CvType.CV_32FC2);
         Imgproc.approxPolyDP(mMOP2f1, mMOP2f2, 2, true); 
         mMOP2f2.convertTo(contours.get(i), CvType.CV_32S);


            Rect appRect = Imgproc.boundingRect(contours.get(i));
            if (appRect.width>appRect.height) {
                boundRect.add(appRect);
            }
     }

    return boundRect;
}

And use this code in practice:

        System.loadLibrary(Core.NATIVE_LIBRARY_NAME);
        Mat img1=Imgcodecs.imread("abc.png");
        List<Rect> letterBBoxes1=Utils.detectLetters(img1);

        for(int i=0; i< letterBBoxes1.size(); i++)
            Imgproc.rectangle(img1,letterBBoxes1.get(i).br(), letterBBoxes1.get(i).tl(),new Scalar(0,255,0),3,8,0);         
        Imgcodecs.imwrite("abc1.png", img1);

Answered by Akash Budhia

Python implementation of @dhanushka's solution:

import cv2
import numpy as np

def process_rgb(rgb):
    hasText = False
    gray = cv2.cvtColor(rgb, cv2.COLOR_BGR2GRAY)
    morphKernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3,3))
    grad = cv2.morphologyEx(gray, cv2.MORPH_GRADIENT, morphKernel)
    # binarize
    _, bw = cv2.threshold(grad, 0.0, 255.0, cv2.THRESH_BINARY | cv2.THRESH_OTSU)
    # connect horizontally oriented regions
    morphKernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 1))
    connected = cv2.morphologyEx(bw, cv2.MORPH_CLOSE, morphKernel)
    # find contours
    mask = np.zeros(bw.shape[:2], dtype="uint8")
    _,contours, hierarchy = cv2.findContours(connected, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_SIMPLE)
    # filter contours
    idx = 0
    while idx >= 0:
        x,y,w,h = cv2.boundingRect(contours[idx])
        # fill the contour
        cv2.drawContours(mask, contours, idx, (255, 255, 255), cv2.FILLED)
        # ratio of non-zero pixels in the filled region
        r = cv2.contourArea(contours[idx])/(w*h)
        if(r > 0.45 and h > 5 and w > 5 and w > h):
            cv2.rectangle(rgb, (x,y), (x+w,y+h), (0, 255, 0), 2)
            hasText = True
        idx = hierarchy[0][idx][0]
    return hasText, rgb

Answered by Diomedes Domínguez

This is a C# version of the answer from dhanushka, using OpenCVSharp:

        Mat large = new Mat(INPUT_FILE);
        Mat rgb = new Mat(), small = new Mat(), grad = new Mat(), bw = new Mat(), connected = new Mat();

        // downsample and use it for processing
        Cv2.PyrDown(large, rgb);
        Cv2.CvtColor(rgb, small, ColorConversionCodes.BGR2GRAY);

        // morphological gradient
        var morphKernel = Cv2.GetStructuringElement(MorphShapes.Ellipse, new OpenCvSharp.Size(3, 3));
        Cv2.MorphologyEx(small, grad, MorphTypes.Gradient, morphKernel);

        // binarize
        Cv2.Threshold(grad, bw, 0, 255, ThresholdTypes.Binary | ThresholdTypes.Otsu);

        // connect horizontally oriented regions
        morphKernel = Cv2.GetStructuringElement(MorphShapes.Rect, new OpenCvSharp.Size(9, 1));
        Cv2.MorphologyEx(bw, connected, MorphTypes.Close, morphKernel);

        // find contours
        var mask = new Mat(Mat.Zeros(bw.Size(), MatType.CV_8UC1), Range.All);
        Cv2.FindContours(connected, out OpenCvSharp.Point[][] contours, out HierarchyIndex[] hierarchy, RetrievalModes.CComp, ContourApproximationModes.ApproxSimple, new OpenCvSharp.Point(0, 0));

        // filter contours
        var idx = 0;
        foreach (var hierarchyItem in hierarchy)
        {
            idx = hierarchyItem.Next;
            if (idx < 0)
                break;
            OpenCvSharp.Rect rect = Cv2.BoundingRect(contours[idx]);
            var maskROI = new Mat(mask, rect);
            maskROI.SetTo(new Scalar(0, 0, 0));

            // fill the contour
            Cv2.DrawContours(mask, contours, idx, Scalar.White, -1);

            // ratio of non-zero pixels in the filled region
            double r = (double)Cv2.CountNonZero(maskROI) / (rect.Width * rect.Height);
            if (r > .45 /* assume at least 45% of the area is filled if it contains text */
                 &&
            (rect.Height > 8 && rect.Width > 8) /* constraints on region size */
            /* these two conditions alone are not very robust. better to use something 
            like the number of significant peaks in a horizontal projection as a third condition */
            )
            {
                Cv2.Rectangle(rgb, rect, new Scalar(0, 255, 0), 2);
            }
        }

        rgb.SaveImage(Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "rgb.jpg"));

Answered by Hakan Usakli

This is a VB.NET version of the answer from dhanushka, using EmguCV.

A few functions and structures in EmguCV need slightly different handling than in the C# version with OpenCVSharp.

Imports Emgu.CV
Imports Emgu.CV.Structure
Imports Emgu.CV.CvEnum
Imports Emgu.CV.Util

        Dim input_file As String = "C:\your_input_image.png"
        Dim large As Mat = New Mat(input_file)
        Dim rgb As New Mat
        Dim small As New Mat
        Dim grad As New Mat
        Dim bw As New Mat
        Dim connected As New Mat
        Dim morphanchor As New Point(0, 0)

        '//downsample and use it for processing
        CvInvoke.PyrDown(large, rgb)
        CvInvoke.CvtColor(rgb, small, ColorConversion.Bgr2Gray)

        '//morphological gradient
        Dim morphKernel As Mat = CvInvoke.GetStructuringElement(ElementShape.Ellipse, New Size(3, 3), morphanchor)
        CvInvoke.MorphologyEx(small, grad, MorphOp.Gradient, morphKernel, New Point(0, 0), 1, BorderType.Isolated, New MCvScalar(0))

        '// binarize
        CvInvoke.Threshold(grad, bw, 0, 255, ThresholdType.Binary Or ThresholdType.Otsu)

        '// connect horizontally oriented regions
        morphKernel = CvInvoke.GetStructuringElement(ElementShape.Rectangle, New Size(9, 1), morphanchor)
        CvInvoke.MorphologyEx(bw, connected, MorphOp.Close, morphKernel, morphanchor, 1, BorderType.Isolated, New MCvScalar(0))

        '// find contours
        Dim mask As Mat = Mat.Zeros(bw.Size.Height, bw.Size.Width, DepthType.Cv8U, 1)  '' MatType.CV_8UC1
        Dim contours As New VectorOfVectorOfPoint
        Dim hierarchy As New Mat

        CvInvoke.FindContours(connected, contours, hierarchy, RetrType.Ccomp, ChainApproxMethod.ChainApproxSimple, Nothing)

        '// filter contours
        Dim idx As Integer
        Dim rect As Rectangle
        Dim maskROI As Mat
        Dim r As Double
        For Each hierarchyItem In hierarchy.GetData
            rect = CvInvoke.BoundingRectangle(contours(idx))
            maskROI = New Mat(mask, rect)
            maskROI.SetTo(New MCvScalar(0, 0, 0))

            '// fill the contour
            CvInvoke.DrawContours(mask, contours, idx, New MCvScalar(255), -1)

            '// ratio of non-zero pixels in the filled region
            r = CvInvoke.CountNonZero(maskROI) / (rect.Width * rect.Height)

            '/* assume at least 45% of the area Is filled if it contains text */
            '/* constraints on region size */
            '/* these two conditions alone are Not very robust. better to use something 
            'Like the number of significant peaks in a horizontal projection as a third condition */
            If r > 0.45 AndAlso rect.Height > 8 AndAlso rect.Width > 8 Then
                'draw green rectangle
                CvInvoke.Rectangle(rgb, rect, New MCvScalar(0, 255, 0), 2)
            End If
            idx += 1
        Next
        rgb.Save(IO.Path.Combine(Application.StartupPath, "rgb.jpg"))