OpenCV C++/Obj-C: Detecting a sheet of paper / Square Detection

Question

Asked by dom

I successfully implemented the OpenCV square-detection example in my test application, but now need to filter the output, because it's quite messy - or is my code wrong?

I'm interested in the four corner points of the paper for skew reduction (like that) and further processing…

Input & Output:

Original image:

Code:

double angle( cv::Point pt1, cv::Point pt2, cv::Point pt0 ) {
    double dx1 = pt1.x - pt0.x;
    double dy1 = pt1.y - pt0.y;
    double dx2 = pt2.x - pt0.x;
    double dy2 = pt2.y - pt0.y;
    return (dx1*dx2 + dy1*dy2)/sqrt((dx1*dx1 + dy1*dy1)*(dx2*dx2 + dy2*dy2) + 1e-10);
}

- (std::vector<std::vector<cv::Point> >)findSquaresInImage:(cv::Mat)_image
{
    std::vector<std::vector<cv::Point> > squares;
    cv::Mat pyr, timg, gray0(_image.size(), CV_8U), gray;
    int thresh = 50, N = 11;
    cv::pyrDown(_image, pyr, cv::Size(_image.cols/2, _image.rows/2));
    cv::pyrUp(pyr, timg, _image.size());
    std::vector<std::vector<cv::Point> > contours;
    for( int c = 0; c < 3; c++ ) {
        int ch[] = {c, 0};
        mixChannels(&timg, 1, &gray0, 1, ch, 1);
        for( int l = 0; l < N; l++ ) {
            if( l == 0 ) {
                cv::Canny(gray0, gray, 0, thresh, 5);
                cv::dilate(gray, gray, cv::Mat(), cv::Point(-1,-1));
            }
            else {
                gray = gray0 >= (l+1)*255/N;
            }
            cv::findContours(gray, contours, CV_RETR_LIST, CV_CHAIN_APPROX_SIMPLE);
            std::vector<cv::Point> approx;
            for( size_t i = 0; i < contours.size(); i++ )
            {
                cv::approxPolyDP(cv::Mat(contours[i]), approx, arcLength(cv::Mat(contours[i]), true)*0.02, true);
                if( approx.size() == 4 && fabs(contourArea(cv::Mat(approx))) > 1000 && cv::isContourConvex(cv::Mat(approx))) {
                    double maxCosine = 0;

                    for( int j = 2; j < 5; j++ )
                    {
                        double cosine = fabs(angle(approx[j%4], approx[j-2], approx[j-1]));
                        maxCosine = MAX(maxCosine, cosine);
                    }

                    if( maxCosine < 0.3 ) {
                        squares.push_back(approx);
                    }
                }
            }
        }
    }
    return squares;
}

EDIT 17/08/2012:

To draw the detected squares on the image use this code:

cv::Mat debugSquares( std::vector<std::vector<cv::Point> > squares, cv::Mat image )
{
    for ( int i = 0; i< squares.size(); i++ ) {
        // draw contour
        cv::drawContours(image, squares, i, cv::Scalar(255,0,0), 1, 8, std::vector<cv::Vec4i>(), 0, cv::Point());

        // draw bounding rect
        cv::Rect rect = boundingRect(cv::Mat(squares[i]));
        cv::rectangle(image, rect.tl(), rect.br(), cv::Scalar(0,255,0), 2, 8, 0);

        // draw rotated rect
        cv::RotatedRect minRect = minAreaRect(cv::Mat(squares[i]));
        cv::Point2f rect_points[4];
        minRect.points( rect_points );
        for ( int j = 0; j < 4; j++ ) {
            cv::line( image, rect_points[j], rect_points[(j+1)%4], cv::Scalar(0,0,255), 1, 8 ); // red (OpenCV uses BGR order)
        }
    }

    return image;
}

Answer 1

Accepted answer by karlphillip

This is a recurring subject on Stack Overflow, and since I was unable to find a relevant implementation I decided to accept the challenge.

I made some modifications to the squares demo present in OpenCV and the resulting C++ code below is able to detect a sheet of paper in the image:

void find_squares(Mat& image, vector<vector<Point> >& squares)
{
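    // Blur to suppress noise before edge detection.
    // Use a separate destination: Mat blurred(image) would share the
    // input's pixel buffer, so medianBlur would overwrite the caller's image.
    Mat blurred;
    medianBlur(image, blurred, 9);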

    Mat gray0(blurred.size(), CV_8U), gray;
    vector<vector<Point> > contours;

    // find squares in every color plane of the image
    for (int c = 0; c < 3; c++)
    {
        int ch[] = {c, 0};
        mixChannels(&blurred, 1, &gray0, 1, ch, 1);

        // try several threshold levels
        const int threshold_level = 2;
        for (int l = 0; l < threshold_level; l++)
        {
            // Use Canny instead of zero threshold level!
            // Canny helps to catch squares with gradient shading
            if (l == 0)
            {
                Canny(gray0, gray, 10, 20, 3);

                // Dilate helps to remove potential holes between edge segments
                dilate(gray, gray, Mat(), Point(-1,-1));
            }
            else
            {
                gray = gray0 >= (l+1) * 255 / threshold_level;
            }

            // Find contours and store them in a list
            findContours(gray, contours, CV_RETR_LIST, CV_CHAIN_APPROX_SIMPLE);

            // Test contours
            vector<Point> approx;
            for (size_t i = 0; i < contours.size(); i++)
            {
                // approximate contour with accuracy proportional
                // to the contour perimeter
                approxPolyDP(Mat(contours[i]), approx, arcLength(Mat(contours[i]), true)*0.02, true);

                // Note: absolute value of an area is used because
                // area may be positive or negative - in accordance with the
                // contour orientation
                if (approx.size() == 4 &&
                    fabs(contourArea(Mat(approx))) > 1000 &&
                    isContourConvex(Mat(approx)))
                {
                    double maxCosine = 0;

                    // angle() is the cosine helper defined in the question
                    for (int j = 2; j < 5; j++)
                    {
                        double cosine = fabs(angle(approx[j%4], approx[j-2], approx[j-1]));
                        maxCosine = MAX(maxCosine, cosine);
                    }

                    // keep only quadrangles whose corners are all close to 90 degrees
                    if (maxCosine < 0.3)
                        squares.push_back(approx);
                }
            }
        }
    }
}

After this procedure is executed, the sheet of paper will be the largest square in vector<vector<Point> >:

opencv paper sheet detection

I'm letting you write the function to find the largest square. ;)
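
One possible way to write that function, sketched here in plain Python rather than C++ so it runs without OpenCV. The names `polygon_area` and `largest_square` are illustrative and not part of the answer; with OpenCV available, `cv2.contourArea` (or `cv::contourArea` in C++) could serve as the sort key instead of the hand-written shoelace formula.

```python
def polygon_area(points):
    """Absolute area of a simple polygon via the shoelace formula."""
    n = len(points)
    twice_area = 0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def largest_square(squares):
    """Return the candidate with the largest area, or None if the list is empty."""
    return max(squares, key=polygon_area, default=None)


# Hypothetical detection output: three 4-point candidates.
squares = [
    [(0, 0), (10, 0), (10, 10), (0, 10)],
    [(5, 5), (105, 5), (105, 85), (5, 85)],
    [(20, 20), (50, 20), (50, 40), (20, 40)],
]
paper = largest_square(squares)
```

Because the detector pushes the same sheet several times (once per color plane and threshold level), picking the maximum area is also a cheap way to collapse those near-duplicates into a single result.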

Answer 2

Answered by mmgp

Unless there is some other requirement not specified, I would simply convert your color image to grayscale and work with that alone (there is no need to work on the 3 channels separately; the contrast is already high enough). Also, unless there is some specific problem with resizing, I would work with a downscaled version of your images, since they are relatively large and the extra size adds nothing to the problem being solved. Finally, your problem can be solved with a median filter, some basic morphological tools, and statistics (mostly for the Otsu thresholding, which is already implemented for you).

Here is what I obtain with your sample image and some other image with a sheet of paper I found around:

[Image: results for the sample image and for the second image]

The median filter is used to remove minor details from the (now grayscale) image. It will possibly remove thin lines inside the whitish paper, which is good because you then end up with tiny connected components that are easy to discard. After the median filter, apply a morphological gradient (simply dilation minus erosion) and binarize the result with Otsu's method. The morphological gradient is a good way to keep strong edges, and it deserves to be used more often. Then, since this gradient widens the contours, apply a morphological thinning. Now you can discard the small components.
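
To make the "dilation minus erosion" step concrete, here is a minimal plain-Python sketch on a tiny grayscale grid. The helper names and the clipped-border handling are assumptions made for illustration; in OpenCV the same operation is available as `cv2.morphologyEx` with `cv2.MORPH_GRADIENT`.

```python
def _window(img, r, c, radius):
    """Pixel values in the square window around (r, c), clipped at the border."""
    rows, cols = len(img), len(img[0])
    return [
        img[rr][cc]
        for rr in range(max(0, r - radius), min(rows, r + radius + 1))
        for cc in range(max(0, c - radius), min(cols, c + radius + 1))
    ]


def _filter(img, radius, pick):
    """Apply pick (max or min) to the window around every pixel."""
    return [
        [pick(_window(img, r, c, radius)) for c in range(len(img[0]))]
        for r in range(len(img))
    ]


def morphological_gradient(img, radius=1):
    """Dilation (local max) minus erosion (local min), pixel by pixel."""
    dilated = _filter(img, radius, max)
    eroded = _filter(img, radius, min)
    return [
        [d - e for d, e in zip(d_row, e_row)]
        for d_row, e_row in zip(dilated, eroded)
    ]


# Dark background (10) on the left, bright paper (200) on the right.
image = [[10, 10, 10, 200, 200, 200] for _ in range(6)]
gradient = morphological_gradient(image)
```

Every row of `gradient` comes out as `[0, 0, 190, 190, 0, 0]`: flat regions vanish and the single step edge becomes a two-pixel-wide band, which is why a thinning pass follows.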

At this point, here is what we have with the right image above (before drawing the blue polygon), the left one is not shown because the only remaining component is the one describing the paper:

[Image: components remaining at this stage]

Given the examples, the only issue left is distinguishing components that look like rectangles from those that do not. This comes down to the ratio between the area of the convex hull containing the shape and the area of its bounding box; a ratio threshold of 0.7 works fine for these examples. You might also need to discard components that lie inside the paper, although this method did not require it for these examples (nevertheless, that step should be very easy, especially because it can be done directly through OpenCV).
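
The ratio test can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the answer's implementation: the bounding box is taken to be axis-aligned, the hull is computed with Andrew's monotone chain, and the function names are made up for the example.

```python
def _cross(o, a, b):
    """Z component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull vertices in counter-clockwise order (monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(poly):
    """Absolute polygon area via the shoelace formula."""
    total = 0
    for i in range(len(poly)):
        x0, y0 = poly[i]
        x1, y1 = poly[(i + 1) % len(poly)]
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0


def rectangle_ratio(points):
    """Convex hull area divided by axis-aligned bounding box area."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    box_area = (max(xs) - min(xs)) * (max(ys) - min(ys))
    if box_area == 0:
        return 0.0
    return polygon_area(convex_hull(points)) / box_area


def looks_like_rectangle(points, min_ratio=0.7):
    return rectangle_ratio(points) > min_ratio


sheet = [(0, 0), (40, 0), (40, 30), (0, 30), (20, 15)]  # rectangle plus an inner point
wedge = [(0, 0), (40, 0), (0, 30)]                      # triangle
```

Here `sheet` yields a ratio of 1.0 and `wedge` a ratio of 0.5, so only the first passes the 0.7 threshold. Note that with an axis-aligned box a strongly rotated sheet also scores low, so the threshold assumes the paper is roughly aligned with the image axes.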

For reference, here is sample code in Mathematica:

f = Import["http://thwartedglamour.files.wordpress.com/2010/06/my-coffee-table-1-sa.jpg"]
f = ImageResize[f, ImageDimensions[f][[1]]/4]
g = MedianFilter[ColorConvert[f, "Grayscale"], 2]
h = DeleteSmallComponents[Thinning[
     Binarize[ImageSubtract[Dilation[g, 1], Erosion[g, 1]]]]]
convexvert = ComponentMeasurements[SelectComponents[
     h, {"ConvexArea", "BoundingBoxArea"}, #1 / #2 > 0.7 &], 
     "ConvexVertices"][[All, 2]]
(* To visualize the blue polygons above: *)
Show[f, Graphics[{EdgeForm[{Blue, Thick}], RGBColor[0, 0, 1, 0.5], 
     Polygon @@ convexvert}]]

If you meet more varied situations where the paper's rectangle is not so well defined, or where the approach confuses it with other shapes (this can happen for various reasons, but a common cause is bad image acquisition), then try combining the pre-processing steps with the work described in the paper "Rectangle Detection based on a Windowed Hough Transform".

Answer 3

Answered by u3547485

Well, I'm late.

In your image, the paper is white while the background is colored. So it is better to detect the paper in the Saturation channel of the HSV color space. Refer to the wiki article HSL_and_HSV first. Then I'll reuse most of the idea from my answer to "Detect Colored Segment in an image".

Main steps:

Read the image into BGR
Convert the image from BGR to HSV space
Threshold the S channel
Then find the largest external contour (or use Canny or HoughLines if you like; I chose findContours), and approximate it to get the corners.
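
Why the S channel separates paper from background can be checked on single pixels with the standard library alone. The sketch below uses `colorsys` and rescales saturation to 0..255 to resemble an 8-bit S channel; the pixel values and the helper names are made-up examples, and the cutoff of 50 mirrors the fixed threshold used in the code further down.

```python
import colorsys


def saturation_255(b, g, r):
    """Saturation of a BGR pixel, rescaled to 0..255."""
    _, s, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return round(s * 255)


def is_paper_pixel(bgr, max_saturation=50):
    """True when the pixel is unsaturated enough to count as white paper."""
    return saturation_255(*bgr) <= max_saturation


paper_pixel = (250, 248, 245)  # near-white, in BGR order
table_pixel = (40, 90, 160)    # brownish background, in BGR order
```

The near-white pixel has a saturation of about 5 and the brownish one about 191, so an inverted binary threshold at 50 keeps the paper and drops the background regardless of how bright either one is.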

This is my result:

The Python code (Python 3.5 + OpenCV 3.3):

#!/usr/bin/python3
# 2017.12.20 10:47:28 CST
# 2017.12.20 11:29:30 CST

import cv2
import numpy as np

##(1) read into  bgr-space
img = cv2.imread("test2.jpg")

##(2) convert to hsv-space, then split the channels
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
h,s,v = cv2.split(hsv)

##(3) threshold the S channel using adaptive method(`THRESH_OTSU`) or fixed thresh
th, threshed = cv2.threshold(s, 50, 255, cv2.THRESH_BINARY_INV)

##(4) find all the external contours on the threshed S
#_, cnts, _ = cv2.findContours(threshed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cv2.findContours(threshed, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2]

canvas  = img.copy()
#cv2.drawContours(canvas, cnts, -1, (0,255,0), 1)

## sort and choose the largest contour
cnts = sorted(cnts, key = cv2.contourArea)
cnt = cnts[-1]

## approximate the contour to get the corner points
arclen = cv2.arcLength(cnt, True)
approx = cv2.approxPolyDP(cnt, 0.02* arclen, True)
cv2.drawContours(canvas, [cnt], -1, (255,0,0), 1, cv2.LINE_AA)
cv2.drawContours(canvas, [approx], -1, (0, 0, 255), 1, cv2.LINE_AA)

## Ok, you can see the result as tag(6)
cv2.imwrite("detected.png", canvas)

Answer 4

Answered by Tim

What you need is a quadrangle instead of a rotated rectangle. RotatedRect will give you incorrect results. You will also need a perspective projection.

Basically, what must be done is:

Loop through all polygon segments and connect those that are almost equal.
Sort them so you have the 4 largest line segments.
Intersect those lines and you have the 4 most likely corner points.
Transform the matrix using the perspective gathered from the corner points and the aspect ratio of the known object.
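
The intersection step above can be sketched in plain Python. This is not the `Quadrangle` class from the answer, only an illustration of intersecting the four longest edge segments, which are assumed here to be given in order around the sheet; `line_intersection` and the sample coordinates are hypothetical.

```python
def line_intersection(p1, p2, p3, p4, eps=1e-9):
    """Intersection of the infinite lines p1-p2 and p3-p4, or None if parallel."""
    x1, y1 = p1
    x2, y2 = p2
    x3, y3 = p3
    x4, y4 = p4
    denom = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    if abs(denom) < eps:
        return None
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    x = (a * (x3 - x4) - (x1 - x2) * b) / denom
    y = (a * (y3 - y4) - (y1 - y2) * b) / denom
    return (x, y)


# Four edge segments that stop short of the corners (top, right, bottom, left).
sides = [
    ((1, 0), (9, 0)),
    ((10, 2), (10, 8)),
    ((9, 10), (1, 10)),
    ((0, 8), (0, 2)),
]

# Intersect each side with the next one to recover the corner between them.
corners = [
    line_intersection(*sides[i], *sides[(i + 1) % 4])
    for i in range(4)
]
```

Extending the segments to full lines is what makes this robust to rounded or occluded corners: the corner is recovered even when the contour never actually reaches it.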

I implemented a class Quadrangle which takes care of the contour-to-quadrangle conversion and also transforms it into the right perspective.

See a working implementation here: Java OpenCV deskewing a contour

Answer 5

Answered by nathancy

Once you have detected the bounding box of the document, you can perform a four-point perspective transform to obtain a top-down bird's-eye view of the image. This fixes the skew and isolates only the desired object.

Input image:

Detected text object

Top-down view of text document

Code

from imutils.perspective import four_point_transform
import cv2
import numpy

# Load image, grayscale, Gaussian blur, Otsu's threshold
image = cv2.imread("1.png")
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur = cv2.GaussianBlur(gray, (7,7), 0)
thresh = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]

# Find contours and sort for largest contour
cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL,cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]
cnts = sorted(cnts, key=cv2.contourArea, reverse=True)
displayCnt = None

for c in cnts:
    # Perform contour approximation
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.02 * peri, True)
    if len(approx) == 4:
        displayCnt = approx
        break

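# Obtain bird's-eye view of image (only possible if a 4-point contour was found)
if displayCnt is None:
    raise SystemExit("No 4-point contour found")
warped = four_point_transform(image, displayCnt.reshape(4, 2))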

cv2.imshow("thresh", thresh)
cv2.imshow("warped", warped)
cv2.imshow("image", image)
cv2.waitKey()
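
For readers who want to see what a four-point transform needs before any warping happens, here is a plain-Python sketch of the bookkeeping: ordering the corners consistently and choosing the output size. It shows one common approach and is not claimed to match the internals of `imutils`; the function names and sample points are illustrative. With OpenCV, the ordered corners and size would then typically go to `cv2.getPerspectiveTransform` and `cv2.warpPerspective`.

```python
import math


def order_corners(pts):
    """Order 4 points as top-left, top-right, bottom-right, bottom-left."""
    top_left = min(pts, key=lambda p: p[0] + p[1])
    bottom_right = max(pts, key=lambda p: p[0] + p[1])
    top_right = min(pts, key=lambda p: p[1] - p[0])
    bottom_left = max(pts, key=lambda p: p[1] - p[0])
    return [top_left, top_right, bottom_right, bottom_left]


def _length(a, b):
    return math.hypot(b[0] - a[0], b[1] - a[1])


def output_size(ordered):
    """Width and height of the warped image: the longer of each pair of opposite sides."""
    tl, tr, br, bl = ordered
    width = max(_length(tl, tr), _length(bl, br))
    height = max(_length(tl, bl), _length(tr, br))
    return int(round(width)), int(round(height))


# Corners as a contour approximation might return them, in arbitrary order.
detected = [(210, 40), (30, 50), (20, 300), (230, 310)]
ordered = order_corners(detected)
size = output_size(ordered)
```

The sum and difference trick assumes the sheet is not rotated by close to 45 degrees; for such cases a sort around the centroid would be a safer ordering.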

Answer 6

Answered by Anubhav Rohatgi

Detecting a sheet of paper is kind of old school. If you want to tackle skew detection, it is better to aim straight for text line detection. With this you get the left, right, top and bottom extrema. Discard any graphics in the image if you don't want them, and then do some statistics on the text line segments to find the most frequent angle range, or rather angle. This is how you narrow down to a good skew angle. After this, you feed these parameters (the skew angle and the extrema) into the deskew step and crop the image to what is required.
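
The "statistics on the text line segments" step could look like the following plain-Python sketch. It assumes the text lines have already been detected and are given as hypothetical `((x1, y1), (x2, y2))` segments; the one-degree bin width and the function names are arbitrary choices for illustration.

```python
import math
from collections import Counter


def segment_angle(segment):
    """Angle of a segment in degrees, folded into [-90, 90)."""
    (x1, y1), (x2, y2) = segment
    angle = math.degrees(math.atan2(y2 - y1, x2 - x1))
    if angle >= 90:
        angle -= 180
    elif angle < -90:
        angle += 180
    return angle


def dominant_angle(segments, bin_width=1.0):
    """Mean angle of the most populated histogram bin."""
    if not segments:
        raise ValueError("need at least one segment")
    angles = [segment_angle(s) for s in segments]
    bins = Counter(round(a / bin_width) for a in angles)
    best_bin, _ = bins.most_common(1)[0]
    members = [a for a in angles if round(a / bin_width) == best_bin]
    return sum(members) / len(members)


segments = [
    ((0, 0), (100, 9)),     # text line, about 5 degrees
    ((0, 20), (100, 29)),   # text line, about 5 degrees
    ((0, 40), (200, 58)),   # text line, about 5 degrees
    ((10, 60), (110, 68)),  # text line, slightly flatter
    ((0, 0), (50, 42)),     # stray graphic edge, about 40 degrees
]
skew = dominant_angle(segments)
```

Taking the mode of the histogram instead of the plain mean keeps outliers such as the 40 degree graphic edge from pulling the estimate away from the text direction.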

As for the current image requirement, it is better if you try CV_RETR_EXTERNAL instead of CV_RETR_LIST.

Another method of detecting edges is to train a random forest classifier on the paper edges and then use the classifier to get the edge map. This is a far more robust method, but it requires training and time.

Random forests will work in low-contrast scenarios, for example white paper on a roughly white background.
