C++: How does the Viola-Jones face detection method work?
Notice: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must do so under the same license, cite the original URL and author information, and attribute it to the original authors (not me): Stack Overflow
Original source: http://stackoverflow.com/questions/5808434/
How does the Viola-Jones face detection method work?
Asked by BlackShadow
Please explain to me, in a few words, how the Viola-Jones face detection method works.
Answer by cMinor
The Viola-Jones detector is a strong binary classifier built from several weak detectors.
Each weak detector is an extremely simple binary classifier.
During the learning stage, a cascade of weak detectors is trained with AdaBoost so as to reach the desired hit rate / miss rate (or precision / recall). To detect objects, the original image is partitioned into several rectangular patches, each of which is submitted to the cascade.
If a rectangular image patch passes through all of the cascade stages, it is classified as "positive". The process is repeated at different scales.
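As a rough illustration of the scanning step described above, here is a minimal C++ sketch that enumerates square windows at several scales and hands each one to a classifier. The names `Window`, `scanImage`, `baseSize`, `scaleFactor`, `step`, and the `isObject` callback are assumptions made for this example, not part of the original answer or of any library; `isObject` simply stands in for the trained cascade. The sketch grows the window rather than shrinking the image, which is one common way to repeat the search at different scales.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// A square candidate region (hypothetical type for this sketch).
struct Window {
    int x, y, size;  // top-left corner and side length, in pixels
};

// Enumerates square windows over an image at increasing scales and keeps
// those accepted by `isObject` (a stand-in for the trained cascade).
std::vector<Window> scanImage(int imageWidth, int imageHeight,
                              int baseSize, double scaleFactor, int step,
                              const std::function<bool(const Window&)>& isObject) {
    assert(baseSize > 0 && step > 0 && scaleFactor > 1.0);
    std::vector<Window> hits;
    // Outer loop: repeat the whole scan with a larger window each time.
    for (double s = baseSize; s <= imageWidth && s <= imageHeight; s *= scaleFactor) {
        const int size = static_cast<int>(s);
        // Inner loops: slide the window over every position that fits.
        for (int y = 0; y + size <= imageHeight; y += step) {
            for (int x = 0; x + size <= imageWidth; x += step) {
                const Window w{x, y, size};
                if (isObject(w)) hits.push_back(w);
            }
        }
    }
    return hits;
}
```

In a real detector the callback would evaluate the cascade on the pixels under the window, and overlapping hits would usually be merged afterwards.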
At a low level, the basic component of an object detector is simply something that says whether a given sub-region of the original image contains an instance of the object of interest. That is what a binary classifier does.
The basic weak classifier is based on a very simple visual feature (features of this kind are often referred to as "Haar-like features").
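To make the idea of a weak classifier concrete, here is a minimal sketch of a decision stump on a single feature value, together with the weighted vote that combines several stumps into a strong classifier. It follows the formulation commonly given for AdaBoost-based detectors (a threshold, a polarity, and a vote weight alpha per stump). The struct and function names are made up for illustration, and the parameter values would come from training, which is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Weak classifier (decision stump) on a single feature value.
// All field names are illustrative, not taken from any library.
struct WeakClassifier {
    double threshold;  // learned threshold on the feature value
    int polarity;      // +1 or -1: direction of the inequality
    double alpha;      // vote weight assigned by boosting

    // Returns 1 (object) or 0 (non-object) for one feature value.
    int classify(double featureValue) const {
        assert(polarity == 1 || polarity == -1);
        return (polarity * featureValue < polarity * threshold) ? 1 : 0;
    }
};

// Strong classifier: weighted vote over the weak classifiers.
// featureValues[i] is the value of the feature used by weak[i].
bool strongClassify(const std::vector<WeakClassifier>& weak,
                    const std::vector<double>& featureValues) {
    assert(weak.size() == featureValues.size());
    double vote = 0.0;   // sum of alpha over stumps that answered 1
    double total = 0.0;  // sum of all alpha values
    for (std::size_t i = 0; i < weak.size(); ++i) {
        vote += weak[i].alpha * weak[i].classify(featureValues[i]);
        total += weak[i].alpha;
    }
    // Accept when the weighted vote reaches half of the total weight.
    return vote >= 0.5 * total;
}
```

Each stump on its own is barely better than guessing; the weighted combination is what makes the classifier strong.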
Haar-like features are a class of local features calculated by subtracting the sum of the pixel values in one subregion of the feature from the sum of the pixel values in the remaining region of the feature.
These features are simple to define and, with the help of an integral image, very efficient to compute.
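Here is a minimal sketch of how an integral image makes such a feature cheap to evaluate. It assumes an 8-bit grayscale image stored row by row in a `std::vector`; `IntegralImage`, `rectSum`, and `twoRectFeature` are names invented for this example. The table is padded with one extra row and column of zeros so that no special cases are needed at the image border.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Summed-area table with one extra row and column of zeros, so that
// entry (x, y) holds the sum of all pixels in [0, x) x [0, y).
struct IntegralImage {
    int width = 0;   // width of the source image in pixels
    int height = 0;  // height of the source image in pixels
    std::vector<std::int64_t> table;  // (width + 1) * (height + 1) entries

    IntegralImage(const std::vector<std::uint8_t>& pixels, int w, int h)
        : width(w), height(h),
          table(static_cast<std::size_t>(w + 1) * (h + 1), 0) {
        assert(w > 0 && h > 0);
        assert(pixels.size() == static_cast<std::size_t>(w) * h);
        for (int y = 1; y <= h; ++y) {
            std::int64_t rowSum = 0;  // running sum of the current row
            for (int x = 1; x <= w; ++x) {
                rowSum += pixels[static_cast<std::size_t>(y - 1) * w + (x - 1)];
                table[index(x, y)] = table[index(x, y - 1)] + rowSum;
            }
        }
    }

    std::size_t index(int x, int y) const {
        return static_cast<std::size_t>(y) * (width + 1) + x;
    }

    // Sum of the w-by-h rectangle whose top-left pixel is (x, y):
    // four table lookups, independent of the rectangle size.
    std::int64_t rectSum(int x, int y, int w, int h) const {
        assert(x >= 0 && y >= 0 && w >= 0 && h >= 0);
        assert(x + w <= width && y + h <= height);
        return table[index(x + w, y + h)] - table[index(x + w, y)]
             - table[index(x, y + h)] + table[index(x, y)];
    }
};

// Two-rectangle (edge) feature: sum of the right half minus sum of the left half.
std::int64_t twoRectFeature(const IntegralImage& ii, int x, int y, int w, int h) {
    assert(w % 2 == 0);
    const int half = w / 2;
    return ii.rectSum(x + half, y, half, h) - ii.rectSum(x, y, half, h);
}
```

The table is built once per image; after that, every rectangle sum costs the same four lookups regardless of the feature's size or position.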
Lienhart introduced an extended set of twisted Haar-like features (see the image in the original answer).
These are the standard Haar-like features, twisted (rotated) by 45 degrees. Lienhart did not originally make use of the twisted checkerboard Haar-like feature (x2y2), since the diagonal elements it represents can be represented simply using twisted features; however, it is clear that a twisted version of this feature can also be implemented and used.
These twisted Haar-like features can also be calculated quickly and efficiently using an integral image that has been twisted by 45 degrees. The only implementation issue is that the twisted features must be rounded to integer values so that they are aligned with pixel boundaries. This process is similar to the rounding used when scaling a Haar-like feature for larger or smaller windows. One difference, however, is that for a 45-degree twisted feature, the integer number of pixels used for the height and width of the feature means that the diagonal coordinates of the pixel will always be on the same diagonal set of pixels.
This means that the number of different sizes of 45-degree twisted features available is significantly reduced compared to the standard vertically and horizontally aligned features.
So we have something like: [image in the original answer]
As for the formula, the fast computation of Haar-like features using integral images looks like: [image in the original answer]
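The formula image is not reproduced on this page. As a stand-in, here are the standard integral-image definitions as they are usually written; the notation (i for the image, ii for the integral image, s for the cumulative row sum) is an assumption and may differ from the original figure.

```latex
% i(x, y): pixel intensity; ii(x, y): integral image
ii(x, y) = \sum_{x' \le x,\; y' \le y} i(x', y')

% One-pass computation, with s(x, -1) = 0 and ii(-1, y) = 0
s(x, y)  = s(x, y - 1) + i(x, y)
ii(x, y) = ii(x - 1, y) + s(x, y)

% Sum over the rectangle with corners (x_0, y_0) and (x_1, y_1), inclusive
\sum_{x_0 \le x \le x_1,\; y_0 \le y \le y_1} i(x, y)
  = ii(x_1, y_1) - ii(x_0 - 1, y_1) - ii(x_1, y_0 - 1) + ii(x_0 - 1, y_0 - 1)
```

A Haar-like feature is then a signed combination of a few such rectangle sums, so its cost does not depend on the size of the rectangles.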
Finally, here is a C++ implementation by Ivan Kusalic which uses ViolaJones.h.
To see the complete C++ project, go here.
Answer by Jayhello
The Viola-Jones detector is a strong binary classifier built from several weak detectors. Each weak detector is an extremely simple binary classifier.
The detection consists of the following parts:
Haar Filter: extracts features from the image for classification (features act to encode ad-hoc domain knowledge)
Integral Image: allows for very fast feature evaluation
Cascade Classifier: consists of multiple stages of filters that decide whether an image (a sliding window of an image) is a face
Below is an overview of how to detect a face in an image.
A detection window shifts around the whole image and extracts features (with Haar filters, computed via the integral image), then sends the extracted features to the cascade classifier to decide whether the window contains a face. The sliding window shifts pixel by pixel. Each time the window shifts, the image region within the window goes through the cascade classifier.
Haar Filter: you can think of the filters as extracting features such as the eyes, the bridge of the nose, and so on.
Integral Image: allows for very fast feature evaluation.
Cascade Classifier:
A cascade classifier consists of multiple stages of filters (see the figure in the original answer). Each time the sliding window shifts, the new region within the sliding window goes through the cascade classifier stage by stage. If the input region fails to pass the threshold of a stage, the cascade classifier immediately rejects the region as not being a face. If a region passes all stages successfully, it is classified as a face candidate, which may be refined by further processing.
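The stage-by-stage rejection can be sketched as follows. This is only an illustration under stated assumptions: `Stage`, its `score` callback, and `passesCascade` are hypothetical names, and a window is represented as a plain vector of numbers rather than real pixel data. The optional `stagesEvaluated` output is there only to make the early exit visible.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// One cascade stage (hypothetical type): a scoring function plus the
// threshold the score must reach for the window to move on.
struct Stage {
    std::function<double(const std::vector<double>&)> score;
    double threshold;
};

// Returns true only if the window passes every stage. Evaluation stops at
// the first failed stage, so most non-face windows cost very little.
bool passesCascade(const std::vector<Stage>& stages,
                   const std::vector<double>& window,
                   std::size_t* stagesEvaluated = nullptr) {
    std::size_t evaluated = 0;
    bool accepted = true;
    for (const Stage& stage : stages) {
        assert(static_cast<bool>(stage.score));  // every stage needs a scorer
        ++evaluated;
        if (stage.score(window) < stage.threshold) {
            accepted = false;  // early rejection: later stages are skipped
            break;
        }
    }
    if (stagesEvaluated != nullptr) *stagesEvaluated = evaluated;
    return accepted;
}
```

In practice the early stages are kept small and cheap, so that the bulk of the windows are discarded before the more expensive stages run.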
For more details:
First, I suggest reading the source paper, Rapid Object Detection using a Boosted Cascade of Simple Features, to get an overview of the method.
If it is still not clear, see Viola-Jones Face Detection, Implementing the Viola-Jones Face Detection Algorithm, or Study of Viola-Jones Real Time Face Detector for more details.
Here is Python code: Python implementation of the face detection algorithm by Paul Viola and Michael J. Jones.
MATLAB code here.