Terminologies Used in Face Detection with the Haar Cascade Classifier (OpenCV)


An Overview of the Haar Cascade Classifier

The Haar cascade classifier is an object detection algorithm available in OpenCV. It distinguishes images that contain an object (e.g. a face) from images that do not (e.g. non-faces).

Initially, several hundred images containing faces and several hundred images without faces are given to the classifier. The classifier is then trained with a machine learning method to recognize human faces (the Viola-Jones framework on which Haar cascades are based uses AdaBoost boosting rather than a neural network). The Haar features selected during training are stored in an XML file.

What Are Haar Features and How Are They Extracted?

For face detection, the classifier looks for the most relevant features of a face, such as the eyes, nose, lips, forehead, and eyebrows. Although people look different, these features sit in similar positions on every face.

Haar features are patterns of adjacent white and black pixel regions on the face. A greyscale image of a face does not have completely white and completely black pixels, of course, but here we consider an ideal case in which "white" pixels are the lighter pixels and "black" pixels are the darker pixels.

For example, the eyebrow region contains darker pixels and the forehead above it contains lighter pixels. The same goes for the eyes, nose, and lips.

How the Comparison Is Done

When a new input image is given to the classifier, it loads the Haar features from the XML file and evaluates them on the input image. If a region passes all the stages of Haar feature comparison, it is a face; otherwise it is not.

[Figure 1: Pixel intensities for an ideal case and for a real face image]

For calculation purposes, we can treat the pixel values of a face somewhat like those in the figure above, where the ideal case has the value 0 for a white pixel and 1 for a black pixel. Each feature is obtained by subtracting the average of the lighter pixel values from the average of the darker pixel values.

For the ideal case, the difference between the averages of the black and white pixel values is 1. Therefore, for a real image, the closer this difference is to 1, the more likely it is that we have found a Haar feature.
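The subtraction described above can be sketched in a few lines. This is a minimal illustration only: the function names and the "real" pixel intensities are made up for the example, and the 0 = white, 1 = black convention follows the text.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of pixel values."""
    return sum(values) / len(values)


def haar_feature_value(dark_region, light_region):
    """Average of the darker pixels minus the average of the lighter pixels."""
    return mean(dark_region) - mean(light_region)


# Convention from the text: 0 = white pixel, 1 = black pixel.
ideal_dark = [1.0, 1.0, 1.0, 1.0]
ideal_light = [0.0, 0.0, 0.0, 0.0]

# Hypothetical intensities for a real image (illustrative values only).
real_dark = [0.9, 0.8, 0.7, 0.8]
real_light = [0.1, 0.3, 0.2, 0.2]

ideal_value = haar_feature_value(ideal_dark, ideal_light)  # exactly 1.0
real_value = haar_feature_value(real_dark, real_light)     # about 0.6

print(ideal_value, real_value)
```

The closer `real_value` gets to the ideal value of 1, the more strongly the region resembles the feature.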

Some Terminologies Used in Face Detection

1. Image Pyramid and Scale Factor

[Figure 2]

The classifier model was trained on face images of a fixed size, which can be seen in the XML file, so it can only detect faces of the same size that was used during training. What if the faces in our input image are smaller or bigger than that? For this reason, we want our image at different scales so that the model can detect a face.

Now the question is: by how much should the image be scaled? This is where the image pyramid and the scale factor come in. The scale factor specifies how much we reduce the image size each time we scale.

In figures (1) and (2) above, we can see that the image is downsized by some factor at each layer. This is the image pyramid, which lets us find the face at different scales of an image. The factor by which the image is downsized at each step is the scale factor. So what is the optimum scale factor?

[Figure 3: Scale factor = 1.05. Who is next to Joey and Chandler other than Phoebe and Ross?]

Scale factor = 1.05 means we reduce the image size by about 5% each time we scale. This generates a larger number of layers, since the image shrinks only slightly at each step. With this many layers, the algorithm runs slower. And because it examines the image at so many scales, it is also more likely to flag some non-face areas as faces.

[Figure 4: Scale factor = 1.8. Monica, you're not recognised]

Scale factor = 1.8 is a large value that generates fewer layers. Accordingly, the algorithm runs faster but can miss some faces.

Scale factor = 1.3 is the value we settle on here, giving the output below.

[Figure 5: Scale factor = 1.3. Perfect one]

Now comes the definition. An image pyramid is a multi-scale representation of an image, used to find objects at different scales.
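To make the trade-off between the three scale factors concrete, here is a rough sketch that lists the layer sizes of a pyramid. The function name `pyramid_sizes`, the 640x480 image size, and the 24-pixel minimum window are assumptions chosen for illustration; this is not how OpenCV is implemented internally.

```python
def pyramid_sizes(width, height, scale_factor, min_window=24):
    """Return the (width, height) of each pyramid layer.

    The image shrinks by `scale_factor` at every layer, and the pyramid
    stops once a layer is smaller than the detection window.
    `min_window` is a hypothetical window size chosen for this sketch.
    """
    if scale_factor <= 1.0:
        raise ValueError("scale_factor must be greater than 1")
    sizes = []
    scale = 1.0
    while True:
        w = int(width / scale)
        h = int(height / scale)
        if w < min_window or h < min_window:
            break
        sizes.append((w, h))
        scale *= scale_factor
    return sizes


# Compare the number of layers for the three scale factors in the text.
for factor in (1.05, 1.3, 1.8):
    layers = pyramid_sizes(640, 480, factor)
    print(f"scale factor {factor}: {len(layers)} layers")
```

A smaller scale factor yields many more layers to scan, which is why 1.05 is slower than 1.8.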

2. Sliding Window

A sliding window is a rectangular region that moves across the whole image, pixel by pixel, at each scale. Each time the window shifts, the region under it is passed to the classifier, which checks whether that region has the Haar features of a face.
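As a small sketch of the idea, the generator below enumerates every window position on one pyramid layer. The name `sliding_windows` and the `step` parameter are illustrative assumptions; a step of 1 corresponds to the pixel-by-pixel shift described above.

```python
def sliding_windows(image_width, image_height, window_size, step=1):
    """Yield (x, y, width, height) for every window position, row by row."""
    for y in range(0, image_height - window_size + 1, step):
        for x in range(0, image_width - window_size + 1, step):
            yield (x, y, window_size, window_size)


# A tiny 6x4 "image" with a 3x3 window: 4 horizontal x 2 vertical positions.
windows = list(sliding_windows(6, 4, window_size=3))
print(len(windows))   # 8
print(windows[0])     # (0, 0, 3, 3)
print(windows[-1])    # (3, 1, 3, 3)
```

Each yielded rectangle is a candidate region that would be handed to the classifier.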

[Figure 6]

This is what the sliding window combined with the image pyramid looks like (figure below). With this combination, we can detect faces at different scales and locations in an image.

[Figure 7]

3. Cascade Method

Since most of an image does not contain faces, applying all the Haar features to every window (including regions without a face) would be time-consuming. So a cascade of classifiers is used. A cascade of classifiers consists of multiple stages of filters, and the Haar features are grouped into these stages.

Each time the sliding window shifts, the comparison runs through the cascade of classifiers stage by stage. If the window fails the first stage of Haar feature comparison, the classifier rejects that window as a non-face and moves on to the next window without evaluating the remaining stages.

Finally, a window that passes all the stages is reported as a face region.
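The early-rejection behaviour can be sketched as follows. The stage functions, feature names, and thresholds here are all hypothetical; a real cascade sums many weighted Haar features per stage, which this sketch does not attempt to reproduce.

```python
def run_cascade(window, stages):
    """Run a window through the stages in order.

    Returns (is_face, stages_evaluated). Evaluation stops at the first
    stage the window fails, so most non-face windows are rejected cheaply.
    """
    for index, stage in enumerate(stages, start=1):
        if not stage(window):
            return False, index
    return True, len(stages)


# Hypothetical stages: each compares one feature value with a threshold.
stages = [
    lambda w: w["eye_band"] > 0.3,
    lambda w: w["nose_bridge"] > 0.2,
    lambda w: w["mouth"] > 0.25,
]

# Hypothetical feature values for two windows.
face_like = {"eye_band": 0.6, "nose_bridge": 0.5, "mouth": 0.4}
background = {"eye_band": 0.05, "nose_bridge": 0.5, "mouth": 0.4}

print(run_cascade(face_like, stages))   # (True, 3)
print(run_cascade(background, stages))  # (False, 1)
```

The background window is discarded after a single stage, which is where the cascade saves its time.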

4. Minimum Size

minSize is the smallest window size the detector considers, [minSize x minSize]. So the object (here, a face) must be at least minSize to be detected; anything smaller is ignored.

5. Minimum Neighbors

Since object detection combines the image pyramid (multi-scaling) with the sliding window, we get multiple positive outputs for a single face region. These positive outputs are the window regions that satisfy the Haar features (they may be actual face areas or non-face areas that were flagged by mistake).

[Figure 8]

Therefore, to be reported as an actual face, a region must collect enough overlapping rectangles; the minNeighbors parameter sets that threshold.
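The filtering idea can be sketched with a simple greedy grouping. This is a simplification for illustration, not OpenCV's own grouping routine: the overlap measure, the 0.5 overlap threshold, and the function names are assumptions, and the sketch keeps a group when it has at least `min_neighbors` rectangles, without claiming that this is exactly where OpenCV draws the boundary.

```python
def iou(a, b):
    """Intersection over union of two (x, y, w, h) rectangles."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0


def group_detections(rects, min_neighbors, overlap=0.5):
    """Group overlapping rectangles and drop groups that are too small."""
    if min_neighbors == 0:
        return list(rects)  # no filtering: every raw detection is returned
    clusters = []
    for rect in rects:
        for cluster in clusters:
            if iou(rect, cluster[0]) >= overlap:
                cluster.append(rect)
                break
        else:
            clusters.append([rect])
    kept = []
    for cluster in clusters:
        if len(cluster) >= min_neighbors:
            n = len(cluster)
            # Report the (integer) average rectangle of the group.
            kept.append(tuple(sum(r[i] for r in cluster) // n for i in range(4)))
    return kept


# Hypothetical raw detections: five around one face, one stray rectangle.
raw = [
    (100, 100, 50, 50), (102, 101, 50, 50), (98, 99, 52, 52),
    (101, 100, 49, 49), (100, 102, 50, 50), (300, 200, 40, 40),
]

print(group_detections(raw, min_neighbors=0))  # all six rectangles
print(group_detections(raw, min_neighbors=5))  # [(100, 100, 50, 50)]
```

The stray rectangle has no neighbors, so it disappears as soon as the threshold is raised above 1.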

With minNeighbors = 0, every positive output is kept. In the figure below, some areas have multiple positive outputs.

[Figure 9]

With minNeighbors = 5, an area with fewer than 5 positive outputs will not be reported by the classifier, because minNeighbors is the threshold for the number of positive outputs required to detect a face.
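Putting the terms together, the sketch below shows how the three parameters are passed to OpenCV's `detectMultiScale`. It assumes the `opencv-python` package is installed and uses the frontal-face cascade file bundled with it; the function name `detect_faces`, the 30x30 minimum size, and the image filename in the usage comment are assumptions for illustration.

```python
def detect_faces(image_path, scale_factor=1.3, min_neighbors=5, min_size=(30, 30)):
    """Detect faces in an image file and return a sequence of (x, y, w, h).

    The default values mirror the ones discussed in the text, except
    min_size, which is an assumption chosen for this example.
    """
    import cv2  # imported here so the module loads even without OpenCV

    cascade_path = cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    cascade = cv2.CascadeClassifier(cascade_path)
    if cascade.empty():
        raise RuntimeError("could not load cascade file: " + cascade_path)

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(image_path)

    # The cascade operates on greyscale intensities.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return cascade.detectMultiScale(
        gray,
        scaleFactor=scale_factor,    # image pyramid step
        minNeighbors=min_neighbors,  # overlapping detections required
        minSize=min_size,            # smallest object size considered
    )


# Usage (hypothetical filename):
# faces = detect_faces("friends.jpg", scale_factor=1.3, min_neighbors=5)
```

Lowering `scale_factor` or `min_neighbors` tends to produce more detections, including more false ones; raising them does the opposite.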

[Figure 10]

Hope you found this useful and interesting.


Translated from: https://medium.com/ai-in-plain-english/terminologies-used-in-face-detection-with-haar-cascade-classifier-open-cv-6346c5c926c
