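PS: I wrote this post purely because I wanted to learn OpenCV by working through the tutorial myself. The translation is informal and the wording may often be imprecise, so corrections are welcome.
The files you need to download can be found here (free).
Original article by the author: https://www.pyimagesearch.com/2018/07/19/opencv-tutorial-a-guide-to-learn-opencv/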
OpenCV Tutorial: A Guide to Learn OpenCV
This OpenCV tutorial is for beginners just getting started learning the basics. Inside this guide, you’ll learn basic image processing operations using the OpenCV library with Python.
And by the end of the tutorial you’ll be putting together a complete project to count basic objects in images using contours.
(Translator's note: in the end you will build a complete project that outlines the objects in an image and counts them.)
While this tutorial is aimed at beginners just getting started with image processing and the OpenCV library, I encourage you to give it a read even if you have a bit of experience.
A quick refresher in OpenCV basics will help you with your own projects as well.
Installing OpenCV and imutils on your system
The first step today is to install OpenCV on your system (if you haven’t already).
I maintain an OpenCV Install Tutorials page which contains links to previous OpenCV installation guides for Ubuntu, macOS, and Raspberry Pi.
(Translator's note: I think it is best to read those guides and work through the installation yourself first.)
You should visit that page and find + follow the appropriate guide for your system.
Once your fresh OpenCV development environment is set up, install the imutils package via pip. I have created and maintained imutils (source on GitHub) for the image processing community and it is used heavily on my blog. You should install imutils in the same environment you installed OpenCV into. You’ll need it to work through this blog post, as it will facilitate basic image processing operations:
$ pip install imutils
_Note:_ If you are using Python virtual environments, don’t forget to use the workon command to enter your environment before installing imutils!
OpenCV Project Structure
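Before going too far down the rabbit hole, be sure to grab the code + images from the “Downloads” section of today’s blog post.
(Translator's note: "rabbit hole" is a figure of speech for the entrance to unknown territory, or for something that takes a long time to sort out. In short: download the author's code and images before you start.)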
From there, navigate to where you downloaded the .zip in your terminal (cd). Then we can unzip the archive, change working directories (cd) into the project folder, and analyze the project structure via tree (a command that visualizes a directory tree; the translator notes it can be installed with sudo apt-get install tree):
$ cd ~/Downloads
$ unzip opencv-tutorial.zip
$ cd opencv-tutorial
$ tree
.
├── jp.png
├── opencv_tutorial_01.py
├── opencv_tutorial_02.py
└── tetris_blocks.png
0 directories, 4 files
In this tutorial we’ll be creating two Python scripts to help you learn OpenCV basics:
1. Our first script, opencv_tutorial_01.py, will cover basic image processing operations using an image from the movie Jurassic Park (jp.png).
2. From there, opencv_tutorial_02.py will show you how to use these image processing building blocks to create an OpenCV application to count the number of objects in a Tetris image (tetris_blocks.png).
Loading and displaying an image
Figure 1: Learning OpenCV basics with Python begins with loading and displaying an image, a simple process that requires only a few lines of code.
Let’s begin by opening up opencv_tutorial_01.py in your favorite text editor or IDE:
# import the necessary packages #1
import imutils #2
import cv2 #3
#4
# load the input image and show its dimensions, keeping in mind that #5
# images are represented as a multi-dimensional NumPy array with #6
# shape no. rows (height) x no. columns (width) x no. channels (depth) #7
image = cv2.imread("jp.png") #8
(h, w, d) = image.shape #9
print("width={}, height={}, depth={}".format(w, h, d)) #10
#11
# display the image to our screen -- we will need to click the window #12
# opened by OpenCV and press a key on our keyboard to continue execution #13
cv2.imshow("Image", image) #14
cv2.waitKey(0) #15
On Lines 2 and 3 we import both imutils and cv2. The cv2 package is OpenCV and, despite the embedded 2, it can actually be OpenCV 3 (or possibly OpenCV 4, which may be released later in 2018). The imutils package is my series of convenience functions.
Now that we have the required software at our fingertips via imports, let’s load an image from disk into memory.
To load our _Jurassic Park_ image (from one of my favorite movies), we call cv2.imread("jp.png"). As you can see on Line 8, we assign the result to image. Our image is actually just a NumPy array.
(Translator's note: the computer treats a color image as a three-dimensional array of height x width x number of color channels. A color image has 3 channels; a grayscale image holds only intensity values, so its array is two-dimensional.)
Later in this script, we’ll need the height and width. So on Line 9, I call image.shape to extract the height, width, and depth (the number of color channels).
It may seem confusing that the height comes before the width, but think of it this way:
- We describe matrices by _# of rows x # of columns_
- The number of _rows_ is our _height_
- And the number of _columns_ is our _width_
Therefore, the dimensions of an image represented as a NumPy array are actually represented as _(height, width, depth)._
Depth is the number of channels. In our case this is three since we’re working with 3 color channels: Blue, Green, and Red.
The print command shown on Line 10 will output the values to the terminal:
width=600, height=322, depth=3
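The (height, width, depth) ordering can be checked without the tutorial image. The sketch below is an illustration rather than part of the original script: it builds a blank NumPy array as a stand-in for jp.png (assumed to have the dimensions printed above) and unpacks its shape the same way.

```python
import numpy as np

# Stand-in for the tutorial image: 322 rows (height), 600 columns (width),
# 3 channels. The real script gets this array from cv2.imread("jp.png").
image = np.zeros((322, 600, 3), dtype=np.uint8)

# shape is ordered (rows, columns, channels), i.e. (height, width, depth)
(h, w, d) = image.shape
summary = "width={}, height={}, depth={}".format(w, h, d)
print(summary)

# A single-channel (grayscale) array has no depth axis at all
gray = np.zeros((322, 600), dtype=np.uint8)
print(gray.shape)
```

Unpacking three values from a grayscale array's shape would fail, which is one reason to check the number of dimensions before assuming a color image.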
To display the image on the screen using OpenCV we employ cv2.imshow("Image", image) on Line 14.
The subsequent line, cv2.waitKey(0), waits for a keypress (Line 15).
This is important; otherwise our image would display and disappear faster than we could see it.
_Note:_ You need to actually click the active window opened by OpenCV and press a key on your keyboard to advance the script. OpenCV cannot monitor your terminal for input, so if you press a key in the terminal OpenCV will not notice. Again, you will need to click the active OpenCV window on your screen and press a key on your keyboard.
Accessing individual pixels
Figure 2: _Top_: grayscale gradient where brighter pixels are closer to 255 and darker pixels are closer to 0. _Bottom_: RGB Venn diagram where brighter pixels are closer to the center.
First, you may ask:
What is a pixel?
All images consist of pixels, which are the raw building blocks of images.
Images are made of pixels in a grid.
A 640 x 480 image has 640 columns (the width) and 480 rows (the height). There are 640*480=307200 pixels in an image with those dimensions.
Each pixel in a grayscale image has a value representing the shade of gray. In OpenCV, an 8-bit image has 256 shades of gray, from 0 to 255. So a grayscale image would have a grayscale value associated with each pixel.
Pixels in a color image have additional information. There are several color spaces that you’ll soon become familiar with as you learn about image processing. For simplicity let’s only consider the RGB color space.
In OpenCV, color images in the RGB (Red, Green, Blue) color space have a 3-tuple associated with each pixel: (B, G, R).
Notice the ordering is BGR rather than RGB. This is because when OpenCV was first being developed many years ago the standard was BGR ordering. Over the years, the standard has become RGB, but OpenCV still maintains this “legacy” BGR ordering to ensure no existing code breaks.
Each value in the BGR 3-tuple has a range of [0, 255]. How many color possibilities are there for each pixel in an RGB image in OpenCV? That’s easy: 256 * 256 * 256 = 16777216.
Now that we know exactly what a pixel is, let’s see how to retrieve the value of an individual pixel in the image:
# access the RGB pixel located at x=50, y=100, keeping in mind that #17
# OpenCV stores images in BGR order rather than RGB #18
(B, G, R) = image[100, 50] #19
print("R={}, G={}, B={}".format(R, G, B)) #20
As shown previously, our image dimensions are width=600, height=322, depth=3. We can access individual pixel values in the array by specifying the coordinates, so long as 0 <= x < width and 0 <= y < height.
The code image[100, 50] yields a 3-tuple of BGR values from the pixel located at x=50 and y=100 (again, keep in mind that the _height_ is the number of _rows_ and the _width_ is the number of _columns_; take a second now to convince yourself this is true).
As stated above, OpenCV stores images in BGR ordering (unlike Matplotlib, for example). Check out how simple it is to extract the color channel values for the pixel on Line 19.
The resulting pixel value is shown on the terminal here:
R=41, G=49, B=37
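To make the [y, x] indexing concrete, here is a small sketch on a blank stand-in array (an assumption made so the example runs without jp.png). It writes the BGR value reported above at x=50, y=100 and reads it back the same way the script does.

```python
import numpy as np

# Blank stand-in image with the tutorial's dimensions (height=322, width=600)
image = np.zeros((322, 600, 3), dtype=np.uint8)

# Write a known BGR value at x=50, y=100; NumPy indexes as [row, column] = [y, x]
x, y = 50, 100
image[y, x] = (37, 49, 41)  # (B, G, R), the values reported in the tutorial

# Read it back exactly as the script does
(B, G, R) = image[100, 50]
print("R={}, G={}, B={}".format(R, G, B))

# Swapping the indices addresses a different pixel (x=100, y=50), still black here
other = image[50, 100]
print(other)
```

Mixing up the two indices is an easy mistake to miss, because on most images both orderings are valid indices and no error is raised.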
Array slicing and cropping
Extracting “regions of interest” (ROIs) is an important skill for image processing.
Say, for example, you’re working on recognizing faces in a movie.
First, you’d run a face detection algorithm to find the coordinates of faces in all the frames you’re working with.
Then you’d want to extract the face ROIs and either save them or process them.
Locating all frames containing Dr. Ian Malcolm in _Jurassic Park_ would be a great face recognition mini-project to work on.
For now, let’s just _manually_ extract an ROI. This can be accomplished with array slicing.
Figure 3: Array slicing with OpenCV allows us to extract a region of interest (ROI) easily.
# extract a 100x100 pixel square ROI (Region of Interest) from the #22
# input image starting at x=320,y=60 and ending at x=420,y=160 #23
roi = image[60:160, 320:420] #24
cv2.imshow("ROI", roi) #25
cv2.waitKey(0) #26
Array slicing is shown on Line 24 with the format image[startY:endY, startX:endX].
This code grabs an roi which we then display on Line 25.
Just like last time, we display until a key is pressed (Line 26).
As you can see in Figure 3, we’ve extracted the face of Dr. Ian Malcolm.
I actually predetermined the (x, y)-coordinates using Photoshop for this example, but if you stick with me on the blog you could [detect and extract face ROI’s automatically(https://www.pyimagesearch.com...
(Translator's note: I may write a post of my own on this later.)
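One detail worth knowing about slicing, which the tutorial does not cover, is that basic NumPy slicing returns a view of the original array rather than a copy. The sketch below uses a blank stand-in array (an assumption, so that it runs without jp.png) to show the slice format and the effect of writing into the ROI.

```python
import numpy as np

image = np.zeros((322, 600, 3), dtype=np.uint8)  # stand-in for jp.png

# image[startY:endY, startX:endX] -> rows 60..159, columns 320..419
roi = image[60:160, 320:420]
print(roi.shape)  # (100, 100, 3)

# Basic slicing returns a view: writing to roi also writes to image
roi[:] = 255
print(image[60, 320])

# Take a copy when the crop must be independent of the original
safe_roi = image[60:160, 320:420].copy()
safe_roi[:] = 0
print(image[60, 320])
```

If you plan to draw on or otherwise modify a cropped region, calling .copy() on the slice keeps the source image intact.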
Resizing images
Resizing images is important for a number of reasons. First, you might want to resize a large image to fit on your screen. Image processing is also faster on smaller images because there are fewer pixels to process. In the case of deep learning, we often resize images, ignoring aspect ratio, so that the volume fits into a network which requires that an image be square and of a certain dimension.
Let’s resize our original image to 200 x 200 pixels:
# resize the image to 200x200px, ignoring aspect ratio #28
resized = cv2.resize(image, (200, 200)) #29
cv2.imshow("Fixed Resizing", resized) #30
cv2.waitKey(0) #31
On Line 29 we have resized an image ignoring aspect ratio; note that cv2.resize takes the target size as (width, height). Figure 4 (_right_) shows that the image is resized but is now distorted because we didn’t take into account the aspect ratio.
Figure 4: Resizing an image with OpenCV and Python can be conducted with cv2.resize; however, aspect ratio is not preserved automatically.
Let’s calculate the aspect ratio of the original image and use it to resize an image so that it doesn’t appear squished and distorted:
# fixed resizing and distort aspect ratio so let's resize the width #33
# to be 300px but compute the new height based on the aspect ratio #34
r = 300.0 / w #35
dim = (300, int(h * r)) #36
resized = cv2.resize(image, dim) #37
cv2.imshow("Aspect Ratio Resize", resized) #38
cv2.waitKey(0) #39
Recall back to Line 9 of this script, where we extracted the width and height of the image.
Let’s say that we want to take our 600-pixel wide image and resize it to 300 pixels wide while _maintaining aspect ratio_.
On Line 35 we calculate the ratio of the _new width_ to the _old width_ (which happens to be 0.5).
From there, we specify our dimensions of the new image, dim.
We know that we want a 300-pixel wide image, but we must calculate the height using the ratio by multiplying h by r (the original height and our ratio respectively).
Feeding dim (our dimensions) into the cv2.resize function, we’ve now obtained a new image named resized which is not distorted (Line 37).
To check our work, we display the image using the code on Line 38.
Figure 5: Resizing images while maintaining aspect ratio with OpenCV is a three-step process: (1) extract the image dimensions, (2) compute the aspect ratio, and (3) resize the image (cv2.resize) along one dimension and multiply the other dimension by the aspect ratio. See Figure 6 for an even easier method.
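The arithmetic behind the three steps can be tried without loading an image. The function name aspect_preserving_size below is made up for this sketch; it only computes the (width, height) tuple that would then be passed to cv2.resize.

```python
def aspect_preserving_size(w, h, target_width):
    """Hypothetical helper: (new_width, new_height) that keeps the aspect ratio."""
    r = target_width / float(w)        # ratio of the new width to the old width
    return (target_width, int(h * r))  # cv2.resize expects (width, height)


# The tutorial image is 600 x 322, so halving the width halves the height
dim = aspect_preserving_size(600, 322, 300)
print(dim)  # (300, 161)
```

Because int() truncates, the computed height can be off by a fraction of a pixel for ratios that do not divide evenly, which is normally harmless for display purposes.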
But can we make this process of preserving aspect ratio during resizing even easier?
Yes!
Computing the aspect ratio each time we want to resize an image is a bit tedious, so I wrapped the code in a function within imutils.
Here is how you may use imutils.resize:
# manually computing the aspect ratio can be a pain so let's use the #41
# imutils library instead #42
resized = imutils.resize(image, width=300) #43
cv2.imshow("Imutils Resize", resized) #44
cv2.waitKey(0) #45
In a single line of code, we’ve preserved aspect ratio and resized the image.
Simple, right?
All you need to provide is your target width or target height as a keyword argument (Line 43).
Here’s the result:
Figure 6: If you’d like to maintain aspect ratio while resizing images with OpenCV and Python, simply use imutils.resize. Now your image won’t risk being “squished” as in Figure 4.
Rotating an image
Let’s rotate our _Jurassic Park_ image for our next example:
# let's rotate an image 45 degrees clockwise using OpenCV by first #47
# computing the image center, then constructing the rotation matrix, #48
# and then finally applying the affine warp #49
center = (w // 2, h // 2) #50
M = cv2.getRotationMatrix2D(center, -45, 1.0) #51
rotated = cv2.warpAffine(image, M, (w, h)) #52
cv2.imshow("OpenCV Rotation", rotated) #53
cv2.waitKey(0) #54
Rotating an image about the center point requires that we first calculate the center (x, y)-coordinates of the image (Line 50).
_Note: We use // (Python's floor-division operator) to perform integer math (i.e., no floating point values)._
From there we calculate a rotation matrix, M (Line 51).
The -45 means that we’ll rotate the image 45 degrees clockwise. Recall from your middle/high school geometry class about the unit circle and you’ll be able to remind yourself that positive angles are counterclockwise and negative angles are clockwise.
From there we warp the image using the matrix (effectively rotating it) on Line 52.
The rotated image is displayed to the screen on Line 53 and is shown in Figure 7:
Figure 7: Rotating an image with OpenCV about the center point requires three steps: (1) compute the center point using the image width and height, (2) compute a rotation matrix with cv2.getRotationMatrix2D, and (3) use the rotation matrix to warp the image with cv2.warpAffine.
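For readers curious what the 2x3 matrix M contains, the sketch below rebuilds it with NumPy from the formula given in the OpenCV documentation for cv2.getRotationMatrix2D. The function name rotation_matrix_2d is made up for this illustration; in the tutorial the matrix comes straight from OpenCV.

```python
import math
import numpy as np


def rotation_matrix_2d(center, angle_degrees, scale):
    """Hypothetical helper mirroring the documented getRotationMatrix2D formula."""
    cx, cy = center
    theta = math.radians(angle_degrees)
    alpha = scale * math.cos(theta)
    beta = scale * math.sin(theta)
    return np.array([
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ])


w, h = 600, 322
center = (w // 2, h // 2)
M = rotation_matrix_2d(center, -45, 1.0)

# The affine map is applied to homogeneous points (x, y, 1);
# a rotation about the center must leave the center itself in place.
mapped_center = M @ np.array([center[0], center[1], 1.0])
print(M.shape, mapped_center)
```

The third column is the translation that keeps the chosen center fixed; without it the image would rotate about the top-left corner instead.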
Now let’s perform the same operation in just a single line of code using imutils:
# rotation can also be easily accomplished via imutils with less code
rotated = imutils.rotate(image, -45)
cv2.imshow("Imutils Rotation", rotated)
cv2.waitKey(0)
Since I don’t have to rotate images as often as I resize them, I find the rotation process harder to remember.
Therefore, I created a function in imutils to handle it for us. In a single line of code, I can accomplish rotating the image 45 degrees clockwise (Line 57) as in Figure 8:
Figure 8: With imutils.rotate, we can rotate an image with OpenCV and Python conveniently with a single line of code.
At this point you have to be thinking:
Why in the world is the image clipped?
The thing is, OpenCV _doesn’t care_ if our image is clipped and out of view after the rotation. I find this to be quite bothersome, so here’s my imutils version which will keep the entire image in view. I call it rotate_bound:
# OpenCV doesn't "care" if our rotated image is clipped after rotation
# so we can instead use another imutils convenience function to help
# us out
rotated = imutils.rotate_bound(image, 45)
cv2.imshow("Imutils Bound Rotation", rotated)
cv2.waitKey(0)
There’s a lot going on behind the scenes of rotate_bound. If you’re interested in how the method on Line 64 works, be sure to check out this blog post.
(Translator's note: I plan to study it when I have time.)
The result is shown in Figure 9:
Figure 9: The rotate_bound function of imutils will prevent OpenCV from clipping the image during a rotation. See this blog post to learn how it works!
Perfect! The entire image is in the frame and it is correctly rotated 45 degrees clockwise.
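One way to see why a larger canvas avoids clipping is to compute the bounding box of the rotated image. The sketch below covers only that geometry in plain Python; rotated_bounds is a hypothetical name, and this is not claimed to be the imutils source.

```python
import math


def rotated_bounds(w, h, angle_degrees):
    """Size of the axis-aligned box that contains a w x h image after rotation."""
    theta = math.radians(angle_degrees)
    cos = abs(math.cos(theta))
    sin = abs(math.sin(theta))
    new_w = int(h * sin + w * cos)
    new_h = int(h * cos + w * sin)
    return (new_w, new_h)


print(rotated_bounds(600, 322, 0))   # (600, 322): no rotation, same canvas
print(rotated_bounds(600, 322, 45))  # a square canvas, wider than the original
```

With the fixed (w, h) output size used in the earlier warpAffine call, anything outside the original 600 x 322 frame is cut off, which matches the clipping seen in Figure 8.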
Smoothing an image
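In many image processing pipelines, we must blur an image to reduce high-frequency noise, making it easier for our algorithms to detect and understand the actual _contents_ of the image rather than just _noise_ that will “confuse” our algorithms.
(Translator's note: roughly speaking, blurring reduces the influence of incidental detail so that later steps can work with each pixel's neighborhood more reliably. In a very sharp image, some pixels contribute little to the overall picture, or to the part we care about, and mostly get in the way. Blurring pulls such outlier pixels toward their neighbors and so lessens the interference.)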
Blurring an image is very easy in OpenCV and there are a number of ways to accomplish it.
Figure 10: This image has undergone a Gaussian blur with an 11 x 11 kernel using OpenCV. Blurring is an important step of many image processing pipelines to reduce high-frequency noise.
I often use the GaussianBlur function:
# apply a Gaussian blur with a 11x11 kernel to the image to smooth it, #68
# useful when reducing high frequency noise #69
blurred = cv2.GaussianBlur(image, (11, 11), 0) #70
cv2.imshow("Blurred", blurred) #71
cv2.waitKey(0) #72
On Line 70 we perform a Gaussian blur with an 11 x 11 kernel, the result of which is shown in Figure 10.
Larger kernels yield a blurrier image; smaller kernels yield a less blurry one. To read more about kernels, refer to this blog post or the PyImageSearch Gurus course.
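The effect of kernel size can be illustrated in one dimension with plain NumPy. This is a sketch under stated assumptions, not OpenCV's implementation: gaussian_kernel_1d is a hypothetical helper, and sigma=2.0 is a value chosen only for the demonstration.

```python
import numpy as np


def gaussian_kernel_1d(ksize, sigma):
    """Hypothetical helper: normalized 1-D Gaussian weights of odd length ksize."""
    offsets = np.arange(ksize) - (ksize - 1) / 2.0
    weights = np.exp(-(offsets ** 2) / (2.0 * sigma ** 2))
    return weights / weights.sum()


# sigma=2.0 is an assumed value chosen for illustration
small = gaussian_kernel_1d(3, 2.0)
large = gaussian_kernel_1d(11, 2.0)

# A single bright "noise" pixel in an otherwise flat row of pixels
row = np.zeros(21)
row[10] = 255.0

# Blurring spreads the spike over its neighbors; the larger kernel flattens it more
blur_small = np.convolve(row, small, mode="same")
blur_large = np.convolve(row, large, mode="same")
print(blur_small.max(), blur_large.max())
```

The isolated spike stands in for high-frequency noise: after blurring, its peak is much lower, and it drops further as the kernel grows.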
Drawing on an image
In this section, we’re going to draw a rectangle, circle, and line on an input image. We’ll also overlay text on an image.
Before we move on with drawing on an image with OpenCV, take note that _drawing operations on images are performed in-place_.
Therefore at the beginning of each code block, we make a copy of the original image, storing the copy as output.
We then proceed to draw on the image called output in-place so we do not destroy our original image.
Let’s draw a rectangle around Ian Malcolm’s face:
# draw a 2px thick red rectangle surrounding the face
output = image.copy() #74
cv2.rectangle(output, (320, 60), (420, 160), (0, 0, 255), 2) #75
cv2.imshow("Rectangle", output) #76
cv2.waitKey(0) #77
First, we make a copy of the image on Line 74 for reasons just explained.
Then we proceed to draw the rectangle.
Drawing rectangles in OpenCV couldn’t be any easier. Using pre-calculated coordinates, I’ve supplied the following parameters to the cv2.rectangle function on Line 75:
- img: The destination image to draw upon. We’re drawing on output.
- pt1: Our starting pixel coordinate which is the top-left. In our case, the top-left is (320, 60).
- pt2: The ending pixel (bottom-right). The bottom-right pixel is located at (420, 160).
- color: BGR tuple. To represent red, I’ve supplied (0, 0, 255).
- thickness: Line thickness (a negative value will make a solid rectangle). I’ve supplied a thickness of 2.
Since we are using OpenCV’s functions rather than NumPy operations we can supply our coordinates in _(x, y)_ order rather than _(y, x)_, since we are not manipulating or accessing the NumPy array directly; OpenCV is taking care of that for us.
Here’s our result in Figure 11:
Figure 11: Drawing shapes with OpenCV and Python is an easy skill to pick up. In this image, I’ve drawn a red box using cv2.rectangle. I pre-determined the coordinates around the face for this example, but you could use a face detection method to automatically find the face coordinates.
Translator's note: the author's face detection tutorial, linked earlier:
[detect and extract face ROI’s automatically(https://www.pyimagesearch.com...
And now let’s place a solid blue circle in front of Dr. Ellie Sattler’s face:
# draw a blue 20px (filled in) circle on the image centered at #80
# x=300,y=150 #81
output = image.copy() #82
cv2.circle(output, (300, 150), 20, (255, 0, 0), -1) #83
cv2.imshow("Circle", output) #84
cv2.waitKey(0) #85
To draw a circle, you need to supply the following parameters to cv2.circle:
- img: The output image.
- center: Our circle’s center coordinate. I supplied (300, 150) which is right in front of Ellie’s eyes.
- radius: The circle radius in pixels. I provided a value of 20 pixels.
- color: Circle color. This time I went with blue as is denoted by 255 in the B and 0s in the G + R components of the BGR tuple, (255, 0, 0).
- thickness: The line thickness. Since I supplied a negative value (-1), the circle is solid/filled in.
Here’s the result in Figure 12:
Figure 12: OpenCV’s cv2.circle method allows you to draw circles anywhere on an image. I’ve drawn a solid circle for this example as is denoted by the -1 line thickness parameter (positive values will make a circular outline with variable line thickness).
It looks like Ellie is more interested in the dinosaurs than my big blue dot, so let’s move on!
Next, we’ll draw a red line. This line goes through Ellie’s head, past her eye, and to Ian’s hand.
If you look carefully at the method parameters and compare them to that of the rectangle, you’ll notice that they are identical:
# draw a 5px thick red line from x=60,y=20 to x=400,y=200 #87
output = image.copy() #88
cv2.line(output, (60, 20), (400, 200), (0, 0, 255), 5) #89
cv2.imshow("Line", output) #90
cv2.waitKey(0) #91
Just as in a rectangle, we supply two points, a color, and a line thickness. OpenCV’s backend does the rest.
Figure 13 shows the result of Line 89 from the code block:
Figure 13: Similar to drawing rectangles and circles, drawing a line in OpenCV using cv2.line only requires a starting point, ending point, color, and thickness.
Oftentimes you’ll find that you want to overlay text on an image for display purposes. If you’re working on face recognition you’ll likely want to draw the person’s name above their face. Or if you advance in your computer vision career you may build an image classifier or object detector. In these cases, you’ll find that you want to draw text containing the class name and probability.
Let’s see how OpenCV’s putText function works:
# draw green text on the image #93
output = image.copy() #94
cv2.putText(output, "OpenCV + Jurassic Park!!!", (10, 25), #95
cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2) #96
cv2.imshow("Text", output) #97
cv2.waitKey(0) #98
The putText function of OpenCV is responsible for drawing text on an image. Let’s take a look at the required parameters:
- img: The output image.
- text: The string of text we’d like to write/draw on the image.
- pt: The starting point for the text (the bottom-left corner of the text string).
- font: I often use cv2.FONT_HERSHEY_SIMPLEX. The available fonts are listed here.
- scale: Font size multiplier.
- color: Text color.
- thickness: The thickness of the stroke in pixels.
The code on Lines 95 and 96 will draw the text _“OpenCV + Jurassic Park!!!”_ in green on our output image in Figure 14:
Figure 14: Oftentimes, you’ll find that you want to display text on an image for visualization purposes. Using the cv2.putText code shown above you can practice overlaying text on an image with different colors, fonts, sizes, and locations.
Running the first OpenCV tutorial Python script
In my blog posts, I generally provide a section detailing how you can run the code on your computer. At this point in the blog post, I make the following assumptions:
- You have downloaded the code from the _“Downloads”_ section of this blog post.
- You have unzipped the files.
- You have installed OpenCV and the imutils library on your system.
To execute our first script, open a terminal or command window and navigate to the files or extract them if necessary.
From there, enter the following command:
$ python opencv_tutorial_01.py
width=600, height=322, depth=3
R=41, G=49, B=37
The command is everything after the bash prompt $ character.
Just type python opencv_tutorial_01.py in your terminal and then the first image will appear.
To cycle through each step that we just learned, make sure an image window is active, and press any key.
Our first couple code blocks above told Python to print information in the terminal. If your terminal is visible, you’ll see the terminal output (Lines 2 and 3) shown.
I’ve also included a GIF animation demonstrating all the image processing steps we took sequentially, one right after the other:
Figure 15: Output animation displaying the OpenCV fundamentals we learned from this first example Python script.
Counting objects
Now we’re going to shift gears and work on the second script included in the _“Downloads”_ associated with this blog post.
In the next few sections we’ll learn how to create a simple Python + OpenCV script to count the number of Tetris blocks in the following image:
Figure 16: If you’ve ever played Tetris (who hasn’t?), you’ll recognize these familiar shapes. In the 2nd half of this OpenCV fundamentals tutorial, we’re going to find and count the shape contours.
Along the way we’ll be:
- Learning how to convert images to grayscale with OpenCV
- Performing edge detection
- Thresholding a grayscale image
- Finding, counting, and drawing contours
- Conducting erosion and dilation
- Masking an image
Go ahead and close the first script you downloaded and open up opencv_tutorial_02.py to get started with the second example:
# import the necessary packages #1
import argparse #2
import imutils #3
import cv2 #4
#5
# construct the argument parser and parse the arguments #6
ap = argparse.ArgumentParser() #7
ap.add_argument("-i", "--image", required=True, #8
help="path to input image") #9
args = vars(ap.parse_args()) #10
On Lines 2-4 we import our packages. This is necessary at the start of each Python script. For this second script, I’ve imported argparse, a command line argument parsing package which comes with all installations of Python.
Take a quick glance at Lines 7-10. These lines allow us to provide additional information to our program at runtime from within the terminal.
Command line arguments are used heavily on the PyImageSearch blog and in all other computer science fields as well.
I encourage you to read about them on this post: _Python, argparse, and command line arguments._
(Translator's note: I hope to translate that post as well when I have time.)
We have one required command line argument, --image, as is defined on Lines 8 and 9.
We’ll learn how to run the script with the required command line argument down below. For now, just know that wherever you encounter args["image"] in the script, we’re referring to the path to the input image.
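To see what ends up in args without opening a terminal, the parser can be handed an explicit argument list. This is a small sketch for experimentation; the file name is simply the one from the downloads, and nothing is read from disk.

```python
import argparse

ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
    help="path to input image")

# Passing an explicit list stands in for typing the arguments in a terminal;
# with no list, parse_args() reads sys.argv instead.
args = vars(ap.parse_args(["--image", "tetris_blocks.png"]))
print(args["image"])  # tetris_blocks.png

# The short form -i fills the same "image" key
short = vars(ap.parse_args(["-i", "tetris_blocks.png"]))
print(short)
```

vars() turns the parsed namespace into a plain dictionary, which is why the script looks the path up with args["image"] rather than args.image.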
Converting an image to grayscale
# load the input image (whose path was supplied via command line #12
# argument) and display the image to our screen #13
image = cv2.imread(args["image"]) #14
cv2.imshow("Image", image) #15
cv2.waitKey(0) #16
# convert the image to grayscale #18
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY) #19
cv2.imshow("Gray", gray) #20
cv2.waitKey(0) #21
We load the image into memory on Line 14. The parameter to the cv2.imread function is our path contained in the args dictionary referenced with the "image" key, args["image"].
From there, we display the image until we encounter our first keypress (Lines 15 and 16).
We’re going to be thresholding and detecting edges in the image shortly. Therefore we convert the image to grayscale on Line 19 by calling cv2.cvtColor and providing the image and the cv2.COLOR_BGR2GRAY flag.
Again we display the image and wait for a keypress (Lines 20 and 21).
The result of our conversion to grayscale is shown in Figure 17 (_bottom_).
Figure 17: (_top_) Our Tetris image. (_bottom_) We’ve converted the image to grayscale, a step that comes before thresholding.
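What the conversion does to the array can be sketched with NumPy alone. The weights below are the ones the OpenCV documentation gives for RGB-to-gray conversion (Y = 0.299 R + 0.587 G + 0.114 B); the tiny three-pixel image and the rounding step are assumptions of this illustration, so treat it as an approximation of cv2.cvtColor rather than a replacement.

```python
import numpy as np

# Tiny 1x3 BGR image: one pure blue, one pure green, one pure red pixel
image = np.array([[[255, 0, 0], [0, 255, 0], [0, 0, 255]]], dtype=np.uint8)

# Split channels in OpenCV's BGR order and apply the luma weights
B = image[:, :, 0].astype(np.float64)
G = image[:, :, 1].astype(np.float64)
R = image[:, :, 2].astype(np.float64)
gray = np.round(0.299 * R + 0.587 * G + 0.114 * B).astype(np.uint8)

print(image.shape, "->", gray.shape)  # (1, 3, 3) -> (1, 3)
print(gray)
```

The depth axis disappears after conversion, and green contributes the most to perceived brightness while blue contributes the least.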
Edge detection
Edge detection is useful for finding boundaries of objects in an image, which makes it effective for segmentation purposes.
Let’s perform edge detection to see how the process works:
# applying edge detection we can find the outlines of objects in #23
# images #24
edged = cv2.Canny(gray, 30, 150) #25
cv2.imshow("Edged", edged) #26
cv2.waitKey(0) #27
Using the popular Canny algorithm (developed by John F. Canny in 1986), we can find the edges in the image.
We provide three parameters to the cv2.Canny function and leave a fourth at its default:
- img: The gray image.
- minVal: A minimum threshold, in our case 30.
- maxVal: The maximum threshold which is 150 in our example.
- aperture_size: The Sobel kernel size. By default this value is 3 and hence is not shown on Line 25.
Different values for the minimum and maximum thresholds will return different edge maps.
In Figure 18 below, notice how edges of Tetris blocks themselves are revealed along with sub-blocks that make up the Tetris block:
Figure 18: To conduct edge detection with OpenCV, we make use of the Canny algorithm.
Thresholding
Image thresholding is an important intermediary step for image processing pipelines. Thresholding can help us to remove lighter or darker regions and contours of images.
I highly encourage you to experiment with thresholding. I tuned the following code to work for our example by trial and error (as well as experience):
# threshold the image by setting all pixel values <= 225 #29
# to 255 (white; foreground) and all pixel values > 225 to 0 #30
# (black; background), thereby segmenting the image #31
thresh = cv2.threshold(gray, 225, 255, cv2.THRESH_BINARY_INV)[1] #32
cv2.imshow("Thresh", thresh) #33
cv2.waitKey(0) #34
In a single line (Line 32) we are:
thresh = cv2.threshold(gray, 225, 255, cv2.THRESH_BINARY_INV)[1]
- Grabbing all pixels in the gray image greater than 225 and setting them to 0 (black), which corresponds to the background of the image.
- Setting pixel values of 225 or less to 255 (white), which corresponds to the foreground of the image (i.e., the Tetris blocks themselves).
For more information on the cv2.threshold function, including how the thresholding flags work, be sure to refer to the official OpenCV documentation.
Segmenting foreground from background with a binary image is _critical_ to finding contours (our next step), and therefore to counting the Tetris blocks.
Figure 19: Prior to finding contours, we threshold the grayscale image. We performed a binary inverse threshold so that the foreground shapes become white while the background becomes black.
Notice in Figure 19 that the foreground objects are white and the background is black.
Detecting and drawing contours
Figure 20: We’re working towards finding contour shapes with OpenCV and Python in this OpenCV Basics tutorial.
Pictured in the Figure 20 animation, we have 6 shape contours. Let’s find and draw their outlines via code:
# find contours (i.e., outlines) of the foreground objects in the  #36
# thresholded image  #37
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,  #38
    cv2.CHAIN_APPROX_SIMPLE)  #39
cnts = imutils.grab_contours(cnts)  #40
output = image.copy()  #41
#42
# loop over the contours  #43
for c in cnts:  #44
    # draw each contour on the output image with a 3px thick purple  #45
    # outline, then display the output contours one at a time  #46
    cv2.drawContours(output, [c], -1, (240, 0, 159), 3)  #47
    cv2.imshow("Contours", output)  #48
    cv2.waitKey(0)  #49
On Lines 38 and 39
cnts = cv2.findContours(thresh.copy(), cv2.RETR_EXTERNAL,
    cv2.CHAIN_APPROX_SIMPLE)
we use cv2.findContours to detect the contours in the image. Take note of the parameter flags, but for now let’s keep things simple: our algorithm is finding all foreground (white) pixels in the thresh.copy() image.
Line 40
cnts = imutils.grab_contours(cnts)
is very important, accounting for the fact that the cv2.findContours implementation changed between OpenCV 2.4, OpenCV 3, and OpenCV 4. This compatibility line is present on the blog wherever contours are involved.
We make a copy of the original image on Line 41 so that we can draw contours on subsequent Lines 44-49.
On Line 47 we draw each c from the cnts list on the image using the appropriately named cv2.drawContours. I chose purple, which is represented by the tuple (240, 0, 159).
Using what we learned earlier in this blog post, let’s overlay some text on the image:
# draw the total number of contours found in purple  #51
text = "I found {} objects!".format(len(cnts))  #52
cv2.putText(output, text, (10, 25), cv2.FONT_HERSHEY_SIMPLEX, 0.7,  #53
    (240, 0, 159), 2)  #54
cv2.imshow("Contours", output)  #55
cv2.waitKey(0)  #56
Line 52 builds a text string containing the number of shape contours. Counting the total number of objects in this image is as simple as checking the length of the contours list: len(cnts).
The result is shown in Figure 21:
Figure 21: Counting contours with OpenCV is as easy as finding them and then calling len(cnts).
Erosions and dilations
Erosions and dilations are typically used to reduce noise in binary images (a side effect of thresholding).
To reduce the size of foreground objects we can erode away pixels given a number of iterations:
# we apply erosions to reduce the size of foreground objects  #58
mask = thresh.copy()  #59
mask = cv2.erode(mask, None, iterations=5)  #60
cv2.imshow("Eroded", mask)  #61
cv2.waitKey(0)  #62
On Line 59 we copy the thresh image while naming it mask.
Then, utilizing cv2.erode, we proceed to reduce the contour sizes with 5 iterations (Line 60).
mask = cv2.erode(mask, None, iterations=5)
Demonstrated in Figure 22, the masks generated from the Tetris contours are slightly smaller:
Figure 22: Using OpenCV we can erode contours, effectively making them smaller or causing them to disappear completely with sufficient iterations. This is typically useful for removing small blobs in a mask image.
Similarly, we can enlarge the foreground regions in the mask. To do so, simply use cv2.dilate:
# similarly, dilations can increase the size of the foreground objects  #64
mask = thresh.copy()  #65
mask = cv2.dilate(mask, None, iterations=5)  #66
cv2.imshow("Dilated", mask)  #67
cv2.waitKey(0)  #68
Figure 23: In an image processing pipeline, if you ever need to connect nearby contours, you can apply dilation to the image. Shown in the figure is the result of dilating contours with five iterations, but not to the point of two contours becoming one.
Masking and bitwise operations
Masks allow us to “mask out” regions of an image we are uninterested in. We call them “masks” because they will _hide_ regions of images we do not care about.
If we use the thresh image from Figure 19 and mask it with the original image, we’re presented with Figure 24:
Figure 24: When using the thresholded image as the mask in comparison to our original image, the colored regions reappear as the rest of the image is “masked out”. This is, of course, a simple example, but as you can imagine, masks are very powerful.
In Figure 24, the background is black now and our foreground consists of colored pixels, that is, any pixels selected by our mask image.
Let’s learn how to accomplish this:
# a typical operation we may want to apply is to take our mask and  #70
# apply a bitwise AND to our input image, keeping only the masked  #71
# regions  #72
mask = thresh.copy()  #73
output = cv2.bitwise_and(image, image, mask=mask)  #74
cv2.imshow("Output", output)  #75
cv2.waitKey(0)  #76
The mask is generated by copying the binary thresh image (Line 73).
mask = thresh.copy()
From there we bitwise AND the pixels from both images together using cv2.bitwise_and.
The result is Figure 24 above, where we’re now only showing/highlighting the Tetris blocks.
Running the second OpenCV tutorial Python script
To run the second script, be sure you’re in the folder containing your downloaded source code and Python scripts. From there, we’ll open up a terminal and provide the script name + command line argument:
python opencv_tutorial_02.py --image tetris_blocks.png
The argument flag is --image and the image argument itself is tetris_blocks.png, a path to the relevant file in the directory.
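The script's own argument-parsing code is not shown in this section, so the following is only a sketch of how a flag like --image is commonly read with Python's argparse module. The short -i alias and the help text are assumptions.

```python
import argparse

ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
    help="path to input image")

# an explicit list stands in for sys.argv[1:] so the sketch runs anywhere;
# a real script would call ap.parse_args() with no arguments
args = vars(ap.parse_args(["--image", "tetris_blocks.png"]))
print(args["image"])   # tetris_blocks.png
```

The resulting path would then typically be handed to cv2.imread to load the image.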
There is no terminal output for this script. Again, to cycle through the images, be sure you click on an image window to make it active; from there, each key press is captured and moves the script forward to the next waitKey(0). When the program is finished running, your script will exit gracefully and you’ll be presented with a new bash prompt line in your terminal.
Below I have included a GIF animation of the basic OpenCV image processing steps in our example script:
Figure 25: Learning OpenCV and the basics of computer vision by counting objects via contours.
Where can I learn more?
If you’re looking to continue learning OpenCV and computer vision, be sure to take a look at my book, _Practical Python and OpenCV_. (Translator’s note: the author is recommending his own book.)
Inside the book we’ll explore the OpenCV fundamentals we discussed here today in more detail.
You’ll also learn how to use these fundamentals to build actual computer vision + OpenCV applications, including:
- Face detection in images and video
- Handwriting recognition
- Feature extraction and machine learning
- Basic object tracking
- …and more!
To learn more about _Practical Python and OpenCV_, and how it can help you learn OpenCV (in less than a single weekend), just click here.
Summary
In today’s blog post you learned the fundamentals of image processing and OpenCV using the Python programming language.
You are now prepared to start using these image processing operations as “building blocks” you can chain together to build an actual computer vision application — a great example of such a project is the basic object counter we created by counting contours.
I hope this tutorial helped you learn OpenCV!