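Downloading and installing Anaconda needs little explanation; just make sure it is added to the PATH. For environment configuration, see the linked post.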
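Recommended site: http://www.image-net.org/. ImageNet is best known for its image recognition challenge, and it also provides a large collection of image datasets.
Learning method: analogy, using what you already know to pull in what you do not. The 80/20 rule: 20% of the material matters and 80% does not, so spend 80% of your time on that 20%.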
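import tensorflow as tf
import cv2  # imported only to confirm that OpenCV is installed

# This is TensorFlow 1.x graph-mode code. Under TensorFlow 2.x the same calls live in
# tf.compat.v1 and need tf.compat.v1.disable_eager_execution() first.
hello = tf.constant('hello world!')  # define a TensorFlow constant
sess = tf.Session()  # create a session
print(sess.run(hello))  # a tensor's value is only available through sess.run()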
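import cv2

# Arg 1 is the file name; arg 2 selects the mode: 1 reads in color, 0 reads as grayscale
img = cv2.imread('image0.jpg', 1)
cv2.imshow('image', img)  # arg 1 is the window name, arg 2 is the image data
cv2.waitKey(0)  # wait for a key press so the window does not flash and close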
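The core module is essential: it covers the basics such as matrices and drawing.
The highgui module matters too, but mostly you just need to learn its API.
The imgcodecs (image file reading and writing) and imgproc modules are essential: imgproc covers image processing, including filters, histogram statistics, equalization, geometric transforms, and color processing.
The ml (machine learning) module is important for everything related to machine learning.
The photo module is important for image inpainting and denoising.
The three video modules deal with video processing.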
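import cv2

img = cv2.imread('image.jpg', 1)
cv2.imwrite('image1.jpg', img)  # arg 1 is the output file name, arg 2 is the image data

import cv2

img = cv2.imread('image.jpg', 1)
cv2.imwrite('imagetest.jpg', img, [cv2.IMWRITE_JPEG_QUALITY, 50])
# JPEG quality ranges from 0 to 100: lower values give smaller files and worse quality

img = cv2.imread('image.jpg', 1)
cv2.imwrite('imagetest.png', img, [cv2.IMWRITE_PNG_COMPRESSION, 9])
# PNG compression ranges from 0 to 9: lower values give larger files, but quality
# never changes because PNG compression is lossless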
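import cv2

img = cv2.imread('image0.jpg', 1)  # OpenCV loads color images in BGR order
(b, g, r) = img[100, 100]  # indexed as [row, column]
print((b, g, r))
for i in range(100):
    img[100 + i, 100] = (255, 0, 0)  # overwrite the pixel at this position (blue in BGR)
cv2.imwrite('image5.jpg', img)
cv2.imshow('image', img)
cv2.waitKey(0)  # the argument is the delay in milliseconds; 0 waits indefinitely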
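Learn TensorFlow by analogy. Its basic syntax, API calls, and underlying principles can be mapped onto Python's basic data types, operators, control flow, dictionaries, and arrays.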
import tensorflow as tf

data1 = tf.constant(2, dtype=tf.int32)  # the data type can be given explicitly
data2 = tf.Variable(10, name='var')  # note the capital V in tf.Variable; a variable can be named
sess = tf.Session()
print(sess.run(data1))
# Variables must be initialized before they are used
init = tf.global_variables_initializer()  # build the op that initializes all variables
sess.run(init)  # run the initialization op
print(sess.run(data2))
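TensorFlow is essentially built from two parts: tensors and computation graphs. data1 and data2 in the previous section are both tensors, that is, data (arrays), and a tensor can be a constant or a variable. An op (operation) covers arithmetic, assignment, and similar actions. Tensors, ops, and the results the ops produce together form a computation graph, which describes the data and the operations applied to it. Every computation graph in TensorFlow has to be executed inside a session, which can be thought of as the runtime environment for the computation.
Every variable must go through the init op before it can be used, and init is itself part of the graph, so it also has to be run in a session.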
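The split between building a graph and running it in a session can be sketched in plain Python. This is a toy model for illustration only, not how TensorFlow is implemented; the names `Node`, `constant`, `add`, `multiply`, and `Session` below are hypothetical stand-ins.

```python
class Node:
    """A graph node: either a stored value or an op applied to input nodes."""
    def __init__(self, op=None, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

def constant(value):
    # A leaf node that simply holds data (the "tensor").
    return Node(value=value)

def add(a, b):
    # Building an op node records what to do; it computes nothing yet.
    return Node(op=lambda x, y: x + y, inputs=(a, b))

def multiply(a, b):
    return Node(op=lambda x, y: x * y, inputs=(a, b))

class Session:
    """Evaluates a node by recursively evaluating its inputs first."""
    def run(self, node):
        if node.op is None:
            return node.value
        return node.op(*(self.run(i) for i in node.inputs))

a = constant(6)
b = constant(2)
total = add(a, b)            # graph only, no arithmetic has happened
scaled = multiply(total, b)  # still graph only
result = Session().run(scaled)
print(result)  # (6 + 2) * 2 = 16
```

The point of the sketch is that `total` and `scaled` are descriptions of work, and only `Session().run(...)` produces a number.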
import tensorflow as tf

data2 = tf.Variable(10, name='var')  # note the capital V in tf.Variable; a variable can be named
init = tf.global_variables_initializer()
with tf.Session() as sess:  # modeled on Python's `with open(...)`: the session is closed automatically
    sess.run(init)
    print(sess.run(data2))
import tensorflow as tf

# Arithmetic on constants
data1 = tf.constant(6)
data2 = tf.constant(2)
dataadd = tf.add(data1, data2)
datasub = tf.subtract(data1, data2)
datamul = tf.multiply(data1, data2)
datadiv = tf.divide(data1, data2)
with tf.Session() as sess:
    print(sess.run(dataadd))
    print(sess.run(datasub))
    print(sess.run(datamul))
    print(sess.run(datadiv))
print('end')

# Arithmetic on a variable and a constant
data1 = tf.constant(6)
data2 = tf.Variable(2)
dataadd = tf.add(data1, data2)
# tf.assign writes the value of dataadd into data2 and returns the updated value as datacopy.
# It only takes effect once it is run in a session.
datacopy = tf.assign(data2, dataadd)
datasub = tf.subtract(data1, data2)
datamul = tf.multiply(data1, data2)
datadiv = tf.divide(data1, data2)
init = tf.global_variables_initializer()  # do not forget the parentheses
with tf.Session() as sess:
    sess.run(init)
    print(sess.run(dataadd))
    print(sess.run(datasub))
    print(sess.run(datamul))
    print(sess.run(datadiv))
    # Each evaluation of datacopy runs the assignment again, so data2 keeps growing
    print('sess.run(datacopy)', sess.run(datacopy))
    print('datacopy.eval()', datacopy.eval())
    # .eval() is an alternative to sess.run(); it is equivalent to the call on the next line
    print('tf.get_default_session().run(datacopy)', tf.get_default_session().run(datacopy))
print('end')
import tensorflow as tf

# Using placeholders
data1 = tf.placeholder(tf.float32)
data2 = tf.placeholder(tf.float32)
dataadd = tf.add(data1, data2)
with tf.Session() as sess:
    print(sess.run(dataadd, feed_dict={data1: 6, data2: 2}))
print('end')
# placeholder and feed_dict work together: a placeholder declares an empty slot,
# and feed_dict supplies its value at run time. feed_dict is a dictionary that maps
# each placeholder taking part in the op to the value it should receive.

# Matrix indexing
data1 = tf.constant([[6, 6]])  # one row, two columns
data2 = tf.constant([[2], [2]])  # two rows, one column
data3 = tf.constant([[3, 3]])
data4 = tf.constant([[1, 2], [3, 4], [5, 6]])
print(data4.shape)
with tf.Session() as sess:
    print(sess.run(data4))  # the whole matrix
    print(sess.run(data4[0]))  # the first row
    print(sess.run(data4[:, 0]))  # the first column
    print(sess.run(data4[0, 0]))  # the element in the first row and first column
print('end')
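import tensorflow as tf

# Matrix multiplication versus element-wise multiplication
data1 = tf.constant([[6, 6]])  # one row, two columns
data2 = tf.constant([[2], [2]])  # two rows, one column
data3 = tf.constant([[3, 3]])
data4 = tf.constant([[1, 2], [3, 4], [5, 6]])
matmul = tf.multiply(data1, data2)  # element-wise product with broadcasting: (1,2) * (2,1) -> (2,2)
matadd = tf.add(data1, data2)  # element-wise sum, also broadcast to (2,2)
matmul1 = tf.matmul(data1, data2)  # true matrix product: (1,2) x (2,1) -> (1,1)
with tf.Session() as sess:
    print(sess.run(matmul))
    print(sess.run(matadd))
    print(sess.run(matmul1))
    print(sess.run([matmul, matadd, matmul1]))  # fetch several tensors at once by passing a list
print('end')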
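The difference between the broadcast element-wise product and the matrix product is easier to see on plain nested lists. The sketch below is a pure-Python illustration that needs no TensorFlow; `broadcast_binary` and `matmul` are hypothetical helper names, and the broadcasting rule is simplified to 2-D inputs where an axis of length 1 is stretched.

```python
def broadcast_binary(a, b, op):
    """Apply op element-wise to 2-D lists, stretching any axis of length 1."""
    rows = max(len(a), len(b))
    cols = max(len(a[0]), len(b[0]))

    def pick(m, i, j):
        # An axis of length 1 is reused for every index along that axis.
        return m[i if len(m) > 1 else 0][j if len(m[0]) > 1 else 0]

    return [[op(pick(a, i, j), pick(b, i, j)) for j in range(cols)]
            for i in range(rows)]

def matmul(a, b):
    """Ordinary matrix product: rows of a dotted with columns of b."""
    inner = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(len(b[0]))]
            for i in range(len(a))]

data1 = [[6, 6]]      # one row, two columns
data2 = [[2], [2]]    # two rows, one column

elementwise = broadcast_binary(data1, data2, lambda x, y: x * y)
added = broadcast_binary(data1, data2, lambda x, y: x + y)
product = matmul(data1, data2)

print(elementwise)  # [[12, 12], [12, 12]]
print(added)        # [[8, 8], [8, 8]]
print(product)      # [[24]]
```

Both inputs are stretched to 2x2 for the element-wise operations, while the matrix product collapses the shared dimension and yields a single value.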
import tensorflow as tf

mat0 = tf.zeros([2, 3])
mat1 = tf.ones([3, 2])
mat2 = tf.fill([2, 3], 15)  # fill a 2x3 matrix with an arbitrary value
mat3 = tf.constant([[2], [3], [4]])
mat4 = tf.zeros_like(mat3)  # all-zero matrix with the same shape as mat3
mat5 = tf.linspace(0.0, 2.0, 10)  # unlike NumPy, tf.linspace needs float endpoints; NumPy also accepts integers
mat6 = tf.random_uniform([2, 3], -1, 2)  # 2x3 array of uniform random values between -1 and 2
with tf.Session() as sess:
    print(sess.run(mat0))
    print(sess.run(mat1))
    print(sess.run(mat2))
    print(sess.run(mat3))
    print(sess.run(mat4))
    print(sess.run(mat5))
    print(sess.run(mat6))
This chapter covers scaling, cropping, translation, mirroring, and affine transforms of images.
import cv2
import numpy as np

# Scaling with the API
img = cv2.imread('image0.jpg', 1)
iminfo = img.shape  # (rows, columns, channels)
print(iminfo)
height = iminfo[0]
width = iminfo[1]
mode = iminfo[2]  # number of color channels
# Proportional scaling
dstheight = int(height * 0.5)
dstwidth = int(width * 0.5)
dst = cv2.resize(img, (dstwidth, dstheight))  # the size is given as (x, y)
# Note the order here: width first, then height, the opposite of img.shape
cv2.imshow('image', dst)
cv2.imshow('img', img)
cv2.waitKey(0)

# Scaling implemented by hand with nearest-neighbor interpolation
img = cv2.imread('image0.jpg', 1)
imginfo = img.shape
height = imginfo[0]
width = imginfo[1]
dstheight = int(height / 2)
dstwidth = int(width / 2)
dstimage = np.zeros((dstheight, dstwidth, 3), np.uint8)
# An empty height x width x 3 array whose elements are 0-255, i.e. uint8
for i in range(dstheight):  # i and j are the row and column in the destination
    for j in range(dstwidth):
        # inew and jnew are the row and column in the source image
        inew = int(i * (height / dstheight))
        jnew = int(j * (width / dstwidth))
        dstimage[i, j] = img[inew, jnew]
cv2.imshow('dst', dstimage)
cv2.waitKey(0)
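Important notes:
5. How bilinear interpolation works: the distances between the four blue-line intersections and the points a1, a2, b1, b2 serve as weights for computing the pixel values at a1, a2, b1, b2 (for example, a1 is computed from (15,22) and (16,22)). The distances from a1, a2, b1, b2 to the intersection of the green and orange lines then serve as weights for computing the pixel value at that point.
6. "API" here means the function we call, while "source code" means the algorithm behind that API.
7. Functions that take a size, such as resize, expect (x, y) order, that is (width, height). Code that indexes the matrix uses (rows, columns, channels) order instead, as in .shape and the array slicing in the next section.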
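The bilinear interpolation described in note 5 can be sketched in plain Python on a small grid of gray values. This is a minimal illustration under simplifying assumptions (single channel, non-negative coordinates, neighbors clamped at the border); `bilinear_sample` is a hypothetical helper, not an OpenCV function.

```python
def bilinear_sample(image, y, x):
    """Sample a 2-D grid of gray values at the fractional position (row y, column x)."""
    height, width = len(image), len(image[0])
    y0, x0 = int(y), int(x)              # top-left neighbor
    y1 = min(y0 + 1, height - 1)         # bottom neighbor, clamped to the image
    x1 = min(x0 + 1, width - 1)          # right neighbor, clamped to the image
    wy, wx = y - y0, x - x0              # fractional distances act as weights
    # First blend horizontally along the top and bottom rows...
    top = image[y0][x0] * (1 - wx) + image[y0][x1] * wx
    bottom = image[y1][x0] * (1 - wx) + image[y1][x1] * wx
    # ...then blend those two results vertically.
    return top * (1 - wy) + bottom * wy

grid = [[0, 100],
        [100, 200]]

center = bilinear_sample(grid, 0.5, 0.5)   # equal weight on all four pixels
quarter = bilinear_sample(grid, 0, 0.25)   # a quarter of the way along the top row
corner = bilinear_sample(grid, 0, 0)       # exactly on a pixel
print(center, quarter, corner)
```

When the sample position lands exactly on a pixel the weights collapse to 0 and 1, so the original value is returned unchanged.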
import cv2

img = cv2.imread('image0.jpg', 1)
dst = img[100:200, 100:300]  # rows 100-199, columns 100-299
cv2.imshow('dst', dst)
cv2.waitKey(0)
import cv2
import numpy as np

img = cv2.imread('image0.jpg', 1)
cv2.imshow('scr', img)
imginfo = img.shape
height = imginfo[0]
width = imginfo[1]
# Using the API
matshift = np.float64([[1, 0, 100], [0, 1, 200]])  # 2x3 floating-point matrix
# Shift 100 px along x and 200 px along y; the 1 in each row marks the axis that row moves
dst = cv2.warpAffine(img, matshift, (width, height))  # the output size is (width, height)
cv2.imshow('dst', dst)
cv2.waitKey(0)
'''
# Implemented by hand
dst1 = np.zeros(imginfo, np.uint8)
for i in range(height - 200):
    for j in range(width - 100):
        dst1[i + 200, j + 100] = img[i, j]
cv2.imshow('image', dst1)
cv2.waitKey(0)
'''
import cv2
import numpy as np

img = cv2.imread('image0.jpg', 1)
imginfo = img.shape
height = imginfo[0]
width = imginfo[1]
deep = imginfo[2]
newimginfo = (height * 2, width, deep)  # twice the height, plus the channel count
dst = np.zeros(newimginfo, np.uint8)  # uint8 holds values 0-255
for i in range(height):
    for j in range(width):
        dst[i, j] = img[i, j]
        dst[height * 2 - i - 1, j] = img[i, j]  # mirrored row; the -1 converts a row count to an index
for i in range(width):
    dst[height, i] = (0, 0, 255)  # red dividing line (BGR order)
cv2.imshow('dst', dst)
cv2.waitKey(0)
import cv2
import numpy as np
img = cv2.imread('image0.jpg',1)
imginfo = img.shape
height = imginfo[0]
width = imginfo[1]
matscale = np.float64([[0.5,0,0],[0,0.5,0]])
dst = cv2.warpAffine(img,matscale,(int(width/2),int(height/2)))
cv2.imshow('img',img)
cv2.imshow('dst',dst)
cv2.waitKey(0)
import cv2
import numpy as np

img = cv2.imread('image0.jpg', 1)
cv2.imshow('scr', img)
imginfo = img.shape
height = imginfo[0]
width = imginfo[1]
matscr = np.float32([[0, 0], [0, height - 1], [width - 1, 0]])
matdst = np.float32([[50, 50], [300, height - 200], [width - 300, 100]])
# Three corresponding points in the source and destination images: top-left, bottom-left, top-right
mataffine = cv2.getAffineTransform(matscr, matdst)
# Compute the 2x3 affine matrix that maps the source points onto the destination points
dst = cv2.warpAffine(img, mataffine, (width, height))
# cv2.warpAffine applies the affine transform
cv2.imshow('dst', dst)
cv2.waitKey(0)
import cv2
import numpy as np

img = cv2.imread('image0.jpg', 1)
iminfo = img.shape
height = iminfo[0]
width = iminfo[1]
matrotate = cv2.getRotationMatrix2D((width * 0.5, height * 0.5), 45, 0.5)
# Arguments: rotation center as (x, y), angle in degrees, scale factor. Returns the 2x3 affine matrix.
dst = cv2.warpAffine(img, matrotate, (width, height))
cv2.imshow('img', img)
cv2.imshow('dst', dst)
cv2.waitKey(0)
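Translation, scaling, and rotation all use the same kind of 2x3 matrix, so it helps to see how such a matrix maps a single point. The sketch below is plain Python with hypothetical helpers `apply_affine` and `rotation_matrix`; the rotation formula follows the one given in the OpenCV documentation for `getRotationMatrix2D`, but this is an illustration rather than OpenCV's code.

```python
import math

def apply_affine(matrix, point):
    """Map (x, y) through a 2x3 matrix: x' = a*x + b*y + tx, y' = c*x + d*y + ty."""
    x, y = point
    (a, b, tx), (c, d, ty) = matrix
    return (a * x + b * y + tx, c * x + d * y + ty)

def rotation_matrix(center, angle_deg, scale):
    """2x3 matrix for rotating by angle_deg about center=(x, y) with uniform scaling."""
    cx, cy = center
    alpha = scale * math.cos(math.radians(angle_deg))
    beta = scale * math.sin(math.radians(angle_deg))
    return [[alpha, beta, (1 - alpha) * cx - beta * cy],
            [-beta, alpha, beta * cx + (1 - alpha) * cy]]

shift = [[1, 0, 100], [0, 1, 200]]     # move 100 along x and 200 along y
shrink = [[0.5, 0, 0], [0, 0.5, 0]]    # halve both coordinates
rotate = rotation_matrix((50, 40), 45, 0.5)

moved = apply_affine(shift, (10, 20))
halved = apply_affine(shrink, (10, 20))
pivot = apply_affine(rotate, (50, 40))  # the rotation center maps to itself

print(moved)   # (110, 220)
print(halved)  # (5.0, 10.0)
print(pivot)
```

Note that the center is passed as (x, y), which is why the image code above uses (width * 0.5, height * 0.5) rather than (height, width).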
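Grayscale conversion, negative (color inversion) effect, mosaic, frosted glass, image blending, blue tint, edge detection, and emboss. Each is studied from three angles: 1. calling the API; 2. the algorithm behind it; 3. implementing it by hand.
Method 1: read the image directly as grayscale
import cv2

# Grayscale conversion with cv2.imread()
img0 = cv2.imread('image0.jpg', 0)
img1 = cv2.imread('image0.jpg', 1)
print(img0.shape)  # a grayscale image has a single channel
print(img1.shape)  # a color image has 3 channels
cv2.imshow('scr', img0)
cv2.waitKey(0)

import cv2

# Grayscale conversion with cv2.cvtColor()
img = cv2.imread('image0.jpg', 1)
dst = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Color space conversion; arg 2 names the source and target color spaces
cv2.imshow('gray', dst)
cv2.waitKey(0)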
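Behind the BGR-to-gray conversion is a weighted sum of the three channels. A minimal pure-Python sketch, assuming the common luma weights 0.299, 0.587, and 0.114 for R, G, and B; `bgr_to_gray` is a hypothetical helper and the rounding may differ slightly from what OpenCV does internally.

```python
def bgr_to_gray(pixel):
    """Convert one (b, g, r) pixel to a gray level with the usual luma weights."""
    b, g, r = pixel
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))

white = bgr_to_gray((255, 255, 255))
pure_green = bgr_to_gray((0, 255, 0))
pure_blue = bgr_to_gray((255, 0, 0))   # remember the BGR order: blue comes first

print(white, pure_green, pure_blue)  # 255 150 29
```

Green contributes most to perceived brightness and blue least, which is why a pure blue pixel turns into a dark gray.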
import cv2
import numpy as np

img = cv2.imread('image0.jpg', 1)
imginfo = img.shape
height = imginfo[0]
width = imginfo[1]
# Invert a grayscale image
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
dst = np.zeros((height, width, 1), np.uint8)  # image shape, with unsigned 8-bit elements
for i in range(height):
    for j in range(width):
        graypixel = gray[i, j]
        dst[i, j] = 255 - graypixel
cv2.imshow('dst-gray', dst)
# Invert a color image
dst = np.zeros((height, width, 3), np.uint8)
for i in range(height):
    for j in range(width):
        (b, g, r) = img[i, j]
        dst[i, j] = (255 - b, 255 - g, 255 - r)
cv2.imshow('dst-color', dst)
cv2.waitKey(0)
import cv2
import numpy as np

img = cv2.imread('image0.jpg', 1)
imginfo = img.shape
height = imginfo[0]
width = imginfo[1]
# The two outer loops walk over the rectangle that is to be pixelated
for m in range(100, 300):
    for n in range(100, 200):
        # Every 10x10 block is filled with the block's first (top-left) pixel
        if m % 10 == 0 and n % 10 == 0:
            (b, g, r) = img[m, n]
            for i in range(10):
                for j in range(10):
                    img[m + i, n + j] = (b, g, r)
cv2.imshow('dst', img)
cv2.waitKey(0)
import cv2
import numpy as np
import random

img = cv2.imread('image0.jpg', 1)
imginfo = img.shape
height = imginfo[0]
width = imginfo[1]
dst = np.zeros((height, width, 3), np.uint8)
m = 8
for i in range(height - m):  # stop m pixels early so the offset never leaves the image
    for j in range(width - m):
        # random.random() lies in [0, 1); multiplying by 8 and truncating
        # gives a random integer from 0 to 7
        index = int(random.random() * m)
        dst[i, j] = img[i + index, j + index]
cv2.imshow('frosted glass', dst)
cv2.waitKey(0)
import cv2
import numpy as np

img0 = cv2.imread('image0.jpg', 1)
img1 = cv2.imread('image1.jpg', 1)
imginfo = img0.shape
height = imginfo[0]
width = imginfo[1]
# Define the region of interest (ROI)
roih = int(height / 2)
roiw = int(width / 2)
img0roi = img0[0:0 + roih, 0:0 + roiw]  # the ROI must stay inside both images
img1roi = img1[0:0 + roih, 0:0 + roiw]
# Blend
dst = np.zeros((roih, roiw, 3), np.uint8)
dst = cv2.addWeighted(img0roi, 0.7, img1roi, 0.3, 0)
# Weighted sum of the two arrays. addWeighted itself does not require 0.7 + 0.3 = 1;
# that constraint comes from image blending, where dst = src0*a + src1*(1-a)
cv2.imshow('dst', dst)
cv2.waitKey(0)
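The blending rule dst = src0*a + src1*(1-a) can be checked on a single pixel without OpenCV. The sketch below is a plain-Python illustration; `blend_pixel` is a hypothetical helper, and the rounding and clamping are assumptions about how a uint8 result would be produced.

```python
def blend_pixel(p0, p1, a):
    """Per-channel p0*a + p1*(1-a), rounded and clamped to the 0-255 range."""
    return tuple(
        min(255, max(0, int(round(c0 * a + c1 * (1 - a)))))
        for c0, c1 in zip(p0, p1)
    )

pixel0 = (200, 100, 0)
pixel1 = (0, 100, 200)
mixed = blend_pixel(pixel0, pixel1, 0.7)
print(mixed)  # (140, 100, 60)
```

Because the two weights sum to 1, a channel that has the same value in both images (100 here) keeps that value, so overall brightness is preserved.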
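Method 1: Canny edge detection
import cv2
import numpy as np

img = cv2.imread('image0.jpg', 1)
imginfo = img.shape
height = imginfo[0]
width = imginfo[1]
cv2.imshow('scr', img)
# Canny edge detection in three steps: 1. grayscale, 2. Gaussian blur, 3. Canny
# Canny is built on convolution (gradient filters) followed by thresholding
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
imgg = cv2.GaussianBlur(gray, (3, 3), 0)  # Gaussian blur to suppress noise
dst = cv2.Canny(imgg, 50, 50)  # run Canny on the blurred grayscale image
# The last two arguments are the thresholds applied to the gradient magnitude:
# a pixel whose response exceeds them is treated as an edge point
cv2.imshow('dst', dst)
cv2.waitKey(0)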
import cv2
import numpy as np
import math

img = cv2.imread('image0.jpg', 1)
imginfo = img.shape
height = imginfo[0]
width = imginfo[1]
cv2.imshow('scr', img)
# Sobel edge detection by hand in three steps: 1. kernel templates, 2. convolution, 3. thresholding
# [ 1  2  1      [ 1  0 -1
#   0  0  0        2  0 -2
#  -1 -2 -1]       1  0 -1]
# Treat each pixel as the top-left corner of a 3x3 window, multiply the window element-wise
# by each of the two kernels and sum the products to get two values a and b.
# If sqrt(a**2 + b**2) exceeds the threshold, the pixel is set to white, i.e. it is an edge.
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
gray = gray.astype(np.int32)  # widen from uint8 so the sums below cannot overflow
dst = np.zeros((height, width, 1), np.uint8)
for i in range(height - 2):
    for j in range(width - 2):
        gy = gray[i,j]*1 + gray[i,j+1]*2 + gray[i,j+2]*1 - gray[i+2,j]*1 - gray[i+2,j+1]*2 - gray[i+2,j+2]*1
        gx = gray[i,j]*1 - gray[i,j+2]*1 + gray[i+1,j]*2 - gray[i+1,j+2]*2 + gray[i+2,j]*1 - gray[i+2,j+2]*1
        grad = math.sqrt(gy*gy + gx*gx)
        if grad > 50:
            dst[i, j] = 255
        else:
            dst[i, j] = 0
cv2.imshow('dst', dst)
cv2.waitKey(0)
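import cv2
import numpy as np
import math

img = cv2.imread('image0.jpg', 1)
imginfo = img.shape
height = imginfo[0]
width = imginfo[1]
cv2.imshow('scr', img)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
sobel_img = cv2.Sobel(gray, cv2.CV_16S, 1, 0, ksize=3)
# Sobel edge detection works on a single-channel (grayscale) image. The 1 means one
# derivative in the x direction, the 0 means no derivative in the y direction, and
# ksize=3 selects a 3x3 Sobel kernel. cv2.Sobel returns the signed derivative itself
# and applies no threshold.
# The derivative can be negative or larger than 255. The source image is uint8
# (8-bit unsigned), which is too narrow and would truncate the result, so a 16-bit
# signed type, cv2.CV_16S, is used for the output.
# Differentiating in x responds to vertical edges.
sobel_img = cv2.convertScaleAbs(sobel_img)
# convertScaleAbs() takes the absolute value and converts back to uint8.
# Without this step the image cannot be shown properly and the window is just gray.
cv2.imshow('dst', sobel_img)
cv2.waitKey(0)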
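Why the derivative needs a signed intermediate type can be seen on a single row of pixels. The following is a minimal pure-Python sketch; `x_derivative` uses a simple central difference rather than the full Sobel kernel, and `to_uint8_abs` is a hypothetical stand-in for the absolute-value-and-clamp step, not the OpenCV function itself.

```python
def x_derivative(row):
    """Central difference row[j+1] - row[j-1] for the interior pixels of a row."""
    return [row[j + 1] - row[j - 1] for j in range(1, len(row) - 1)]

def to_uint8_abs(value):
    """Absolute value clamped to the uint8 range 0-255."""
    return min(255, abs(value))

row = [200, 200, 10, 10, 250]          # bright, then dark, then bright again
signed = x_derivative(row)
displayable = [to_uint8_abs(v) for v in signed]

print(signed)       # [-190, -190, 240]
print(displayable)  # [190, 190, 240]
```

The bright-to-dark transition yields negative values that an unsigned 8-bit type cannot represent, while taking the absolute value keeps both kinds of edge visible. With the real Sobel weights the magnitudes can also exceed 255, which is where the clamp matters.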
import cv2
import numpy as np

img = cv2.imread('image0.jpg', 1)
imginfo = img.shape
height = imginfo[0]
width = imginfo[1]
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
dst = np.zeros((height, width, 1), np.uint8)
for i in range(height):
    for j in range(width - 1):
        grayp0 = int(gray[i, j])
        grayp1 = int(gray[i, j + 1])
        # Core formula: difference to the right-hand neighbor plus a gray offset of 150
        newp = grayp0 - grayp1 + 150
        if newp > 255:
            newp = 255
        elif newp < 0:
            newp = 0
        dst[i, j] = newp
cv2.imshow('dst', dst)
cv2.waitKey(0)
import cv2
import numpy as np

img = cv2.imread('image0.jpg', 1)
imginfo = img.shape
height = imginfo[0]
width = imginfo[1]
dst = np.zeros((height, width, 3), np.uint8)
for i in range(height):
    for j in range(width):
        (b, g, r) = img[i, j]
        b = b * 1.5
        g = g * 1.3
        if b > 255:
            b = 255
        if g > 255:
            g = 255
        dst[i, j] = (b, g, r)
cv2.imshow('dst', dst)
cv2.waitKey(0)
import cv2
import numpy as np

img = cv2.imread('image00.jpg', 1)
imginfo = img.shape
height = imginfo[0]
width = imginfo[1]
cv2.imshow('scr', img)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
dst = np.zeros((height, width, 3), np.uint8)
for i in range(4, height - 4):
    for j in range(4, width - 4):
        # Count how many pixels of the 8x8 window fall into each of 8 gray bands
        array1 = np.zeros(8, np.uint8)
        for m in range(-4, 4):
            for n in range(-4, 4):
                p1 = int(gray[i + m, j + n] / 32)
                array1[p1] = array1[p1] + 1
        # Find the band that occurs most often
        currentmax = array1[0]
        l = 0
        for k in range(0, 8):
            if currentmax < array1[k]:
                currentmax = array1[k]
                l = k
        # Take the color of a window pixel that belongs to that band
        for m in range(-4, 4):
            for n in range(-4, 4):
                if gray[i + m, j + n] >= (l * 32) and gray[i + m, j + n] <= (l + 1) * 32:
                    (b, g, r) = img[i + m, j + n]
        dst[i, j] = (b, g, r)
cv2.imshow('dst', dst)
cv2.waitKey(0)
import cv2
import numpy as np

newimageinfo = (500, 500, 3)
dst = np.zeros(newimageinfo, np.uint8)
# Draw lines. Arg 1: target image; arg 2: start point; arg 3: end point;
# arg 4: color; arg 5: thickness; arg 6: line type (cv2.LINE_AA gives smoother edges)
cv2.line(dst, (100, 100), (400, 400), (0, 0, 255))
cv2.line(dst, (100, 200), (400, 200), (0, 255, 255), 20)
cv2.line(dst, (100, 300), (400, 300), (0, 255, 0), 20, cv2.LINE_AA)
# Draw a triangle out of three lines
cv2.line(dst, (200, 150), (50, 250), (25, 100, 255), 10)
cv2.line(dst, (50, 250), (400, 380), (25, 100, 255), 10)
cv2.line(dst, (400, 380), (200, 150), (25, 100, 255), 10)
cv2.imshow('dst', dst)
cv2.waitKey(0)
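import cv2
import numpy as np

newimageinfo = (500, 500, 3)
dst = np.zeros(newimageinfo, np.uint8)
# Draw a rectangle.
# Arg 1: image; arg 2: top-left corner; arg 3: bottom-right corner; arg 4: color;
# arg 5: -1 fills the shape, a positive value leaves it unfilled and sets the line thickness
cv2.rectangle(dst, (50, 100), (200, 300), (255, 0, 0), -1)
# Draw a circle. Arg 1: image; arg 2: center; arg 3: radius; arg 4: color; arg 5: fill or thickness
cv2.circle(dst, (250, 250), (50), (0, 255, 0), 2)
# Draw an ellipse. Arg 2: center; arg 3: the two axis lengths; arg 4: rotation angle;
# arg 5: start angle; arg 6: end angle; arg 7: color; arg 8: fill or thickness
cv2.ellipse(dst, (256, 256), (150, 100), 0, 0, 180, (255, 255, 0), -1)
# Draw an arbitrary polygon; cv2.polylines expects int32 points
points = np.array([[150, 50], [140, 140], [200, 170], [250, 250], [150, 50]], np.int32)
print(points.shape)  # 5x2 array
points = points.reshape((-1, 1, 2))  # reshape, i.e. change the number of dimensions
print(points.shape)  # 5x1x2 array
cv2.polylines(dst, [points], True, (0, 255, 255))
cv2.imshow('dst', dst)
cv2.waitKey(0)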
import cv2
import numpy as np

img = cv2.imread('image0.jpg', 1)
font = cv2.FONT_HERSHEY_SIMPLEX  # choose the font
cv2.rectangle(img, (200, 100), (500, 400), (0, 255, 0), 3)
# Draw text. Arg 1: image; arg 2: text; arg 3: position (bottom-left corner of the text);
# arg 4: font; arg 5: font scale; arg 6: color; arg 7: thickness; arg 8: line type
cv2.putText(img, 'this is a flower', (100, 300), font, 1, (200, 100, 255), 2, cv2.LINE_AA)
# Paste a thumbnail into the image
height = int(img.shape[0] * 0.2)
width = int(img.shape[1] * 0.2)
imgresize = cv2.resize(img, (width, height))
for i in range(height):
    for j in range(width):
        # Assumes the image is large enough to hold the thumbnail at this offset
        img[i + 420, j + 500] = imgresize[i, j]
cv2.imshow('src', img)
cv2.waitKey(0)
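This section covers histograms, histogram equalization, brightness enhancement, skin smoothing and whitening, image filtering, and Gaussian filtering.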
import cv2
import numpy as np

def imagehist(image, type):
    color = (255, 255, 255)
    windowname = 'gray'
    if type == 31:
        color = (255, 0, 0)
        windowname = 'b hist'
    elif type == 32:
        color = (0, 255, 0)
        windowname = 'g hist'
    elif type == 33:
        color = (0, 0, 255)
        windowname = 'r hist'
    hist = cv2.calcHist([image], [0], None, [256], [0, 256])
    # Arg 1: list of images; arg 2: channel index; arg 3: mask; arg 4: number of bins (256);
    # arg 5: value range with an exclusive upper bound, so [0, 256] counts every value 0-255
    minv, maxv, minl, maxl = cv2.minMaxLoc(hist)
    # Returns the minimum and maximum of the histogram and their locations
    histimg = np.zeros([256, 256, 3], np.uint8)
    for h in range(256):
        intennormal = int(hist[h][0] * 256 / maxv)
        cv2.line(histimg, (h, 256), (h, 256 - intennormal), color)
    cv2.imshow(windowname, histimg)
    return histimg

img = cv2.imread('image0.jpg', 1)
channels = cv2.split(img)  # split into separate B, G, R channels (OpenCV order)
for i in range(3):
    imagehist(channels[i], 31 + i)
cv2.waitKey(0)
Grayscale histogram equalization:
import cv2
import numpy as np

# Grayscale histogram equalization
img = cv2.imread('image0.jpg', 1)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow('scr', gray)
dst = cv2.equalizeHist(gray)
cv2.imshow('dst', dst)
cv2.waitKey(0)
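The idea behind histogram equalization is to remap each gray level through the cumulative distribution of the image. The sketch below is a textbook version in plain Python, written for illustration; `equalize` is a hypothetical helper and is not claimed to match the output of `cv2.equalizeHist` value for value.

```python
def equalize(gray_rows, levels=256):
    """Remap gray levels through the cumulative distribution function (CDF)."""
    flat = [p for row in gray_rows for p in row]
    # Histogram: how often each gray level occurs.
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    # CDF: fraction of pixels at or below each gray level.
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running / len(flat))
    # Lookup table that stretches the CDF over the full output range.
    lut = [int(round(c * (levels - 1))) for c in cdf]
    return [[lut[p] for p in row] for row in gray_rows]

dark = [[10, 10, 20],
        [20, 30, 30]]
bright = equalize(dark)
print(bright)  # [[85, 85, 170], [170, 255, 255]]
```

The input only used levels 10 to 30, and after equalization the same three groups of pixels are spread across the whole 0-255 range while keeping their order.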
Color histogram equalization:
import cv2
import numpy as np

# Histogram equalization of a color image
img = cv2.imread('image0.jpg', 1)
cv2.imshow('scr', img)
(b, g, r) = cv2.split(img)  # split the image into channels
bh = cv2.equalizeHist(b)
gh = cv2.equalizeHist(g)
rh = cv2.equalizeHist(r)
result = cv2.merge((bh, gh, rh))  # merge the channels back into one image
cv2.imshow('dst', result)
cv2.waitKey(0)
Histogram equalization in YUV (YCrCb) space:
import cv2
import numpy as np

# Equalize only the luma channel
img = cv2.imread('image0.jpg', 1)
imgYUV = cv2.cvtColor(img, cv2.COLOR_BGR2YCrCb)
cv2.imshow('scr', img)
channelYUV = list(cv2.split(imgYUV))  # a list, so the luma channel can be replaced
channelYUV[0] = cv2.equalizeHist(channelYUV[0])
channels = cv2.merge(channelYUV)
result = cv2.cvtColor(channels, cv2.COLOR_YCrCb2BGR)
cv2.imshow('dst', result)
cv2.waitKey(0)
import cv2
import numpy as np

# Create a damaged image
img = cv2.imread('image0.jpg', 1).copy()
for i in range(200, 300):
    img[i, 200] = (255, 255, 255)
    img[i, 200 + 1] = (255, 255, 255)
    img[i, 200 - 1] = (255, 255, 255)
for i in range(150, 250):
    img[200, i] = (255, 255, 255)
    img[200 + 1, i] = (255, 255, 255)
    img[200 - 1, i] = (255, 255, 255)
damaged = img
cv2.imshow('damaged', damaged)
cv2.imwrite('damaged.jpg', damaged)
# Build the mask that marks the region to repair
damagedinfo = damaged.shape
height = damagedinfo[0]
width = damagedinfo[1]
paint = np.zeros((height, width, 1), np.uint8)
for i in range(200, 300):
    paint[i, 200] = 255
    paint[i, 200 + 1] = 255
    paint[i, 200 - 1] = 255
for i in range(150, 250):
    paint[200, i] = 255
    paint[200 + 1, i] = 255
    paint[200 - 1, i] = 255
cv2.imshow('paint', paint)
# Repair
imgdst = cv2.inpaint(img, paint, 3, cv2.INPAINT_TELEA)
cv2.imshow('imgdst', imgdst)
cv2.waitKey(0)
import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('image0.jpg', 1)
imginfo = img.shape
height = imginfo[0]
width = imginfo[1]
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
count = np.zeros(256, np.float64)  # np.float was removed from recent NumPy releases
for i in range(height):
    for j in range(width):
        pixel = gray[i, j]
        index = int(pixel)
        count[index] += 1
for i in range(256):  # normalize all 256 bins to probabilities
    count[i] = count[i] / (height * width)
x = np.linspace(0, 255, 256)
y = count
plt.bar(x, y, 0.9, alpha=1, color='b')  # 0.9 is the fraction of each slot the bar fills
plt.show()
import cv2
import numpy as np
import matplotlib.pyplot as plt

img = cv2.imread('image0.jpg', 1)
imginfo = img.shape
height = imginfo[0]
width = imginfo[1]
countb = np.zeros(256, np.float64)
countg = np.zeros(256, np.float64)
countr = np.zeros(256, np.float64)
for i in range(height):
    for j in range(width):
        (b, g, r) = img[i, j]
        indexb = int(b)
        indexg = int(g)
        indexr = int(r)
        countb[indexb] += 1
        countg[indexg] += 1
        countr[indexr] += 1
for i in range(0, 256):
    countb[i] = countb[i] / (height * width)
    countg[i] = countg[i] / (height * width)
    countr[i] = countr[i] / (height * width)
x = np.linspace(0, 255, 256)  # 256 evenly spaced values from 0 to 255, both endpoints included
y1 = countb
plt.figure()
plt.bar(x, y1, 0.9, alpha=1, color='b')
y2 = countg
plt.figure()
plt.bar(x, y2, 0.9, alpha=1, color='g')
y3 = countr
plt.figure()
plt.bar(x, y3, 0.9, alpha=1, color='r')
plt.show()
import cv2
import numpy as np

img = cv2.imread('image0.jpg', 1)
imginfo = img.shape
height = imginfo[0]
width = imginfo[1]
dst = np.zeros_like(img)
for i in range(height):
    for j in range(width):
        (b, g, r) = img[i, j]
        # Convert to int first so the addition cannot wrap around in uint8
        bb = int(b) + 40
        gg = int(g) + 40
        rr = int(r) + 40
        # Alternative: scale each channel proportionally as well
        # bb = b*1.2 + 40
        # gg = g*1.3 + 40
        # rr = r*1.1 + 40
        if bb > 255:
            bb = 255
        if gg > 255:
            gg = 255
        if rr > 255:
            rr = 255
        dst[i, j] = (bb, gg, rr)
cv2.imshow('scr', img)
cv2.imshow('dst', dst)
cv2.waitKey(0)
import cv2
img = cv2.imread('1.png',1)
dst = cv2.bilateralFilter(img,15,35,35)
cv2.imshow('scr',img)
cv2.imshow('dst',dst)
cv2.waitKey(0)
Gaussian filtering:
import cv2
import numpy as np

img = cv2.imread('image11.jpg', 1)
dst = cv2.GaussianBlur(img, (5, 5), 1.5)  # (5, 5) is the kernel size, 1.5 is the sigma
cv2.imshow('scr', img)
cv2.imshow('dst', dst)
cv2.waitKey(0)
Mean filtering:
import cv2
import numpy as np

# Mean filter
img = cv2.imread('image11.jpg', 1)
imginfo = img.shape
height = imginfo[0]
width = imginfo[1]
dst = np.zeros_like(img)
for i in range(3, height - 3):
    for j in range(3, width - 3):
        sumb = int(0)
        sumg = int(0)
        sumr = int(0)
        for m in range(-3, 3):  # offsets -3..2, so the window is 6x6
            for n in range(-3, 3):
                (b, g, r) = img[i + m, j + n]
                sumb += int(b)
                sumg += int(g)
                sumr += int(r)
        b = np.uint8(sumb / 36)  # 36 = 6 * 6 pixels in the window
        g = np.uint8(sumg / 36)
        r = np.uint8(sumr / 36)
        dst[i, j] = (b, g, r)
cv2.imshow('scr', img)
cv2.imshow('dst', dst)
cv2.waitKey(0)
import cv2
import numpy as np

# Median filter
img = cv2.imread('image11.jpg', 1)
imginfo = img.shape
height = imginfo[0]
width = imginfo[1]
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
cv2.imshow('scr', img)
dst = np.zeros_like(img)
collect = np.zeros(9, np.uint8)
for i in range(1, height - 1):
    for j in range(1, width - 1):
        # Gather the 3x3 neighborhood
        k = 0
        for m in range(-1, 2):
            for n in range(-1, 2):
                gray = img[i + m, j + n]
                collect[k] = gray
                k += 1
        # Selection sort of the 9 values in descending order
        for k in range(9):
            p = collect[k]
            for t in range(k + 1, 9):
                if collect[t] > p:
                    mid = collect[t]
                    collect[t] = p
                    p = mid
            collect[k] = p  # store the largest remaining value at position k
        dst[i, j] = collect[4]  # the middle element is the median
cv2.imshow('dst', dst)
cv2.waitKey(0)
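Machine learning: samples + features + classifier
Deep learning: samples (in massive numbers) + artificial neural network
Machine learning needs explicitly defined features to extract. Deep learning, given a massive number of samples, can learn the discriminating features automatically. A person may not know what those learned features are, since only the program itself can interpret them.
Haar features are commonly used for face detection. HOG features are commonly used for pedestrian detection, vehicle detection, object detection, and similar tasks.
Decisions on the features are made by a classifier, such as an AdaBoost classifier or an SVM classifier.
This chapter focuses only on three parts: samples, features, and classification.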
import cv2

cap = cv2.VideoCapture('1.mp4')  # open the video
isopened = cap.isOpened()  # True if the video was opened successfully
print(isopened)
fps = cap.get(cv2.CAP_PROP_FPS)  # frame rate
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))  # frame width
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))  # frame height
print(fps, width, height)
i = 0
while isopened:  # each pass reads one frame, and the next pass reads the following one
    (flag, frame) = cap.read()  # starts at the first frame; each call reads one frame
    # Save the frames read while i is 11 to 20
    if i > 10:
        if flag == True:  # the frame was read successfully
            filename = 'frames/' + str(i) + '.jpg'
            # Write into the target folder (it must already exist) with the given JPEG quality
            cv2.imwrite(filename, frame, [cv2.IMWRITE_JPEG_QUALITY, 100])
            print(filename)
    if i == 20:
        break
    else:
        i += 1
cap.release()
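import cv2

# Get the frame size of the video to be generated
img = cv2.imread('frames/11.jpg', 1)
imginfo = img.shape
size = (imginfo[1], imginfo[0])  # (width, height)
print(size)
videowrite = cv2.VideoWriter('2.mp4', -1, 5, size)
# Create the video writer. Arguments: 1. output path and file name; 2. codec; 3. frame rate; 4. frame size.
# A codec of -1 opens a codec selection dialog on Windows; on other platforms a FourCC
# such as cv2.VideoWriter_fourcc(*'mp4v') is usually needed instead.
for i in range(11, 21):
    filename = 'frames/' + str(i) + '.jpg'
    img = cv2.imread(filename)
    videowrite.write(img)  # write one image as the next frame
videowrite.release()
print('end')
# The for loop writes the images in order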
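Haar features are computed on grayscale images.
A feature is essentially the result of some computation on pixels (the result may be a single value, a vector, a matrix, a multi-dimensional array, and so on). The feature is then compared against a threshold, and that threshold is obtained through machine learning.
Computing Haar features amounts to applying a set of kernel templates to the pixels of the image.
A Haar feature is the sum of the pixels under the white region of the kernel minus the sum of the pixels under the black region. The kernel is slid horizontally and vertically so that it covers every region of the image, and it is also rescaled during the traversal, so processing a single 1080*960 image can take on the order of 100 billion operations. Because of this cost, the integral image was introduced.
Split a region into four blocks 1, 2, 3, 4 and define four sums A, B, C, D: A is the top-left block, B is top-left + top-right, C is top-left + bottom-left, and D is all four blocks together. Three additions and subtractions on A, B, C, D give the sum of any block, which greatly simplifies the computation. For example, block 4 = A - B - C + D.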
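The integral image trick can be sketched in a few lines of plain Python. This is a minimal illustration, assuming a single-channel image stored as nested lists; `integral_image` and `block_sum` are hypothetical helper names, and the two-rectangle feature at the end is just one simple Haar-like layout.

```python
def integral_image(gray):
    """ii[y][x] holds the sum of all pixels above and to the left of (y, x)."""
    h, w = len(gray), len(gray[0])
    # One extra row and column of zeros so border lookups need no special cases.
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += gray[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def block_sum(ii, top, left, bottom, right):
    """Sum of rows top..bottom-1 and columns left..right-1 using four lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

gray = [[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]]
ii = integral_image(gray)

bottom_right = block_sum(ii, 1, 1, 3, 3)   # 5 + 6 + 8 + 9
white = block_sum(ii, 0, 0, 3, 1)          # left column: 1 + 4 + 7
black = block_sum(ii, 0, 2, 3, 3)          # right column: 3 + 6 + 9
haar_value = white - black                 # white sum minus black sum

print(bottom_right, white, black, haar_value)  # 28 12 18 -6
```

`block_sum` is the D - B - C + A pattern from the text: once the integral image is built, the sum of any rectangle costs four lookups regardless of its size.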
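About the classifier:
An AdaBoost classifier is made up of several strong classifiers, each strong classifier is made up of several weak classifiers, and each weak classifier consists of at most 3 nodes.
Training the classifier:
As in deep learning, training starts from an initial weight distribution and iterates toward the best one. Each round of training produces new weights and the accuracy (probability) for that round. Training stops when either the number of rounds or the accuracy reaches the required level.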
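One round of that weight update can be sketched in plain Python. The following assumes the textbook discrete AdaBoost rule with labels of +1 and -1 and a weak classifier whose weighted error lies strictly between 0 and 1; `adaboost_round` is a hypothetical helper, and OpenCV's cascade training differs in its details.

```python
import math

def adaboost_round(weights, labels, predictions):
    """One round of discrete AdaBoost; returns (alpha, new_weights)."""
    # Weighted error: total weight of the samples the weak classifier got wrong.
    error = sum(w for w, y, p in zip(weights, labels, predictions) if y != p)
    # Vote strength of this weak classifier (assumes 0 < error < 1).
    alpha = 0.5 * math.log((1 - error) / error)
    # Misclassified samples (y * p == -1) gain weight, correct ones lose weight.
    updated = [w * math.exp(-alpha * y * p)
               for w, y, p in zip(weights, labels, predictions)]
    total = sum(updated)
    return alpha, [w / total for w in updated]

labels = [1, 1, -1, -1]
predictions = [1, -1, -1, -1]      # the second sample is misclassified
weights = [0.25, 0.25, 0.25, 0.25]  # start from a uniform distribution

alpha, new_weights = adaboost_round(weights, labels, predictions)
print(alpha)
print(new_weights)
```

After the round, the misclassified sample carries half of the total weight, so the next weak classifier is pushed to get it right.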
Face detection:
Use the pre-trained classifiers directly (one for faces and one for eyes), that is, the xml files in the code below.
import cv2
import numpy as np

# Load the xml files; these are pre-trained classifiers that can be downloaded from the OpenCV site
facexml = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
eyexml = cv2.CascadeClassifier('haarcascade_eye.xml')
# Load the image
img = cv2.imread('face.jpg', 1)
cv2.imshow('scr', img)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Detect faces and eyes
faces = facexml.detectMultiScale(gray, 1.3, 5)
# Arg 2 is scaleFactor, the ratio by which the detection scale changes between passes.
# Arg 3 is minNeighbors, the number of neighboring detections a candidate needs in order to be kept.
print('faces = ', len(faces))
# Draw
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)  # top-left, bottom-right, color, thickness
    # Detect eyes inside the face region
    roiface = gray[y:y + h, x:x + w]
    roicolor = img[y:y + h, x:x + w]
    eyes = eyexml.detectMultiScale(roiface)
    print('eyes = ', len(eyes))
    for (x1, y1, w1, h1) in eyes:
        cv2.rectangle(roicolor, (x1, y1), (x1 + w1, y1 + h1), (0, 255, 0), 2)
cv2.imshow('dst', img)
cv2.waitKey(0)
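The essence of SVM: find an optimal hyperplane that separates the data into classes. With a linear kernel the boundary is a straight line (or a plane); with other kernels it can be a curve in the original feature space. In the end it is still just a classifier.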
import cv2
import numpy as np

# Training data and labels (supervised learning)
rand1 = np.array([[155, 48], [159, 50], [164, 53], [168, 56], [172, 60]])
rand2 = np.array([[152, 53], [156, 55], [160, 56], [172, 64], [176, 65]])
data = np.vstack((rand1, rand2))  # stack the two arrays
data = np.array(data, dtype='float32')
# Class labels are given as int32, which cv2.ml expects for classification
label = np.array([[0], [0], [0], [0], [0], [1], [1], [1], [1], [1]], dtype=np.int32)
# Create the SVM and set its properties
svm = cv2.ml.SVM_create()  # create an SVM from the ml (machine learning) module
svm.setType(cv2.ml.SVM_C_SVC)  # SVM type
svm.setKernel(cv2.ml.SVM_LINEAR)  # linear kernel
svm.setC(0.01)
# Train
result = svm.train(data, cv2.ml.ROW_SAMPLE, label)
# Predict
ptdata = np.vstack([[167, 55], [162, 57]])  # labels: 0 = girl, 1 = boy
ptdata = np.array(ptdata, dtype='float32')
print(ptdata)
(par1, par2) = svm.predict(ptdata)
print(par2)