Thanks to 恩培 (Enpei) for fully implementing this project and open-sourcing the code so that everyone can study and discuss it.
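I. Project Overview
The goal of this project is to detect brush-written (calligraphy) characters and classify their script style.
The project is written in Python. It calls libraries such as OpenCV and uses an SVM to classify the script style. It consists of the following steps:
1. Read the camera video stream with OpenCV;
2. Extract the brush-written characters with traditional CV operations;
3. Classify the script style with an SVM.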
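cv2.cvtColor: color space conversion function. Here it converts the color image to grayscale, which simplifies the color space and makes the features of the brush strokes easier to extract.
cv2.threshold: binarizes the image against a fixed threshold, separating the brush strokes from the background.
cv2.dilate, cv2.erode: morphological dilation and erosion. Dilation thickens the white regions of the image and erosion thins them; together they are used to remove small specks and holes. See: https://blog.csdn.net/qq_39507748/article/details/104539073
SVM: a traditional machine learning classifier, used here to classify the calligraphy script style.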
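import time

import cv2
import numpy as np

from utils import Utils  # helper class from the project; only used by the commented-out text overlay

utils = Utils()

# Open the input video.
cap = cv2.VideoCapture('./videos/raw.mp4')
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))

frame_index = 0
fpsTime = time.time()

# Record the annotated frames at 15 fps; the size must match the frames passed to write().
videoWriter = cv2.VideoWriter('./record_video/out' + str(time.time()) + '.mp4',
                              cv2.VideoWriter_fourcc(*'H264'), 15, (1080, 1920))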
while True:
    ret, frame = cap.read()
    if not ret:
        # End of the video or a read failure: stop instead of passing None to cv2.resize.
        break
    frame = cv2.resize(frame, (1080, 1920))
    img_copy = frame.copy()
    # Mask out the hand with white; a hand detector could replace these fixed coordinates.
    frame[1550:1920, 700:1080] = 255
    if 154 > frame_index > 50:
        # cv2.rectangle(frame,(0,1300),(300,1920),(255,0,255),10)
        # cv2.rectangle(frame,(300,1400),(1080,1920),(255,0,255),10)
        frame[1300:1920, 0:300] = 255
        frame[1400:1920, 300:1080] = 255
    if 46 > frame_index > 38:
        cv2.rectangle(frame, (300, 400), (600, 1920), (255, 0, 255), 10)
        frame[400:1920, 300:600] = 255
        # frame[1400:1920,300:1080] = 255
    # frame = cv2.rotate(frame,cv2.ROTATE_90_CLOCKWISE)
    # Grayscale
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Inverted binarization: dark ink becomes white (255), light paper becomes black (0)
    retval, black_img = cv2.threshold(gray, 100, 255, cv2.THRESH_BINARY_INV)
    # Erode to remove small specks
    kernel = np.ones((3, 3), dtype=np.uint8)
    erosion = cv2.erode(black_img, kernel, iterations=2)
    # Then dilate to join the strokes of each character into one blob
    kernel = np.ones((10, 10), dtype=np.uint8)
    dilation = cv2.dilate(erosion, kernel, iterations=2)
    # Morphological closing on top of the dilation to fill remaining gaps
    kernel = np.ones((10, 10), dtype=np.uint8)
    closing = cv2.morphologyEx(dilation, cv2.MORPH_CLOSE, kernel)
    # Detect edges, then extract the outer contours
    edged = cv2.Canny(closing.copy(), 30, 200)
    contours, hierarchy = cv2.findContours(edged, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
    for c in contours:
        x, y, w, h = cv2.boundingRect(c)
        # Filter: keep only boxes whose width and height are both between 100 and 300 px
        if (300 > w > 100) and (300 > h > 100):
            cv2.rectangle(img_copy, (x, y), (x + w, y + h), (0, 255, 0), 10)
    frame_index += 1
    # time.sleep(0.1)
    cTime = time.time()
    fps_text = 1 / max(cTime - fpsTime, 1e-6)  # guard against a zero time difference
    fpsTime = cTime
    # img_copy = utils.cv2AddChineseText(img_copy, "FPS: " + str(int(fps_text)) , (20, 100), textColor=(255, 0, 255), textSize=100)
    videoWriter.write(img_copy)
    # Show a preview at one third of the original size
    img_copy = cv2.resize(img_copy, (int(img_copy.shape[1] / 3), int(img_copy.shape[0] / 3)))
    cv2.imshow('demo', img_copy)
    if cv2.waitKey(10) & 0xFF == ord('q'):
        break
cap.release()
videoWriter.release()  # finalize the recorded output file
cv2.destroyAllWindows()