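I. Introduction
This project performs head pose estimation on 2D images, based on face_landmark_localization, the facial keypoint detection model released on PaddleHub (the model was converted from https://github.com/lsy17096535/face-landmark). It produces three parameters, Yaw (turning the head left or right), Pitch (nodding), and Roll (tilting the head sideways), so that a machine can interpret the pose of the person in a picture. (Convention: Yaw is positive to the left and negative to the right; Pitch is negative upward and positive downward; Roll is negative to the left and positive to the right.)
II. Basic idea
Use the face_landmark_localization model to obtain the facial keypoints in the image, match them to the corresponding points on a 3D face model, and, from the transformation between the 2D and 3D coordinates, solve for the Euler angles that give the three parameters.
Formulas involved: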
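For reference, the pinhole camera relation that this kind of 2D-3D solve relies on can be written as follows. The symbols are the conventional ones and are an assumption here: (X, Y, Z) is a point on the 3D face model, (u, v) is its pixel position, f is the focal length, (c_x, c_y) is the image center, R and t are the rotation and translation being solved for, and s is a scale factor.

```latex
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
= \begin{bmatrix} f & 0 & c_x \\ 0 & f & c_y \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} R & t \end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}
```

Once R is known, the three Euler angles can be read off from its entries.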
III. Experimental procedure
1. Import modules and packages
import cv2
import numpy as np
import paddlehub as hub
import math
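2. Load the facial keypoint detection model, and define the 3D head keypoint coordinates and the head projection points used to draw the projection box on the image
self.module = hub.Module(name="face_landmark_localization")
# 3D head keypoint coordinates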
self.model_points = np.array([
[6.825897, 6.760612, 4.402142],
[1.330353, 7.122144, 6.903745],
[-1.330353, 7.122144, 6.903745],
[-6.825897, 6.760612, 4.402142],
[5.311432, 5.485328, 3.987654],
[1.789930, 5.393625, 4.413414],
[-1.789930, 5.393625, 4.413414],
[-5.311432, 5.485328, 3.987654],
[2.005628, 1.409845, 6.165652],
[-2.005628, 1.409845, 6.165652],
[2.774015, -2.080775, 5.048531],
[-2.774015, -2.080775, 5.048531],
[0.000000, -3.116408, 6.097667],
[0.000000, -7.415691, 4.070434],
[-7.308957, 0.913869, 0.000000],
[7.308957, 0.913869, 0.000000],
[0.746313, 0.348381, 6.263227],
[0.000000, 0.000000, 6.763430],
[-0.746313, 0.348381, 6.263227]
], dtype='float')
# Head projection points
self.reprojectsrc = np.float32([
[10.0, 10.0, 10.0],
[10.0, -10.0, 10.0],
[-10.0, 10.0, 10.0],
[-10.0, -10.0, 10.0]])
# Pairs of projection points to connect with lines
self.line_pairs = [
[0, 2], [1, 3], [0, 1], [2, 3]]
3. Extract the point coordinates needed for pose estimation from the face_landmark_localization detection results
image_points = np.array([
face_landmark[17], face_landmark[21],
face_landmark[22], face_landmark[26],
face_landmark[36], face_landmark[39],
face_landmark[42], face_landmark[45],
face_landmark[31], face_landmark[35],
face_landmark[48], face_landmark[54],
face_landmark[57], face_landmark[8],
face_landmark[14], face_landmark[2],
face_landmark[32], face_landmark[33],
face_landmark[34],
], dtype='float')
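The selection above can also be written as a small helper, which makes the correspondence with the 19 rows of the 3D model easier to check. This is only a sketch: the names POSE_INDICES and select_pose_points are hypothetical, and it assumes the detector returns the landmarks of one face as a sequence of at least 68 (x, y) pairs.

```python
# Indices into the landmark list, in the same order as the 19 rows of the
# 3D model (assumption: the common 68-point landmark layout).
POSE_INDICES = [17, 21, 22, 26, 36, 39, 42, 45, 31, 35,
                48, 54, 57, 8, 14, 2, 32, 33, 34]


def select_pose_points(face_landmark):
    """Return the 19 (x, y) pairs used for pose estimation."""
    if len(face_landmark) < 68:
        raise ValueError("expected at least 68 landmarks, got %d" % len(face_landmark))
    return [tuple(face_landmark[i]) for i in POSE_INDICES]


# Synthetic landmarks for illustration: point i sits at (i, 2 * i).
fake_landmarks = [(float(i), float(2 * i)) for i in range(68)]
points = select_pose_points(fake_landmarks)
print(len(points), points[0], points[-1])
```

The resulting list can be wrapped in np.array(..., dtype='float') before it is handed to the solver.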
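4. Obtain the rotation vector and the translation vector
# Set the camera focal length and the image center, assume there is no radial distortion, and build the camera intrinsic matrix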
center = (self.img_size[1] / 2, self.img_size[0] / 2)
focal_length = self.img_size[1]
camera_matrix = np.array([
[focal_length, 0, center[0]],
[0, focal_length, center[1]],
[0, 0, 1]],
dtype="float")
dist_coeffs = np.zeros((4, 1))
ret, rotation_vector, translation_vector = cv2.solvePnP(self.model_points,image_points,camera_matrix,dist_coeffs)
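The intrinsic matrix above is an approximation rather than a calibrated one: the focal length is taken to be the image width and the principal point is placed at the image center. A minimal sketch of that construction, using plain lists and a hypothetical helper name approx_camera_matrix, might look like this.

```python
def approx_camera_matrix(img_shape):
    """Build a 3x3 intrinsic matrix from an image shape (height, width, ...)."""
    height, width = img_shape[0], img_shape[1]
    focal_length = float(width)          # rough guess: focal length ~ image width
    cx, cy = width / 2.0, height / 2.0   # principal point at the image center
    return [
        [focal_length, 0.0, cx],
        [0.0, focal_length, cy],
        [0.0, 0.0, 1.0],
    ]


# Example with a 480 x 640 colour image shape, as returned by img.shape.
camera_matrix = approx_camera_matrix((480, 640, 3))
for row in camera_matrix:
    print(row)
```

To use it with cv2.solvePnP, the nested list would first be converted with np.array(camera_matrix, dtype="float"). If real calibration data is available for the camera, it should give more accurate angles than this guess.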
5. Compute the Euler angles
rvec_matrix = cv2.Rodrigues(rotation_vector)[0]
proj_matrix = np.hstack((rvec_matrix, translation_vector))
euler_angles = cv2.decomposeProjectionMatrix(proj_matrix)[6]
# decomposeProjectionMatrix returns the Euler angles in degrees; this converts them to radians
pitch, yaw, roll = [math.radians(_) for _ in euler_angles]
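To make the link between a rotation matrix and the three angles more concrete, here is a small round-trip sketch in plain Python. It assumes the composition R = Rz(roll) · Ry(yaw) · Rx(pitch); that convention is chosen for illustration only, and the order and signs returned by cv2.decomposeProjectionMatrix may differ from it. The function names are hypothetical.

```python
import math


def rotation_from_euler(pitch, yaw, roll):
    """Compose R = Rz(roll) * Ry(yaw) * Rx(pitch); angles in radians."""
    ca, sa = math.cos(pitch), math.sin(pitch)
    cb, sb = math.cos(yaw), math.sin(yaw)
    cg, sg = math.cos(roll), math.sin(roll)
    return [
        [cb * cg, sa * sb * cg - ca * sg, ca * sb * cg + sa * sg],
        [cb * sg, sa * sb * sg + ca * cg, ca * sb * sg - sa * cg],
        [-sb, sa * cb, ca * cb],
    ]


def euler_from_rotation(R):
    """Recover (pitch, yaw, roll) in radians; valid while |yaw| < 90 degrees."""
    # Clamp to [-1, 1] so rounding error cannot push asin out of its domain.
    yaw = math.asin(max(-1.0, min(1.0, -R[2][0])))
    pitch = math.atan2(R[2][1], R[2][2])
    roll = math.atan2(R[1][0], R[0][0])
    return pitch, yaw, roll


angles_deg = (10.0, -20.0, 30.0)
R = rotation_from_euler(*[math.radians(a) for a in angles_deg])
recovered_deg = [math.degrees(a) for a in euler_from_rotation(R)]
print([round(a, 6) for a in recovered_deg])
```

The round trip only holds away from the yaw = ±90 degree singularity, where pitch and roll can no longer be separated.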
6. Display the parameters and the projection box on the image
frames_euler = []
img = photo
self.img_size = img.shape
# Draw the projection box
alpha = 0.3
if not hasattr(self, 'before'):
    self.before = reprojectdst
else:
    reprojectdst = alpha * self.before + (1 - alpha) * reprojectdst
reprojectdst = tuple(map(tuple, reprojectdst.reshape(4, 2)))
for start, end in self.line_pairs:
    cv2.line(img, reprojectdst[start], reprojectdst[end], (0, 0, 255))
# Display the parameters
cv2.putText(img, "pitch: " + "{:7.2f}".format(pitch), (20, int(self.img_size[0] / 2 - 10)),cv2.FONT_HERSHEY_SIMPLEX,0.75, (0, 0, 255), thickness=2)
cv2.putText(img, "yaw: " + "{:7.2f}".format(yaw), (20, int(self.img_size[0] / 2 + 30)),cv2.FONT_HERSHEY_SIMPLEX,0.75, (0, 0, 255), thickness=2)
cv2.putText(img, "roll: " + "{:7.2f}".format(roll), (20, int(self.img_size[0] / 2 + 70)),cv2.FONT_HERSHEY_SIMPLEX,0.75, (0, 0, 255), thickness=2)
frames_euler.append([img, pitch, yaw, roll])
cv2.imshow('headpost', img)
cv2.waitKey(0)
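The drawing code uses reprojectdst, the pixel positions of the four box corners, without showing how it is obtained. In OpenCV this is normally the job of cv2.projectPoints, which takes the 3D points, the rotation and translation vectors, the camera matrix, and the distortion coefficients. The sketch below illustrates the same idea in plain Python under simplifying assumptions (identity rotation, no distortion, made-up translations), together with the alpha blending used for smoothing. The helper names project_point and smooth are hypothetical.

```python
def project_point(point, camera_matrix, translation):
    """Pinhole projection of one 3D point, assuming identity rotation."""
    x = point[0] + translation[0]
    y = point[1] + translation[1]
    z = point[2] + translation[2]
    fx, cx = camera_matrix[0][0], camera_matrix[0][2]
    fy, cy = camera_matrix[1][1], camera_matrix[1][2]
    return (fx * x / z + cx, fy * y / z + cy)


def smooth(previous, current, alpha=0.3):
    """Blend old and new corners: alpha * previous + (1 - alpha) * current."""
    return [
        (alpha * px + (1 - alpha) * qx, alpha * py + (1 - alpha) * qy)
        for (px, py), (qx, qy) in zip(previous, current)
    ]


camera_matrix = [[640.0, 0.0, 320.0], [0.0, 640.0, 240.0], [0.0, 0.0, 1.0]]
box = [(10.0, 10.0, 10.0), (10.0, -10.0, 10.0),
       (-10.0, 10.0, 10.0), (-10.0, -10.0, 10.0)]

# Two made-up frames: the head is translated 90, then 70 units along the optical axis.
frame_a = [project_point(p, camera_matrix, (0.0, 0.0, 90.0)) for p in box]
frame_b = [project_point(p, camera_matrix, (0.0, 0.0, 70.0)) for p in box]
blended = smooth(frame_a, frame_b)

# Round to integers before drawing: recent OpenCV releases tend to reject
# float coordinates in cv2.line.
corners = [(int(round(x)), int(round(y))) for x, y in blended]
print(corners)
```

With alpha = 0.3 the new frame dominates, so the box follows the head quickly while still damping small jitter between frames.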
7. Load the image
photo=cv2.imread('hbi.jpg')
IV. Problems encountered
1. import paddlehub as hub failed with the following traceback
File "D:/python代码/pycharm代码/头部姿态估计.py", line 3, in <module>
import paddlehub as hub
File "C:\Users\86183\AppData\Roaming\Python\Python37\site-packages\paddlehub\__init__.py", line 12, in <module>
from . import module
File "C:\Users\86183\AppData\Roaming\Python\Python37\site-packages\paddlehub\module\__init__.py", line 16, in <module>
from . import module
File "C:\Users\86183\AppData\Roaming\Python\Python37\site-packages\paddlehub\module\module.py", line 31, in <module>
from paddlehub.common import utils
File "C:\Users\86183\AppData\Roaming\Python\Python37\site-packages\paddlehub\common\__init__.py", line 16, in <module>
from . import utils
File "C:\Users\86183\AppData\Roaming\Python\Python37\site-packages\paddlehub\common\utils.py", line 33, in <module>
from paddlehub.common.logger import logger
File "C:\Users\86183\AppData\Roaming\Python\Python37\site-packages\paddlehub\common\logger.py", line 155, in <module>
logger = Logger()
File "C:\Users\86183\AppData\Roaming\Python\Python37\site-packages\paddlehub\common\logger.py", line 67, in __init__
level = json.load(fp).get("log_level", "DEBUG")
File "D:\Anaconda\lib\json\__init__.py", line 296, in load
parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
File "D:\Anaconda\lib\json\__init__.py", line 348, in loads
return _default_decoder.decode(s)
File "D:\Anaconda\lib\json\decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "D:\Anaconda\lib\json\decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
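Uninstalling and reinstalling PaddleHub did not help. I eventually found a fix in the GitHub community:
copying the following content into .paddlehub/conf/config.json resolved the error.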
{
"server_url": [
"http://paddlepaddle.org.cn/paddlehub"
],
"resource_storage_server_url": "https://bj.bcebos.com/paddlehub-data/",
"debug": false,
"log_level": "DEBUG"
}
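The traceback ends in json.load failing at line 1, column 1, which is what happens when the config file is empty or otherwise not valid JSON. A small script along the following lines could check the file and rewrite it only when it does not parse. This is a sketch: repair_config is a hypothetical helper and not part of PaddleHub, and the default values are simply the ones listed above.

```python
import json
import os

# Defaults copied from the config shown above.
DEFAULT_CONFIG = {
    "server_url": ["http://paddlepaddle.org.cn/paddlehub"],
    "resource_storage_server_url": "https://bj.bcebos.com/paddlehub-data/",
    "debug": False,
    "log_level": "DEBUG",
}


def repair_config(path, defaults=DEFAULT_CONFIG):
    """Rewrite `path` with `defaults` if it is missing or not valid JSON.

    Returns True when the file was rewritten, False when it was left alone.
    """
    try:
        with open(path, "r", encoding="utf-8") as fp:
            json.load(fp)
        return False
    except (OSError, ValueError):
        # A missing file, an empty file, or malformed JSON all land here.
        pass
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "w", encoding="utf-8") as fp:
        json.dump(defaults, fp, indent=4)
    return True


# Example call (assumes the config lives under the user's home directory):
# repair_config(os.path.join(os.path.expanduser("~"), ".paddlehub", "conf", "config.json"))
```

Because a valid file is left untouched, running the check repeatedly should be harmless.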
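2. When displaying the final result, the window kept coming up blank. I eventually realized that cv2.imshow() returns immediately, so the image only flashed by; adding cv2.waitKey(0) right after it solved the problem. (The argument 0 means wait indefinitely for a key press.)
3. At first I used the commonly used set of 14 facial keypoints and found the results unsatisfactory, so I switched to the 19 facial keypoints used in the AI Studio head pose (nodding and head-shaking) estimation project, which gave better results. The comparison is shown below: 14 facial keypoints (top), 19 facial keypoints (bottom). (Yaw is positive to the left and negative to the right; Pitch is negative upward and positive downward; Roll is negative to the left and positive to the right.)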
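V. Final result
VI. Summary
This PaddleHub-based head pose estimation project was quite difficult for me. After learning how to convert between the various coordinate systems, how to obtain the rotation and translation matrices, and how to convert them into Euler angles, I came to appreciate much more deeply how important mathematics is to studying artificial intelligence. I am also not yet fluent with commonly used packages and modules, so I plan to study Python in more depth and to try other projects with PaddleHub to keep improving.
References: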
https://blog.csdn.net/cdknight_happy/article/details/79975060
https://zhuanlan.zhihu.com/p/82064640
https://www.sohu.com/a/278664242_100007727
https://aistudio.baidu.com/aistudio/projectdetail/673271