2D Image Pose Estimation with Part Affinity Fields (PAF) (OpenPose)

 

This work appeared at CVPR 2017 — Realtime Multi-Person 2D Pose Estimation using Part Affinity Fields, from CMU — and the results are genuinely impressive.

The highlights of the paper are arguably its cascaded network structure that fuses part confidence maps (PCMs) and part affinity fields (PAFs) — a design philosophy similar to RefineNet's — together with a constrained bipartite matching algorithm for associating body parts.

[Figure 1: overall OpenPose detection pipeline]

 

The whole detection process is shown above: an input image passes through a sequence of stages (six in the full model) that produce the PCMs and PAFs. A set of candidate bipartite matchings is then generated from the PAFs; because each PAF encodes a direction as well as a location, the resulting matches are highly reliable, and the matched limbs are finally merged into a complete skeleton for each person.
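To make the PAF scoring concrete, here is a minimal sketch of how a candidate limb (a pair of detected joints) can be scored by integrating the PAF along the segment between them; these scores are what feed the bipartite matching. The grid size, sampling count, and function names below are illustrative, not taken from the OpenPose code:

```python
import numpy as np

def paf_score(paf_x, paf_y, p1, p2, n_samples=10):
    """Integrate the PAF along the segment p1 -> p2.

    paf_x, paf_y: 2D arrays with the x/y components of one limb's PAF.
    p1, p2: (x, y) joint candidates. Returns the mean dot product between
    the PAF vectors sampled on the segment and the segment's unit vector.
    """
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    d = p2 - p1
    norm = np.linalg.norm(d)
    if norm < 1e-8:
        return 0.0
    u = d / norm  # unit vector of the candidate limb
    total = 0.0
    for t in np.linspace(0.0, 1.0, n_samples):
        x, y = (p1 + t * d).round().astype(int)
        total += paf_x[y, x] * u[0] + paf_y[y, x] * u[1]
    return total / n_samples

# Toy example: a horizontal limb from (2, 5) to (12, 5) painted into the PAF.
paf_x = np.zeros((20, 20))
paf_y = np.zeros((20, 20))
paf_x[4:7, 2:13] = 1.0  # unit vectors pointing in +x along the limb

good = paf_score(paf_x, paf_y, (2, 5), (12, 5))    # aligned with the field
bad = paf_score(paf_x, paf_y, (2, 15), (12, 15))   # off the limb entirely
print(good, bad)  # → 1.0 0.0
```

A correctly oriented pair of joints scores near 1, while a pair that cuts across empty space scores near 0 — this directional evidence is why the PAF-based matching is so robust.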

 

Under models/pose/mpi/ there are two models: pose_deploy_linevec.prototxt is the more accurate one; pose_deploy_linevec_faster_4_stages.prototxt loses roughly 2 points of accuracy but runs about 30% faster.

The difference between the two is that the latter drops the stage5 and stage6 convolution modules (as shown below, each stage consists of a series of convolutional layers arranged in two branches — one predicting PCMs, one predicting PAFs — and before entering the next stage, the outputs of both branches are fused with the shared image features).

models/pose/coco/pose_deploy_linevec.prototxt has the same network structure as models/pose/mpi/pose_deploy_linevec.prototxt; the difference is that the COCO model's convolutional layers use somewhat more filters. It is the most accurate of the three models.

The model is selected in examples/openpose/openpose.cpp; the COCO model is used by default, and since model_pose is a gflags flag it can also be overridden on the command line (e.g. --model_pose MPI):

DEFINE_string(model_pose, "COCO", "Model to be used (e.g. COCO, MPI, MPI_4_layers).");

[Figure 2: per-stage network structure]

 

Installation steps (on CentOS):

git clone https://github.com/CMU-Perceptual-Computing-Lab/openpose.git
cd openpose/3rdparty/caffe
cp Makefile.config.Ubuntu14.example Makefile.config  # edit the paths in it to match your machine
make all -j8

cd ../../models/
./getModels.sh
cd ..
cp Makefile.config.Ubuntu14.example Makefile.config  # edit the paths in it to match your machine
make -j8

 

Possible error:

/tmp/cciLUahT.s:1660: Error: no such instruction: `vextracti128 $0x1,%ymm0,%xmm0'

make: *** [.build_release/src/openpose/gui/guiInfoAdder.o] Error 1

make: *** Waiting for unfinished jobs....

/tmp/ccpTDNgT.s:3892: Error: no such instruction: `vextracti128 $0x1,%ymm0,%xmm0'

make: *** [.build_release/src/openpose/gui/gui.o] Error 1

Fix: comment out the assembly optimization in the Makefile (line 204), CXXFLAGS += -march=native; -march=native emits AVX2 instructions (such as vextracti128) that an older assembler does not recognize.

Possible error:

ERROR: something wrong with flag 'tab_completion_word' in file

Fix:

Remove gflags from line 144 of the Makefile, which reads

LIBRARIES += glog gflags boost_system boost_filesystem m hdf5_hl hdf5 caffe

 

 

Testing:

Run on a video:

./build/examples/openpose/openpose.bin --video examples/media/video.avi

Run on the webcam:

./build/examples/openpose/openpose.bin

Run on images:

./build/examples/openpose/openpose.bin --image_dir examples/media/
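These invocations can also be scripted. Below is a small hypothetical Python wrapper around the same binary; the helper names are made up for illustration, and only the flags shown above are assumed to exist:

```python
import subprocess

# Path to the demo binary built earlier in this post.
OPENPOSE_BIN = "./build/examples/openpose/openpose.bin"

def build_cmd(**flags):
    """Turn keyword arguments into an openpose.bin command line."""
    cmd = [OPENPOSE_BIN]
    for name, value in flags.items():
        cmd += ["--" + name, str(value)]
    return cmd

def run_openpose(**flags):
    """Run the binary with the given flags, raising on a nonzero exit code."""
    return subprocess.run(build_cmd(**flags), check=True)

# The three invocations above become:
print(build_cmd(video="examples/media/video.avi"))  # run on a video
print(build_cmd())                                  # no flags: default webcam
print(build_cmd(image_dir="examples/media/"))       # run on a folder of images
```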

 

Test results on the PETS dataset:

 

Training OpenPose

Training code:

https://github.com/ZheC/Realtime_Multi-Person_Pose_Estimation

 

Caffe fork needed for training:

https://github.com/CMU-Perceptual-Computing-Lab/caffe_train

The following modifications are needed:

(1) Delete opencv_contrib from the Makefile:
LIBRARIES += opencv_core opencv_highgui opencv_imgproc (opencv_contrib deleted)

(2) Remove the now-unneeded include header in
./src/caffe/cpm_data_transformer.cpp
#include

 

MATLAB COCO toolbox:

Note that this requires a Linux environment; there is no corresponding MEX support on Windows.

https://github.com/cocodataset/cocoapi

 

Training:

cd Realtime_Multi-Person_Pose_Estimation-master/training

bash getData.sh

/usr/local/MATLAB/R2017a/bin/matlab -nojvm -nodesktop -nodisplay -r getANNO

/usr/local/MATLAB/R2017a/bin/matlab -nojvm -nodesktop -nodisplay -r genCOCOMask

/usr/local/MATLAB/R2017a/bin/matlab -nojvm -nodesktop -nodisplay -r genJSON

python genLMDB.py

python setLayers.py --exp 1



cd dataset/COCO/COCO_kpt/pose56/exp22

bash train_pose.sh 0,1
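For intuition about what the trained network regresses: per the paper, the ground-truth PAF for a single limb is a field of unit vectors pointing from one joint toward the other, nonzero only within a distance threshold of the limb segment. A simplified numpy sketch — the image size, threshold, and function name here are arbitrary illustrations, not the training code:

```python
import numpy as np

def limb_paf(shape, p1, p2, sigma=1.0):
    """Ground-truth PAF for one limb: the unit vector p1 -> p2 at every
    pixel within distance sigma of the segment, zero elsewhere.
    Assumes p1 != p2."""
    h, w = shape
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    d = p2 - p1
    length = np.linalg.norm(d)
    u = d / length
    ys, xs = np.mgrid[0:h, 0:w]
    rel_x, rel_y = xs - p1[0], ys - p1[1]
    along = rel_x * u[0] + rel_y * u[1]           # projection onto the limb axis
    across = np.abs(rel_x * u[1] - rel_y * u[0])  # perpendicular distance
    mask = (along >= 0) & (along <= length) & (across <= sigma)
    paf = np.zeros((2, h, w))
    paf[0][mask] = u[0]
    paf[1][mask] = u[1]
    return paf

# A horizontal limb from (1, 5) to (8, 5) on a 10x10 grid.
paf = limb_paf((10, 10), (1, 5), (8, 5))
print(paf[0][5, 4], paf[1][5, 4])  # → 1.0 0.0  (on the limb: unit vector +x)
print(paf[0][0, 0])                # → 0.0      (far from the limb)
```

Where limbs of several people overlap, the real training code averages the per-person vectors at each pixel; this sketch covers only the single-limb case.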

[Figure: network structure used for training]

JSON format produced by genJSON (the # comments below are explanatory annotations, not valid JSON):

{
	"root": [{
		"dataset": "COCO", # dataset name, string
		"isValidation": 0.000, # 1.0 if the image is in the validation split, 0.0 if in the training split, float
		"img_paths": "train2014/COCO_train2014_000000000308.jpg", # image path, string
		"img_width": 640.000, # image width, float
		"img_height": 426.000, # image height, float
		"objpos": [201.540, 226.370], # center coordinates of the main person, float
		"image_id": 308.000, # image ID, float
		"bbox": [134.680, 28.650, 133.720, 395.440], # bounding box (x, y, width, height), float
		"segment_area": 23904.367, # segmentation area of the person, float
		"num_keypoints": 15.000, # number of labeled keypoints, float
		"joint_self": [ # keypoints of the main person: x, y, label, float
		# label = 0: visible, normal, not cropped
		# label = 1: partly invisible, occluded
		# label = 2: body part cropped out of the image
			[209.000, 82.000, 1.000],
			[217.000, 74.000, 1.000],
			[198.000, 73.000, 1.000],
			[220.000, 64.000, 1.000],
			[180.000, 68.000, 1.000],
			[227.000, 118.000, 1.000],
			[152.000, 123.000, 1.000],
			[0.000, 0.000, 2.000],
			[159.000, 195.000, 0.000],
			[0.000, 0.000, 2.000],
			[208.000, 196.000, 1.000],
			[235.000, 245.000, 1.000],
			[190.000, 254.000, 1.000],
			[252.000, 320.000, 0.000],
			[195.000, 346.000, 1.000],
			[250.000, 381.000, 1.000],
			[200.000, 422.000, 1.000]
		],
		"scale_provided": 1.075, # bounding-box height / 368, float
		"joint_others": [ # keypoints of the other (non-center) people
			[
				[52.000, 135.000, 1.000],
				[63.000, 123.000, 1.000],
				[34.000, 119.000, 1.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[11.000, 209.000, 0.000],
				[0.000, 0.000, 2.000],
				[43.000, 332.000, 1.000],
				[0.000, 0.000, 2.000],
				[129.000, 251.000, 1.000],
				[0.000, 0.000, 2.000],
				[59.000, 404.000, 1.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000]
			],
			[
				[115.000, 75.000, 1.000],
				[117.000, 64.000, 1.000],
				[106.000, 63.000, 1.000],
				[0.000, 0.000, 2.000],
				[71.000, 58.000, 1.000],
				[0.000, 0.000, 2.000],
				[52.000, 141.000, 0.000],
				[0.000, 0.000, 2.000],
				[75.000, 226.000, 0.000],
				[0.000, 0.000, 2.000],
				[135.000, 265.000, 0.000],
				[0.000, 0.000, 2.000],
				[90.000, 274.000, 0.000],
				[0.000, 0.000, 2.000],
				[88.000, 390.000, 0.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000]
			],
			[
				[142.000, 98.000, 1.000],
				[142.000, 96.000, 1.000],
				[138.000, 96.000, 1.000],
				[0.000, 0.000, 2.000],
				[122.000, 99.000, 1.000],
				[141.000, 122.000, 0.000],
				[116.000, 127.000, 0.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000]
			],
			[
				[448.000, 98.000, 1.000],
				[453.000, 84.000, 1.000],
				[0.000, 0.000, 2.000],
				[489.000, 74.000, 1.000],
				[0.000, 0.000, 2.000],
				[523.000, 143.000, 1.000],
				[480.000, 126.000, 1.000],
				[517.000, 231.000, 1.000],
				[456.000, 178.000, 1.000],
				[442.000, 248.000, 1.000],
				[409.000, 191.000, 1.000],
				[503.000, 295.000, 1.000],
				[472.000, 278.000, 1.000],
				[481.000, 392.000, 1.000],
				[461.000, 375.000, 1.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000]
			],
			[
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[61.000, 360.000, 1.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000],
				[0.000, 0.000, 2.000]
			],
			[
				[361.000, 92.000, 1.000],
				[365.000, 86.000, 1.000],
				[0.000, 0.000, 2.000],
				[378.000, 86.000, 0.000],
				[0.000, 0.000, 2.000],
				[383.000, 107.000, 0.000],
				[360.000, 105.000, 1.000],
				[0.000, 0.000, 2.000],
				[351.000, 130.000, 1.000],
				[0.000, 0.000, 2.000],
				[349.000, 147.000, 1.000],
				[380.000, 157.000, 1.000],
				[361.000, 158.000, 1.000],
				[381.000, 204.000, 0.000],
				[363.000, 206.000, 1.000],
				[382.000, 241.000, 0.000],
				[366.000, 240.000, 0.000]
			]
		],
		"annolist_index": 15.000, # index of this annotation in the JSON, float
		"people_index": 1.000, # index of the main person, float
		"numOtherPeople": 6.000, # number of people besides the main person, float
		"scale_provided_other": [1.029, 0.723, 0.239, 1.076, 0.502, 0.479], # bounding-box height / 368 for each of the other people, float
		"objpos_other": [ # center coordinates of the other people, float
			[92.540, 236.655],
			[127.225, 148.285],
			[128.435, 122.660],
			[468.990, 221.455],
			[59.830, 328.830],
			[370.745, 161.240]
		],
		"bbox_other": [ # bounding boxes of the other people (x, y, width, height), float
			[0.000, 47.310, 185.080, 378.690],
			[45.740, 15.340, 162.970, 265.890],
			[111.180, 78.760, 34.510, 87.800],
			[381.090, 23.470, 175.800, 395.970],
			[0.000, 236.450, 119.660, 184.760],
			[345.800, 73.100, 49.890, 176.280]
		],
		"segment_area_other": [28055.014, 15191.540, 1602.758, 36639.449, 8737.961, 3366.046], # segmentation areas of the other people, float
		"num_keypoints_other": [7.000, 9.000, 6.000, 13.000, 1.000, 13.000] # numbers of labeled keypoints of the other people, float
	}]
}
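The relation documented for scale_provided (bounding-box height divided by 368) can be checked directly against this sample; the values below are hard-coded from the annotation above:

```python
# Values copied verbatim from the sample annotation above.
bbox = [134.680, 28.650, 133.720, 395.440]  # x, y, width, height of the main person
scale_provided = 1.075

# scale_provided = bbox height / 368 (the training input size).
computed = bbox[3] / 368
print(round(computed, 3))  # → 1.075, matching the stored field

# The same relation holds per person for scale_provided_other, e.g. the first one:
first_other_height = 378.690  # height from bbox_other[0]
print(round(first_other_height / 368, 3))  # → 1.029
```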

Visualization script:

import json
import cv2

# Load the annotation produced by genJSON and the matching training image.
with open('coco1.json', 'r') as f:
    coco_json = json.load(f)

image = cv2.imread("COCO_train2014_000000000308.jpg", 1)

for ann in coco_json["root"]:
    # Bounding box of the main person (x, y, width, height) in yellow.
    box = ann["bbox"]
    cv2.rectangle(image, (int(box[0]), int(box[1])),
                  (int(box[0] + box[2]), int(box[1] + box[3])), (0, 255, 255), 1)
    # Bounding boxes of the other people in red.
    for b_o in ann["bbox_other"]:
        cv2.rectangle(image, (int(b_o[0]), int(b_o[1])),
                      (int(b_o[0] + b_o[2]), int(b_o[1] + b_o[3])), (0, 0, 255), 1)

    # Keypoints: label 0 (visible) in blue, label 1 (occluded) in green;
    # label 2 (cropped) keypoints are skipped.
    for joint in ann["joint_self"]:
        if joint[2] == 0:
            cv2.circle(image, (int(joint[0]), int(joint[1])), 1, (255, 0, 0), 1)
        elif joint[2] == 1:
            cv2.circle(image, (int(joint[0]), int(joint[1])), 1, (0, 255, 0), 2)
    for other in ann["joint_others"]:
        for oth in other:
            if oth[2] == 0:
                cv2.circle(image, (int(oth[0]), int(oth[1])), 1, (255, 0, 0), 1)
            elif oth[2] == 1:
                cv2.circle(image, (int(oth[0]), int(oth[1])), 1, (0, 255, 0), 2)

cv2.imshow("image", image)
cv2.waitKey()

Results:

[Figure 3: visualized annotations]

[Figure 4: visualized annotations]

 

Pretrained model download link:

https://download.csdn.net/download/qq_14845119/11616965

 

Reference:

https://github.com/CMU-Perceptual-Computing-Lab/openpose

 
