Code source: GitHub - Daniil-Osokin/lightweight-human-pose-estimation.pytorch: Fast and accurate human pose estimation in PyTorch. Contains implementation of "Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose" paper.
Reference articles: "An in-depth walkthrough of human pose recognition code, step by step" (是你吗小叶同学's blog, CSDN)
"light_openpose code" (微凉code's blog, CSDN)
Tutorial video:
"[OpenPose pose estimation] A 90-minute walkthrough of the OpenPose family of algorithms by a Tongji University expert - AI | computer vision | deep learning" (bilibili)
The infer_fast function runs the network on one image and returns the heatmaps and PAFs, together with the scale and padding needed to map results back to the original image.
import cv2
import numpy as np
import torch
from val import normalize, pad_width

def infer_fast(net, img, net_input_height_size, stride, upsample_ratio, cpu,
               pad_value=(0, 0, 0), img_mean=np.array([128, 128, 128], np.float32), img_scale=np.float32(1/256)):
    height, width, _ = img.shape  # height, width, number of channels
    scale = net_input_height_size / height  # resize factor that brings the image to the network input height
    # img: source image; fx/fy: scale factors along the x and y axes; cv2.INTER_LINEAR: bilinear interpolation
    scaled_img = cv2.resize(img, (0, 0), fx=scale, fy=scale, interpolation=cv2.INTER_LINEAR)
    scaled_img = normalize(scaled_img, img_mean, img_scale)  # subtract the mean and rescale; img_mean and img_scale are default arguments
    min_dims = [net_input_height_size, max(scaled_img.shape[1], net_input_height_size)]
    padded_img, pad = pad_width(scaled_img, stride, pad_value, min_dims)  # padded image and pad = [top, left, bottom, right]
    tensor_img = torch.from_numpy(padded_img).permute(2, 0, 1).unsqueeze(0).float()
    # if not cpu:
    #     tensor_img = tensor_img.cuda()
    stages_output = net(tensor_img)  # net is PoseEstimationWithMobileNet(); the last two outputs are the final-stage heatmaps and PAFs
    stage2_heatmaps = stages_output[-2]
    heatmaps = np.transpose(stage2_heatmaps.squeeze().cpu().data.numpy(), (1, 2, 0))
    heatmaps = cv2.resize(heatmaps, (0, 0), fx=upsample_ratio, fy=upsample_ratio, interpolation=cv2.INTER_CUBIC)
    stage2_pafs = stages_output[-1]
    pafs = np.transpose(stage2_pafs.squeeze().cpu().data.numpy(), (1, 2, 0))
    pafs = cv2.resize(pafs, (0, 0), fx=upsample_ratio, fy=upsample_ratio, interpolation=cv2.INTER_CUBIC)
    return heatmaps, pafs, scale, pad
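infer_fast also returns scale and pad so that coordinates found on the upsampled heatmaps can be mapped back to the original image. The sketch below shows one way to do that mapping. It assumes the network output is stride times smaller than the padded input and is then enlarged by upsample_ratio; the helper name heatmap_to_image_coords and the example numbers are made up for illustration and are not taken from the notes above.

```python
def heatmap_to_image_coords(x, y, stride, upsample_ratio, pad, scale):
    """Map a point from upsampled-heatmap space back to original-image pixels.

    pad is [top, left, bottom, right], in the order returned by pad_width.
    """
    # Heatmap pixel -> pixel of the padded network input
    x_in = x * stride / upsample_ratio
    y_in = y * stride / upsample_ratio
    # Remove the left/top padding, then undo the resize
    return (x_in - pad[1]) / scale, (y_in - pad[0]) / scale


# Assumed example: a 512x1024 image resized to height 256, so scale = 0.5.
scale = 256 / 512
pad = [0, 0, 0, 0]  # 256x512 is already a multiple of stride 8, so no padding
x_img, y_img = heatmap_to_image_coords(100, 50, stride=8, upsample_ratio=4, pad=pad, scale=scale)
print(x_img, y_img)  # 400.0 200.0
```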
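permute reorders the dimensions of a tensor. permute(2, 0, 1) turns an (H, W, C) image into (C, H, W), so a tensor of shape (1, 3, 3) becomes (3, 1, 3). unsqueeze(0) then inserts a new dimension at position 0 (the batch axis), so the tensor goes from 3-D to 4-D.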
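np.transpose: reorders the axes of a NumPy array. With (1, 2, 0) it converts the network output from (C, H, W) back to (H, W, C) before cv2.resize is applied.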
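To check the axis bookkeeping, here is a small sketch of the same layout changes done in NumPy only. np.transpose and np.expand_dims stand in for the tensor's permute and unsqueeze; the 4x6 dummy image is just an illustrative size.

```python
import numpy as np

# A dummy "image" in OpenCV layout: height 4, width 6, 3 channels.
img_hwc = np.zeros((4, 6, 3), dtype=np.float32)

# NumPy equivalent of tensor.permute(2, 0, 1): (H, W, C) -> (C, H, W)
img_chw = np.transpose(img_hwc, (2, 0, 1))

# NumPy equivalent of tensor.unsqueeze(0): add a batch axis -> (1, C, H, W)
batch = np.expand_dims(img_chw, axis=0)

# Reverse direction, as done on the network output:
# drop the batch axis, then (C, H, W) -> (H, W, C)
restored = np.transpose(batch.squeeze(0), (1, 2, 0))

print(img_chw.shape, batch.shape, restored.shape)  # (3, 4, 6) (1, 3, 4, 6) (4, 6, 3)
```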
The normalize function is imported from val.py.
def normalize(img, img_mean, img_scale):
    img = np.array(img, dtype=np.float32)
    img = (img - img_mean) * img_scale
    return img
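It converts scaled_img to a float32 array, subtracts img_mean ([128, 128, 128]) from every pixel, and multiplies the result by img_scale (1/256). This maps pixel values from [0, 255] to roughly [-0.5, 0.5). Centering the input this way keeps it in the range the network saw during training; it is a preprocessing step rather than a defense against overfitting.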
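A quick numeric check of that range, using the same function on a single pixel whose three channels take the smallest, middle and largest 8-bit values (the pixel values are chosen only for illustration):

```python
import numpy as np

def normalize(img, img_mean, img_scale):
    img = np.array(img, dtype=np.float32)
    return (img - img_mean) * img_scale

pixels = np.array([[[0, 128, 255]]], dtype=np.uint8)  # a 1x1 image with 3 channels
out = normalize(pixels, np.array([128, 128, 128], np.float32), np.float32(1 / 256))
print(out.dtype, out.ravel())  # float32, values -0.5, 0.0 and 127/256
```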
The pad_width function is imported from val.py.
import math
import cv2

def pad_width(img, stride, pad_value, min_dims):
    h, w, _ = img.shape
    h = min(min_dims[0], h)
    min_dims[0] = math.ceil(min_dims[0] / float(stride)) * stride  # math.ceil rounds up, so the height becomes a multiple of stride
    min_dims[1] = max(min_dims[1], w)
    min_dims[1] = math.ceil(min_dims[1] / float(stride)) * stride  # same for the width
    pad = []
    pad.append(int(math.floor((min_dims[0] - h) / 2.0)))  # math.floor rounds down; padding on the top, in pixels
    pad.append(int(math.floor((min_dims[1] - w) / 2.0)))  # padding on the left
    pad.append(int(min_dims[0] - h - pad[0]))  # bottom
    pad.append(int(min_dims[1] - w - pad[1]))  # right
    padded_img = cv2.copyMakeBorder(img, pad[0], pad[2], pad[1], pad[3],
                                    cv2.BORDER_CONSTANT, value=pad_value)  # add the border to the image
    return padded_img, pad
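My understanding is that min_dims sets the minimum size of the padded image: each dimension is rounded up to the nearest multiple of stride, and the missing pixels are split between the two sides, with any odd leftover pixel going to the bottom or right.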
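The padding arithmetic can be followed without touching any pixels. The sketch below repeats only the size computation of pad_width; compute_pad is a hypothetical helper name, and the 256x341 input is an assumed example size.

```python
import math

def compute_pad(h, w, stride, min_dims):
    """Return (padded_h, padded_w, [top, left, bottom, right]) the way pad_width would."""
    min_h, min_w = min_dims
    h = min(min_h, h)
    padded_h = math.ceil(min_h / float(stride)) * stride
    padded_w = math.ceil(max(min_w, w) / float(stride)) * stride
    top = (padded_h - h) // 2      # floor of half the missing rows
    left = (padded_w - w) // 2     # floor of half the missing columns
    return padded_h, padded_w, [top, left, padded_h - h - top, padded_w - w - left]

# Assumed example: a scaled image of 256x341, stride 8, net_input_height_size 256.
result = compute_pad(256, 341, 8, [256, max(341, 256)])
print(result)  # (256, 344, [0, 1, 0, 2])
```

The width 341 is rounded up to 344, and the 3 extra columns are split into 1 on the left and 2 on the right.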
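cv2.copyMakeBorder function: adds a border around an image. The four arguments after the image are the border widths at the top, bottom, left and right, in pixels; cv2.BORDER_CONSTANT fills the border with the constant color given by value.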
The extract_keypoints function is imported from keypoints.py.
import math
from operator import itemgetter
import numpy as np

def extract_keypoints(heatmap, all_keypoints, total_keypoint_num):
    heatmap[heatmap < 0.1] = 0  # zero out background and low-confidence responses
    heatmap_with_borders = np.pad(heatmap, [(2, 2), (2, 2)], mode='constant')  # add a 2-pixel border of zeros on every side
    # Center view: slice off the outermost one-pixel ring
    heatmap_center = heatmap_with_borders[1:heatmap_with_borders.shape[0]-1, 1:heatmap_with_borders.shape[1]-1]
    # Four views shifted by one pixel relative to the center view (axis 0 is height, axis 1 is width)
    heatmap_left = heatmap_with_borders[1:heatmap_with_borders.shape[0]-1, 2:heatmap_with_borders.shape[1]]
    heatmap_right = heatmap_with_borders[1:heatmap_with_borders.shape[0]-1, 0:heatmap_with_borders.shape[1]-2]
    heatmap_up = heatmap_with_borders[2:heatmap_with_borders.shape[0], 1:heatmap_with_borders.shape[1]-1]
    heatmap_down = heatmap_with_borders[0:heatmap_with_borders.shape[0]-2, 1:heatmap_with_borders.shape[1]-1]
    # A pixel is a peak if it is strictly greater than all four of its neighbors
    heatmap_peaks = (heatmap_center > heatmap_left) &\
                    (heatmap_center > heatmap_right) &\
                    (heatmap_center > heatmap_up) &\
                    (heatmap_center > heatmap_down)
    heatmap_peaks = heatmap_peaks[1:heatmap_center.shape[0]-1, 1:heatmap_center.shape[1]-1]  # back to the original heatmap size
    # np.nonzero returns the indices of the non-zero elements
    keypoints = list(zip(np.nonzero(heatmap_peaks)[1], np.nonzero(heatmap_peaks)[0]))  # (w, h)
    keypoints = sorted(keypoints, key=itemgetter(0))  # sort in ascending order of element 0 (w)
    suppressed = np.zeros(len(keypoints), np.uint8)
    keypoints_with_score_and_id = []
    keypoint_num = 0
    for i in range(len(keypoints)):
        # Skip keypoint i if it was already marked as a duplicate
        if suppressed[i]:
            continue
        for j in range(i+1, len(keypoints)):
            # Two peaks closer than 6 pixels are treated as the same keypoint
            if math.sqrt((keypoints[i][0] - keypoints[j][0]) ** 2 +
                         (keypoints[i][1] - keypoints[j][1]) ** 2) < 6:
                suppressed[j] = 1
        keypoint_with_score_and_id = (keypoints[i][0], keypoints[i][1], heatmap[keypoints[i][1], keypoints[i][0]],
                                      total_keypoint_num + keypoint_num)
        keypoints_with_score_and_id.append(keypoint_with_score_and_id)
        keypoint_num += 1
    # Each entry is (x, y, conf, id): conf is the heatmap value at the peak, id is the global keypoint index
    all_keypoints.append(keypoints_with_score_and_id)
    return keypoint_num
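np.pad() function: pads an array along each axis. [(2, 2), (2, 2)] with mode='constant' adds two rows of zeros above and below and two columns of zeros to the left and right.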
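The peak test is easier to see on a tiny array. This is a simplified sketch of the same idea, not the function above: it pads by one pixel instead of two, so no final crop is needed, and the 3x3 heatmap values are made up.

```python
import numpy as np

heatmap = np.array([[0.0, 0.2, 0.0],
                    [0.2, 0.9, 0.2],
                    [0.0, 0.2, 0.0]], dtype=np.float32)

padded = np.pad(heatmap, [(1, 1), (1, 1)], mode='constant')  # zeros on every side
center = padded[1:-1, 1:-1]               # same values as the original heatmap
peaks = ((center > padded[1:-1, 2:]) &    # compare with the right neighbor
         (center > padded[1:-1, :-2]) &   # compare with the left neighbor
         (center > padded[2:, 1:-1]) &    # compare with the neighbor below
         (center > padded[:-2, 1:-1]))    # compare with the neighbor above

ys, xs = np.nonzero(peaks)                # row indices, column indices
found = list(zip(xs.tolist(), ys.tolist()))
print(padded.shape, found)  # (5, 5) [(1, 1)]
```

Only the 0.9 in the middle is strictly greater than all four neighbors, so it is the single peak, reported as (x, y) = (1, 1).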
zip function: in Python 3.x, zip() returns a lazy iterator object rather than a list, which saves memory. To get a list, wrap it in list(); to get a dictionary, wrap it in dict().
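A short illustration of that behavior, with made-up values. Note that the iterator can only be consumed once:

```python
xs = [3, 1, 2]
ys = [30, 10, 20]

pairs = zip(xs, ys)           # lazy iterator, nothing is materialized yet
as_list = list(pairs)         # [(3, 30), (1, 10), (2, 20)]
exhausted = list(pairs)       # [] because the iterator has already been consumed
as_dict = dict(zip(xs, ys))   # {3: 30, 1: 10, 2: 20}
print(as_list, exhausted, as_dict)
```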
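itemgetter: operator.itemgetter(0) returns a callable that fetches element 0 of its argument. Used as the key of sorted(), it orders the (w, h) tuples by w.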
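For example, with a few made-up (x, y) pairs:

```python
from operator import itemgetter

keypoints = [(5, 2), (1, 9), (3, 4)]          # (x, y) pairs
by_x = sorted(keypoints, key=itemgetter(0))   # [(1, 9), (3, 4), (5, 2)]
by_y = sorted(keypoints, key=itemgetter(1))   # [(5, 2), (3, 4), (1, 9)]
print(by_x, by_y)
```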
The group_keypoints function is imported from keypoints.py.
An elegant way to flatten the nested per-type keypoint lists into one array:
all_keypoints = np.array([item for sublist in all_keypoints_by_type for item in sublist])
reshape: gives an array a new shape without changing its data.
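A small sketch of both operations. The nested list below is hypothetical data in the (x, y, conf, id) layout produced by extract_keypoints, with one keypoint type that has no detections:

```python
import numpy as np

# Hypothetical per-type keypoint lists; each entry is (x, y, conf, id)
all_keypoints_by_type = [
    [(10, 20, 0.9, 0), (40, 22, 0.8, 1)],  # two detections of one type
    [],                                     # a type with no detections
    [(12, 35, 0.7, 2)],                     # one detection of another type
]

# The nested comprehension visits every sublist, then every item in it
all_keypoints = np.array([item for sublist in all_keypoints_by_type for item in sublist])
print(all_keypoints.shape)  # (3, 4)

# reshape keeps the data and changes the shape; -1 lets NumPy infer that axis
flat = all_keypoints.reshape(-1)
print(flat.shape)  # (12,)
```

Empty sublists simply contribute nothing, so the result has one row per detected keypoint.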
Flowchart