lightweight-openpose study notes -- source code walkthrough (1): demo.py

Code source: GitHub - Daniil-Osokin/lightweight-human-pose-estimation.pytorch: Fast and accurate human pose estimation in PyTorch. Contains implementation of "Real-time 2D Multi-Person Pose Estimation on CPU: Lightweight OpenPose" paper.

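Reference articles: "An in-depth walkthrough of human pose estimation code, step by step" (人体姿态识别代码深度解析,带你一步步理解代码), blog of 是你吗小叶同学 on CSDN

"light_openpose code" (light_openpose代码), blog of 微凉code on CSDN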

Video tutorial:

[OpenPose pose estimation] A detailed 90-minute tutorial explaining the OpenPose family of algorithms (bilibili)

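The infer_fast function runs the network on one image and returns the heatmaps and PAFs (part affinity fields), together with the resize factor and the padding that were applied.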

def infer_fast(net, img, net_input_height_size, stride, upsample_ratio, cpu,
               pad_value=(0, 0, 0), img_mean=np.array([128, 128, 128], np.float32), img_scale=np.float32(1/256)):
    height, width, _ = img.shape  # height, width, number of channels
    scale = net_input_height_size / height  # resize factor

    # img: source image; fx/fy: scale factors along the x and y axes; cv2.INTER_LINEAR: bilinear interpolation
    scaled_img = cv2.resize(img, (0, 0), fx=scale, fy=scale, interpolation=cv2.INTER_LINEAR)
    scaled_img = normalize(scaled_img, img_mean, img_scale)  # subtract the mean and rescale; img_mean and img_scale are default arguments
    min_dims = [net_input_height_size, max(scaled_img.shape[1], net_input_height_size)]
    padded_img, pad = pad_width(scaled_img, stride, pad_value, min_dims)  # padded image and the [top, left, bottom, right] padding

    tensor_img = torch.from_numpy(padded_img).permute(2, 0, 1).unsqueeze(0).float()
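    # GPU transfer is commented out here, so the input tensor stays on the CPU regardless of `cpu`
    #if not cpu:
        #tensor_img = tensor_img.cuda()

    stages_output = net(tensor_img)  # net is PoseEstimationWithMobileNet(); the last two outputs are used below as heatmaps and PAFs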

    stage2_heatmaps = stages_output[-2]
    heatmaps = np.transpose(stage2_heatmaps.squeeze().cpu().data.numpy(), (1, 2, 0))
    heatmaps = cv2.resize(heatmaps, (0, 0), fx=upsample_ratio, fy=upsample_ratio, interpolation=cv2.INTER_CUBIC)

    stage2_pafs = stages_output[-1]
    pafs = np.transpose(stage2_pafs.squeeze().cpu().data.numpy(), (1, 2, 0))
    pafs = cv2.resize(pafs, (0, 0), fx=upsample_ratio, fy=upsample_ratio, interpolation=cv2.INTER_CUBIC)

    return heatmaps, pafs, scale, pad
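
infer_fast also returns scale and pad, which are what you need to map a point found on the upsampled heatmaps back to the original image. A minimal sketch of that mapping, assuming the raw network output is stride times smaller than the padded input; the helper name heatmap_to_image_coords and the sample numbers are hypothetical and not part of the notes above:

```python
def heatmap_to_image_coords(x, y, stride, upsample_ratio, scale, pad):
    """Map a point on the upsampled heatmap back to original-image pixels.

    Assumes the raw network output is `stride` times smaller than the padded
    input, and that `pad` is [top, left, bottom, right] as built by pad_width.
    """
    # Undo the upsampling and the network stride: back to padded-input pixels.
    x_padded = x * stride / upsample_ratio
    y_padded = y * stride / upsample_ratio
    # Remove the left/top padding, then undo the initial resize.
    return (x_padded - pad[1]) / scale, (y_padded - pad[0]) / scale


# Hypothetical values: stride 8, upsample ratio 4, image shrunk by half.
x_img, y_img = heatmap_to_image_coords(50, 20, stride=8, upsample_ratio=4,
                                       scale=0.5, pad=[0, 1, 0, 2])
print(x_img, y_img)
```

With these sample values the heatmap point (50, 20) lands at (198.0, 80.0) in the original image.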

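permute reorders the dimensions of a tensor. The padded image arrives as (H, W, C), and permute(2, 0, 1) turns it into (C, H, W). unsqueeze(0) then inserts a new dimension at position 0 (the batch dimension), so the 3-D tensor becomes a 4-D tensor of shape (1, C, H, W), which is the layout the network expects.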
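
To make the shape bookkeeping concrete, here is a small sketch that tracks only the shapes, without needing PyTorch. The helpers permute_shape and unsqueeze_shape are hypothetical stand-ins for what tensor.permute and tensor.unsqueeze do to a shape, and the image size is made up:

```python
def permute_shape(shape, order):
    """Return the shape after reordering axes, as tensor.permute(*order) would."""
    return tuple(shape[axis] for axis in order)


def unsqueeze_shape(shape, dim):
    """Return the shape after inserting a length-1 axis at position `dim`."""
    return shape[:dim] + (1,) + shape[dim:]


hwc = (256, 344, 3)                     # hypothetical padded image: H, W, C
chw = permute_shape(hwc, (2, 0, 1))     # channels first: C, H, W
nchw = unsqueeze_shape(chw, 0)          # add batch axis: N, C, H, W
print(chw, nchw)
```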

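np.transpose: in infer_fast it does the reverse reordering on the network output, turning the (C, H, W) array into (H, W, C) so that cv2.resize can upsample it.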

The normalize function is imported from val.py

def normalize(img, img_mean, img_scale):
    img = np.array(img, dtype=np.float32)
    img = (img - img_mean) * img_scale
    return img

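It converts scaled_img to a float32 array, subtracts img_mean ([128, 128, 128]) from every pixel, and multiplies the result by img_scale (1/256). This maps pixel values from [0, 255] to roughly [-0.5, 0.5). Centering the input this way is an input-conditioning step, presumably matching the preprocessing used during training; it is not in itself a guard against overfitting.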
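
As a quick check of the arithmetic, the sketch below applies the same formula to single channel values. The helper normalize_value is hypothetical and only mirrors the (img - img_mean) * img_scale line with the default parameters:

```python
IMG_MEAN = 128.0
IMG_SCALE = 1 / 256


def normalize_value(pixel):
    """Apply the same arithmetic as normalize() to a single channel value."""
    return (pixel - IMG_MEAN) * IMG_SCALE


for pixel in (0, 128, 255):
    print(pixel, "->", normalize_value(pixel))
```

The darkest value maps to -0.5, mid-gray to 0.0, and the brightest value to just under 0.5.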

The pad_width function is imported from val.py

def pad_width(img, stride, pad_value, min_dims):
    h, w, _ = img.shape
    h = min(min_dims[0], h)
    min_dims[0] = math.ceil(min_dims[0] / float(stride)) * stride  # math.ceil rounds up, so the padded size is a multiple of the stride
    min_dims[1] = max(min_dims[1], w)
    min_dims[1] = math.ceil(min_dims[1] / float(stride)) * stride
    pad = []
    pad.append(int(math.floor((min_dims[0] - h) / 2.0)))  # math.floor rounds down; pixels to pad on top
    pad.append(int(math.floor((min_dims[1] - w) / 2.0)))  # pixels to pad on the left
    pad.append(int(min_dims[0] - h - pad[0]))  # bottom
    pad.append(int(min_dims[1] - w - pad[1]))  # right
    padded_img = cv2.copyMakeBorder(img, pad[0], pad[2], pad[1], pad[3],
                                    cv2.BORDER_CONSTANT, value=pad_value)  # add the border to the image
    return padded_img, pad

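As I understand it, min_dims sets the minimum size of the padded image: each dimension is rounded up to a multiple of stride, and the extra pixels are split between the two sides (the bottom and right sides get the odd pixel).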
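
The size arithmetic is easier to follow with numbers. The sketch below reproduces only the calculation from pad_width, without touching an image; the helper name compute_pad and the 256 x 341 example size are hypothetical:

```python
import math


def compute_pad(h, w, stride, min_dims):
    """Reproduce only the size arithmetic of pad_width (no image is touched)."""
    target_h = math.ceil(min_dims[0] / stride) * stride
    target_w = math.ceil(max(min_dims[1], w) / stride) * stride
    h = min(min_dims[0], h)
    top = (target_h - h) // 2
    left = (target_w - w) // 2
    bottom = target_h - h - top
    right = target_w - w - left
    return (target_h, target_w), [top, left, bottom, right]


# Hypothetical scaled image of 256 x 341 pixels, stride 8.
size, pad = compute_pad(256, 341, 8, [256, 341])
print(size, pad)
```

Here the width 341 is rounded up to 344, so three columns are added: one on the left and two on the right.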

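cv2.copyMakeBorder: it takes the border sizes in the order top, bottom, left, right, which is why the call above passes pad[0], pad[2], pad[1], pad[3].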

The extract_keypoints function is imported from keypoints.py

def extract_keypoints(heatmap, all_keypoints, total_keypoint_num):
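    heatmap[heatmap < 0.1] = 0  # zero out background and low-confidence responses
    heatmap_with_borders = np.pad(heatmap, [(2, 2), (2, 2)], mode='constant')  # add a 2-pixel zero border on every side
    # Center view: slice off one pixel of border on every side
    heatmap_center = heatmap_with_borders[1:heatmap_with_borders.shape[0]-1, 1:heatmap_with_borders.shape[1]-1]
    # Views shifted by one pixel relative to the center view; axis 0 is height, axis 1 is width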
    heatmap_left = heatmap_with_borders[1:heatmap_with_borders.shape[0]-1, 2:heatmap_with_borders.shape[1]]
    heatmap_right = heatmap_with_borders[1:heatmap_with_borders.shape[0]-1, 0:heatmap_with_borders.shape[1]-2]
    heatmap_up = heatmap_with_borders[2:heatmap_with_borders.shape[0], 1:heatmap_with_borders.shape[1]-1]
    heatmap_down = heatmap_with_borders[0:heatmap_with_borders.shape[0]-2, 1:heatmap_with_borders.shape[1]-1]

    heatmap_peaks = (heatmap_center > heatmap_left) &\
                    (heatmap_center > heatmap_right) &\
                    (heatmap_center > heatmap_up) &\
                    (heatmap_center > heatmap_down)
    heatmap_peaks = heatmap_peaks[1:heatmap_center.shape[0]-1, 1:heatmap_center.shape[1]-1]
    # np.nonzero returns the indices of the nonzero elements
    keypoints = list(zip(np.nonzero(heatmap_peaks)[1], np.nonzero(heatmap_peaks)[0]))  # (x, y)
    keypoints = sorted(keypoints, key=itemgetter(0))  # sort ascending by element 0 (x)

    suppressed = np.zeros(len(keypoints), np.uint8)
    keypoints_with_score_and_id = []
    keypoint_num = 0
    for i in range(len(keypoints)):
        # skip keypoint i if it was already suppressed as a duplicate
        if suppressed[i]:
            continue
        for j in range(i+1, len(keypoints)):
            # two peaks closer than 6 pixels are treated as the same keypoint
            if math.sqrt((keypoints[i][0] - keypoints[j][0]) ** 2 +
                         (keypoints[i][1] - keypoints[j][1]) ** 2) < 6:
                suppressed[j] = 1
        keypoint_with_score_and_id = (keypoints[i][0], keypoints[i][1], heatmap[keypoints[i][1], keypoints[i][0]],
                                      total_keypoint_num + keypoint_num)
        keypoints_with_score_and_id.append(keypoint_with_score_and_id)
        keypoint_num += 1
    # store the extracted keypoints as (x, y, conf, id); conf is the heatmap value at that position
    all_keypoints.append(keypoints_with_score_and_id)
    return keypoint_num
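
The second half of extract_keypoints is a simple distance-based suppression of duplicate peaks. A pure-Python sketch of just that step, assuming the same 6-pixel threshold; the helper name suppress_close_points and the sample points are hypothetical:

```python
import math
from operator import itemgetter


def suppress_close_points(points, min_dist=6):
    """Keep the first point of every cluster of peaks closer than min_dist."""
    points = sorted(points, key=itemgetter(0))  # same ordering as the original
    suppressed = [False] * len(points)
    kept = []
    for i, (xi, yi) in enumerate(points):
        if suppressed[i]:
            continue
        for j in range(i + 1, len(points)):
            xj, yj = points[j]
            if math.hypot(xi - xj, yi - yj) < min_dist:
                suppressed[j] = True
        kept.append((xi, yi))
    return kept


print(suppress_close_points([(30, 30), (12, 11), (10, 10)]))
```

Because the points are sorted by x first, the survivor of each cluster is the leftmost peak, not necessarily the one with the highest score.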

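np.pad(): with [(2, 2), (2, 2)] and mode='constant' it adds two rows and two columns of zeros on every side of the heatmap, so that pixels on the edge also have neighbours to be compared with.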
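
The peak test in extract_keypoints keeps a pixel only if it is strictly greater than its four neighbours, with zeros assumed outside the heatmap. The sketch below shows the same idea in plain Python on a tiny grid. It pads by one pixel instead of padding by two and cropping, which should lead to the same comparisons; the helper name find_peaks and the 3 x 3 heatmap are hypothetical:

```python
def find_peaks(heatmap):
    """Return (x, y) of cells strictly greater than their 4 neighbours."""
    h, w = len(heatmap), len(heatmap[0])
    # Zero border so that edge cells also have four neighbours to compare with.
    padded = [[0.0] * (w + 2)]
    padded += [[0.0] + list(row) + [0.0] for row in heatmap]
    padded += [[0.0] * (w + 2)]
    peaks = []
    for y in range(1, h + 1):
        for x in range(1, w + 1):
            c = padded[y][x]
            if (c > padded[y][x - 1] and c > padded[y][x + 1]
                    and c > padded[y - 1][x] and c > padded[y + 1][x]):
                peaks.append((x - 1, y - 1))
    return peaks


heatmap = [
    [0.0, 0.2, 0.0],
    [0.2, 0.9, 0.2],
    [0.0, 0.2, 0.7],
]
print(find_peaks(heatmap))
```

The corner value 0.7 counts as a peak only because the zero border supplies its missing right and bottom neighbours.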

zip: in Python 3.x, zip() returns a lazy iterator object rather than a list, to save memory. Wrap it in list() to get a list, or in dict() to get a dictionary.

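itemgetter (from the operator module): itemgetter(0) returns a callable that fetches element 0 of its argument; above it serves as the sort key so that keypoints are ordered by x.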
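
A short sketch of how zip, list and itemgetter combine in the keypoint code, using made-up coordinate lists in place of the np.nonzero output:

```python
from operator import itemgetter

xs = [5, 2, 9]
ys = [1, 7, 3]

pairs = zip(xs, ys)          # a lazy iterator, not a list
points = list(pairs)         # materialize it into a list of (x, y) tuples
by_x = sorted(points, key=itemgetter(0))   # sort by the first element (x)
print(points)
print(by_x)
```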

The group_keypoints function is imported from keypoints.py

An elegant way to flatten a nested list:

all_keypoints = np.array([item for sublist in all_keypoints_by_type for item in sublist])
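
To see what the nested comprehension does, here is the same flattening on a small hand-made list, without the np.array wrapper. The keypoint tuples are hypothetical (x, y, conf, id) values:

```python
all_keypoints_by_type = [
    [(10, 12, 0.9, 0)],                       # one peak for the first keypoint type
    [],                                       # no peak found for this type
    [(30, 40, 0.8, 1), (55, 41, 0.7, 2)],     # two peaks for another type
]
# Outer loop walks the per-type lists, inner loop walks the peaks in each list.
flat = [item for sublist in all_keypoints_by_type for item in sublist]
print(len(flat), flat[1])
```

Empty sublists simply contribute nothing, so keypoint types with no detections need no special handling.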

reshape

Flowchart

[Figure 1: flowchart for demo.py]
