darkknightzh

(Original) Human pose estimation with Lightweight OpenPose

Please credit the source when reposting:

https://www.cnblogs.com/darkknightzh/p/12152119.html

Paper:

https://arxiv.org/abs/1811.12004

Official PyTorch code:

https://github.com/Daniil-Osokin/lightweight-human-pose-estimation.pytorch

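1 Introduction

Lightweight OpenPose is a simplified version of OpenPose that keeps the overall OpenPose pipeline.

Lightweight OpenPose differs from OpenPose in the following ways:

a The former uses MobileNet V1 (up to conv5_5) as the backbone; the latter uses VGG-19 (first 10 layers).

b The former uses dilated convolutions in some layers to enlarge the receptive field; the latter uses ordinary convolutions.

c The former uses 3*3 convolution kernels; the latter uses 7*7 kernels.

d The former has a single refinement stage; the latter has 5 refinement stages.

e In the former, the two branches (heatmaps and PAFs) of the initial stage and of the refinement stage share the weights of their first layers; in the latter they are two fully parallel branches.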

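2 Improvements

2.1 Backbone network

The paper analyzes the mAP and GFLOPs of each OpenPose stage.

After refinement stage 1 the accuracy gain becomes marginal while the GFLOPs keep growing noticeably, so only refinement stage 1 is kept and all later stages are removed.

2.2 Weight sharing

Each OpenPose stage uses two parallel branches (left side of the figure in the original post) that predict heatmaps and PAFs separately. To reduce computation further, Lightweight OpenPose shares the weights of the first few layers between the two branches (right side of the figure).

2.3 Dilated convolution

Furthermore, Lightweight OpenPose replaces VGG-19 with a MobileNet V1 that contains a dilated convolution, which lowers the GFLOPs considerably (see the figure in the original post). The n/a entry for the 2-stage network in that figure means that all refinement stages were used during training, but inference stops at refinement stage 1. The later stages therefore add no computation at test time, hence n/a, and the GFLOPs in the last column stay at 9.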

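2.4 3*3 convolutions

To keep the same receptive field as the 7*7 convolutions used in the OpenPose refinement stages, Lightweight OpenPose replaces each 7*7 convolution with the convolution block shown in the figure of the original post (I am not sure exactly how the receptive field is computed). This block corresponds to RefinementStageBlock in the code.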
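One common way to reason about the receptive field of stacked stride-1 convolutions is to add (kernel_size - 1) * dilation for every layer. The sketch below applies that rule to the layers of RefinementStageBlock as listed in 4.3 (a 1*1 conv, a 3*3 conv, and a 3*3 conv with dilation 2). The helper name receptive_field is made up for this illustration and is not part of the official code.

```python
def receptive_field(layers):
    """Receptive field of a stack of stride-1 convolutions.

    layers: list of (kernel_size, dilation) pairs, applied in order.
    """
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf


# 1*1 conv, 3*3 conv, 3*3 conv with dilation 2 (the layers of RefinementStageBlock)
block = [(1, 1), (3, 1), (3, 2)]
print(receptive_field(block))      # 7
print(receptive_field([(7, 1)]))   # 7, a single 7*7 convolution
```

Under this rule the block covers the same 7*7 window as one 7*7 convolution while using much smaller kernels.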

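3 Training procedure

Training is done in three steps (not to be confused with the initial stage and the refinement stages of the network).

a Train Lightweight OpenPose with 1 refinement stage (initial stage + stage 1), starting from the pretrained MobileNet V1 weights. mAP is about 38% after this step.

b Continue training Lightweight OpenPose from the result of a. mAP is about 39% after this step.

c Starting from the result of b, set the number of refinement stages to 3 (initial stage + stage 1 + stage 2 + stage 3) and continue training. At test time only the output of stage 1 is used to estimate the pose. mAP is about 40% after this step.

Notes:

a Each step resumes directly from the last checkpoint of the previous step, without changing the learning rate or other hyperparameters.

b To save time, validation in each step can be run on a subset of the validation set only (the difference from the full validation set is very small).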

4 Code

4.1 Overall network structure

The main network code is as follows:

import torch
from torch import nn


class PoseEstimationWithMobileNet(nn.Module):
    def __init__(self, num_refinement_stages=1, num_channels=128, num_heatmaps=19, num_pafs=38):
        super().__init__()
        # MobileNet V1 backbone.
        # conv:    conv+BN+ReLU
        # conv_dw: dw_conv(in,in, stride)+BN+ReLU + conv(in,out)+BN+ReLU
        self.model = nn.Sequential(
            conv(     3,  32, stride=2, bias=False),
            conv_dw( 32,  64),
            conv_dw( 64, 128, stride=2),
            conv_dw(128, 128),
            conv_dw(128, 256, stride=2),
            conv_dw(256, 256),
            conv_dw(256, 512),                         # conv4_2
            conv_dw(512, 512, dilation=2, padding=2),  # dilated convolution
            conv_dw(512, 512),
            conv_dw(512, 512),
            conv_dw(512, 512),
            conv_dw(512, 512)                          # conv5_5
        )
        self.cpm = Cpm(512, num_channels)              # channel-reduction module

        self.initial_stage = InitialStage(num_channels, num_heatmaps, num_pafs)  # initial stage
        self.refinement_stages = nn.ModuleList()
        for idx in range(num_refinement_stages):
            # refinement stage; its input is backbone features + heatmaps + PAFs
            self.refinement_stages.append(RefinementStage(num_channels + num_heatmaps + num_pafs, num_channels, num_heatmaps, num_pafs))

    def forward(self, x):
        backbone_features = self.model(x)
        backbone_features = self.cpm(backbone_features)

        stages_output = self.initial_stage(backbone_features)
        for refinement_stage in self.refinement_stages:
            stages_output.extend(refinement_stage(torch.cat([backbone_features, stages_output[-2], stages_output[-1]], dim=1)))

        return stages_output

The MobileNet V1 backbone outputs 512 channels, so a Cpm module reduces them to 128 channels:

class Cpm(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.align = conv(in_channels, out_channels, kernel_size=1, padding=0, bn=False)  # 1*1conv+ReLU
        self.trunk = nn.Sequential(
            conv_dw_no_bn(out_channels, out_channels),                                    # dw_conv(in,in)+ELU + conv(in,out)+ELU
            conv_dw_no_bn(out_channels, out_channels),                                    # dw_conv(in,in)+ELU + conv(in,out)+ELU
            conv_dw_no_bn(out_channels, out_channels)                                     # dw_conv(in,in)+ELU + conv(in,out)+ELU
        )
        self.conv = conv(out_channels, out_channels, bn=False)                            # conv+ReLU

    def forward(self, x):
        x = self.align(x)
        x = self.conv(x + self.trunk(x))  # residual connection around the trunk
        return x
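To see what the backbone does to the spatial resolution, the following sketch traces one side of the input through the strides, paddings and dilations listed above, using the output-size formula documented for nn.Conv2d. The input side of 368 is taken from the default crop size of CropPad in 4.7.4; the helper conv_out is a name invented for this illustration.

```python
def conv_out(size, kernel=3, stride=1, padding=1, dilation=1):
    """Output size of a convolution along one axis (nn.Conv2d formula)."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1


# (stride, dilation, padding) of the 3*3 convolution in each of the 12 backbone layers
backbone = [(2, 1, 1), (1, 1, 1), (2, 1, 1), (1, 1, 1), (2, 1, 1), (1, 1, 1),
            (1, 1, 1), (1, 2, 2), (1, 1, 1), (1, 1, 1), (1, 1, 1), (1, 1, 1)]

size = 368
for stride, dilation, padding in backbone:
    size = conv_out(size, stride=stride, padding=padding, dilation=dilation)

print(size)         # 46, i.e. 368 / 8
refine_in = 128 + 19 + 38
print(refine_in)    # 185 input channels for every refinement stage
```

Only three layers use stride 2, so the feature map is 1/8 of the input, which matches stride = 8 in the training code. The dilated layer keeps the resolution because its padding of 2 compensates for the larger effective kernel.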

4.2 initial stage

from torch import nn


class InitialStage(nn.Module):
    def __init__(self, num_channels, num_heatmaps, num_pafs):
        super().__init__()
        self.trunk = nn.Sequential(                                                     # trunk shared by both branches
            conv(num_channels, num_channels, bn=False),                                 # conv+ReLU
            conv(num_channels, num_channels, bn=False),                                 # conv+ReLU
            conv(num_channels, num_channels, bn=False)                                  # conv+ReLU
        )
        self.heatmaps = nn.Sequential(                                                  # heatmaps branch
            conv(num_channels, 512, kernel_size=1, padding=0, bn=False),                # 1*1conv+ReLU
            conv(512, num_heatmaps, kernel_size=1, padding=0, bn=False, relu=False)     # 1*1conv
        )
        self.pafs = nn.Sequential(                                                      # PAFs branch
            conv(num_channels, 512, kernel_size=1, padding=0, bn=False),                # 1*1conv+ReLU
            conv(512, num_pafs, kernel_size=1, padding=0, bn=False, relu=False)         # 1*1conv
        )

    def forward(self, x):
        trunk_features = self.trunk(x)
        heatmaps = self.heatmaps(trunk_features)
        pafs = self.pafs(trunk_features)
        return [heatmaps, pafs]
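The default channel counts num_heatmaps=19 and num_pafs=38 can be derived from values that appear later in coco.py (section 4.9): 18 keypoints plus one background map, and the 19 limbs in BODY_PARTS_KPT_IDS. The factor of 2 per limb assumes each PAF is stored as an x and a y component, as is usual for part affinity fields; the PAF generation code itself is not shown in this post.

```python
# Copied from coco.py (section 4.9): each pair is the two keypoints a limb connects
BODY_PARTS_KPT_IDS = [[1, 8], [8, 9], [9, 10], [1, 11], [11, 12], [12, 13], [1, 2], [2, 3], [3, 4], [2, 16],
                      [1, 5], [5, 6], [6, 7], [5, 17], [1, 0], [0, 14], [0, 15], [14, 16], [15, 17]]

n_keypoints = 18                             # n_keypoints in _generate_keypoint_maps
num_heatmaps = n_keypoints + 1               # +1 for the background map
num_pafs = 2 * len(BODY_PARTS_KPT_IDS)       # assumed: x and y component per limb

print(num_heatmaps, num_pafs)                # 19 38
```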

4.3 refine stage

The refinement stage consists of 5 RefinementStageBlocks with the same structure; together they form the trunk shared by the heatmaps and PAFs branches. Each RefinementStageBlock is the block described in 2.4.

from torch import nn


class RefinementStageBlock(nn.Module):
    def __init__(self, in_channels, out_channels):
        super().__init__()
        self.initial = conv(in_channels, out_channels, kernel_size=1, padding=0, bn=False)  # 1*1conv+ReLU
        self.trunk = nn.Sequential(
            conv(out_channels, out_channels),                                               # conv+BN+ReLU
            conv(out_channels, out_channels, dilation=2, padding=2)                         # dilated conv+BN+ReLU
        )

    def forward(self, x):
        initial_features = self.initial(x)
        trunk_features = self.trunk(initial_features)
        return initial_features + trunk_features  # two 3*3 convs replace the 7*7 conv, as in the paper


class RefinementStage(nn.Module):
    def __init__(self, in_channels, out_channels, num_heatmaps, num_pafs):
        super().__init__()
        self.trunk = nn.Sequential(                                                            # trunk shared by both branches
            RefinementStageBlock(in_channels, out_channels),
            RefinementStageBlock(out_channels, out_channels),
            RefinementStageBlock(out_channels, out_channels),
            RefinementStageBlock(out_channels, out_channels),
            RefinementStageBlock(out_channels, out_channels)
        )
        self.heatmaps = nn.Sequential(                                                         # heatmaps branch
            conv(out_channels, out_channels, kernel_size=1, padding=0, bn=False),              # 1*1conv+ReLU
            conv(out_channels, num_heatmaps, kernel_size=1, padding=0, bn=False, relu=False)   # 1*1conv
        )
        self.pafs = nn.Sequential(                                                             # PAFs branch
            conv(out_channels, out_channels, kernel_size=1, padding=0, bn=False),              # 1*1conv+ReLU
            conv(out_channels, num_pafs, kernel_size=1, padding=0, bn=False, relu=False)       # 1*1conv
        )

    def forward(self, x):
        trunk_features = self.trunk(x)
        heatmaps = self.heatmaps(trunk_features)
        pafs = self.pafs(trunk_features)
        return [heatmaps, pafs]

4.4 Custom conv helpers

The conv building blocks used in the networks above are defined as follows:

from torch import nn


def conv(in_channels, out_channels, kernel_size=3, padding=1, bn=True, dilation=1, stride=1, relu=True, bias=True):
    # conv, optionally followed by BatchNorm and ReLU
    modules = [nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding, dilation, bias=bias)]
    if bn:
        modules.append(nn.BatchNorm2d(out_channels))
    if relu:
        modules.append(nn.ReLU(inplace=True))
    return nn.Sequential(*modules)


def conv_dw(in_channels, out_channels, kernel_size=3, padding=1, stride=1, dilation=1):
    # depthwise conv (groups=in_channels) + BN + ReLU, then 1*1 pointwise conv + BN + ReLU
    return nn.Sequential(
        nn.Conv2d(in_channels, in_channels, kernel_size, stride, padding, dilation=dilation, groups=in_channels, bias=False),
        nn.BatchNorm2d(in_channels),
        nn.ReLU(inplace=True),

        nn.Conv2d(in_channels, out_channels, 1, 1, 0, bias=False),
        nn.BatchNorm2d(out_channels),
        nn.ReLU(inplace=True),
    )


def conv_dw_no_bn(in_channels, out_channels, kernel_size=3, padding=1, stride=1, dilation=1):
    # same as conv_dw, but without BatchNorm and with ELU instead of ReLU
    return nn.Sequential(
        nn.Conv2d(in_channels, in_channels, kernel_size, stride, padding, dilation=dilation, groups=in_channels, bias=False),
        nn.ELU(inplace=True),

        nn.Conv2d(in_channels, out_channels, 1, 1, 0, bias=False),
        nn.ELU(inplace=True),
    )
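To get a feeling for why conv_dw is cheaper than a plain conv, the sketch below counts the convolution weights of both variants for a 512 -> 512 layer with a 3*3 kernel. It counts weights only (no bias, no BatchNorm parameters), and both helper names are invented for this illustration.

```python
def conv_params(c_in, c_out, k=3):
    """Weights of a standard k*k convolution (bias ignored)."""
    return c_in * c_out * k * k


def conv_dw_params(c_in, c_out, k=3):
    """Weights of a depthwise k*k convolution followed by a 1*1 pointwise convolution."""
    depthwise = c_in * k * k      # one k*k filter per input channel (groups=in_channels)
    pointwise = c_in * c_out      # 1*1 convolution that mixes the channels
    return depthwise + pointwise


standard = conv_params(512, 512)
separable = conv_dw_params(512, 512)
print(standard, separable)        # 2359296 266752
```

For this layer size the depthwise separable variant needs roughly 9 times fewer weights.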

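The ELU activation function is defined as ELU(x) = x for x > 0 and alpha * (exp(x) - 1) for x <= 0, where alpha defaults to 1 in nn.ELU.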
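As a small plain-Python sketch of that definition (a scalar version for illustration only, not how nn.ELU is implemented):

```python
import math


def elu(x, alpha=1.0):
    """Scalar ELU: identity for positive inputs, alpha * (exp(x) - 1) otherwise."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


for x in (-3.0, -1.0, 0.0, 2.0):
    print(x, elu(x))
```

Unlike ReLU, negative inputs are not clamped to zero; they saturate smoothly towards -alpha.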

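4.5 Loss function

The loss function of the network is shown below. COCO leaves some very small people without keypoint annotations, so the mask is set to 0 at those locations to keep them from interfering with training.

def l2_loss(input, target, mask, batch_size):
    loss = (input - target) * mask          # zero out the error wherever mask is 0
    loss = (loss * loss) / 2 / batch_size   # halved squared error, divided by the batch size

    return loss.sum()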
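The effect of the mask is easy to check on a few numbers. The following sketch re-implements the same arithmetic on flat Python lists (the values are made up, and l2_loss_lists is a hypothetical name); the third position has a large error but is masked out.

```python
def l2_loss_lists(pred, target, mask, batch_size):
    """Same arithmetic as l2_loss above, written for flat Python lists."""
    total = 0.0
    for p, t, m in zip(pred, target, mask):
        diff = (p - t) * m
        total += diff * diff / 2 / batch_size
    return total


pred = [0.9, 0.2, 0.7, 0.0]
target = [1.0, 0.0, 0.0, 0.0]
mask = [1.0, 1.0, 0.0, 1.0]            # third position belongs to an unannotated person

masked = l2_loss_lists(pred, target, mask, batch_size=1)
unmasked = l2_loss_lists(pred, target, [1.0] * 4, batch_size=1)
print(masked, unmasked)                # about 0.025 versus about 0.27
```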

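In the figure of the original post, (a) is the image and (b) is mask_miss. COCO annotates people in the distance but provides no keypoint information for them, and mask_miss exists to keep these people from interfering with training. The mask of all people minus mask_miss gives the mask used above.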

4.6 train

train uses the ConvertKeypoints, Scale, Rotate, CropPad and Flip transformations; see 4.7.

from torch import optim
import torch
from torch.nn import DataParallel
from torch.utils.data import DataLoader
from torchvision import transforms


def train(prepared_train_labels, train_images_folder, num_refinement_stages, base_lr, batch_size, batches_per_iter,
          num_workers, checkpoint_path, weights_only, from_mobilenet, checkpoints_folder, log_after,
          val_labels, val_images_folder, val_output_name, checkpoint_after, val_after):
    net = PoseEstimationWithMobileNet(num_refinement_stages)

    stride = 8  # ratio between input image size and feature-map size
    sigma = 7  # standard deviation of the Gaussian used to generate keypoint heatmaps
    path_thickness = 1  # limb width used to generate PAFs
    dataset = CocoTrainDataset(prepared_train_labels, train_images_folder,
                               stride, sigma, path_thickness,
                               transform=transforms.Compose([
                                   ConvertKeypoints(),
                                   Scale(),
                                   Rotate(pad=(128, 128, 128)),
                                   CropPad(pad=(128, 128, 128)),
                                   Flip()]))
    train_loader = DataLoader(dataset, batch_size=batch_size, shuffle=True, num_workers=num_workers)

    # per-group learning rates and weight decay
    optimizer = optim.Adam([
        {'params': get_parameters_conv(net.model, 'weight')},
        {'params': get_parameters_conv_depthwise(net.model, 'weight'), 'weight_decay': 0},
        {'params': get_parameters_bn(net.model, 'weight'), 'weight_decay': 0},
        {'params': get_parameters_bn(net.model, 'bias'), 'lr': base_lr * 2, 'weight_decay': 0},
        {'params': get_parameters_conv(net.cpm, 'weight'), 'lr': base_lr},
        {'params': get_parameters_conv(net.cpm, 'bias'), 'lr': base_lr * 2, 'weight_decay': 0},
        {'params': get_parameters_conv_depthwise(net.cpm, 'weight'), 'weight_decay': 0},
        {'params': get_parameters_conv(net.initial_stage, 'weight'), 'lr': base_lr},
        {'params': get_parameters_conv(net.initial_stage, 'bias'), 'lr': base_lr * 2, 'weight_decay': 0},
        {'params': get_parameters_conv(net.refinement_stages, 'weight'), 'lr': base_lr * 4},
        {'params': get_parameters_conv(net.refinement_stages, 'bias'), 'lr': base_lr * 8, 'weight_decay': 0},
        {'params': get_parameters_bn(net.refinement_stages, 'weight'), 'weight_decay': 0},
        {'params': get_parameters_bn(net.refinement_stages, 'bias'), 'lr': base_lr * 2, 'weight_decay': 0},
    ], lr=base_lr, weight_decay=5e-4)

    num_iter = 0
    current_epoch = 0
    drop_after_epoch = [100, 200, 260]
    scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=drop_after_epoch, gamma=0.333)
    if checkpoint_path:
        checkpoint = torch.load(checkpoint_path)
        if from_mobilenet:
            load_from_mobilenet(net, checkpoint)
        else:
            load_state(net, checkpoint)
            if not weights_only:
                optimizer.load_state_dict(checkpoint['optimizer'])
                scheduler.load_state_dict(checkpoint['scheduler'])
                num_iter = checkpoint['iter']
                current_epoch = checkpoint['current_epoch']

    net = DataParallel(net).cuda()
    net.train()
    for epochId in range(current_epoch, 280):
        scheduler.step()
        total_losses = [0, 0] * (num_refinement_stages + 1)  # heatmaps loss, paf loss per stage (initial stage + refinement stages)
        batch_per_iter_idx = 0
        for batch_data in train_loader:
            if batch_per_iter_idx == 0:
                optimizer.zero_grad()

            images = batch_data['image'].cuda()
            keypoint_masks = batch_data['keypoint_mask'].cuda()
            paf_masks = batch_data['paf_mask'].cuda()
            keypoint_maps = batch_data['keypoint_maps'].cuda()
            paf_maps = batch_data['paf_maps'].cuda()

            stages_output = net(images)

            losses = []
            for loss_idx in range(len(total_losses) // 2):
                losses.append(l2_loss(stages_output[loss_idx * 2], keypoint_maps, keypoint_masks, images.shape[0]))  # output 2i: heatmaps
                losses.append(l2_loss(stages_output[loss_idx * 2 + 1], paf_maps, paf_masks, images.shape[0]))   # output 2i+1: PAFs
                total_losses[loss_idx * 2] += losses[-2].item() / batches_per_iter  # accumulate for logging
                total_losses[loss_idx * 2 + 1] += losses[-1].item() / batches_per_iter  # accumulate for logging

            loss = losses[0]
            for loss_idx in range(1, len(losses)):
                loss += losses[loss_idx]  # sum the losses of all stages
            loss /= batches_per_iter  # average over the accumulated batches
            loss.backward()
            batch_per_iter_idx += 1
            if batch_per_iter_idx == batches_per_iter:  # update only after batches_per_iter batches
                optimizer.step()
                batch_per_iter_idx = 0
                num_iter += 1
            else:
                continue

            if num_iter % log_after == 0:
                print('Iter: {}'.format(num_iter))
                for loss_idx in range(len(total_losses) // 2):
                    print('\n'.join(['stage{}_pafs_loss:     {}', 'stage{}_heatmaps_loss: {}']).format(
                        loss_idx + 1, total_losses[loss_idx * 2 + 1] / log_after, loss_idx + 1, total_losses[loss_idx * 2] / log_after))
                for loss_idx in range(len(total_losses)):
                    total_losses[loss_idx] = 0
            if num_iter % checkpoint_after == 0:
                snapshot_name = '{}/checkpoint_iter_{}.pth'.format(checkpoints_folder, num_iter)
                torch.save({'state_dict': net.module.state_dict(),
                            'optimizer': optimizer.state_dict(),
                            'scheduler': scheduler.state_dict(),
                            'iter': num_iter,
                            'current_epoch': epochId},
                           snapshot_name)
            # validation is commented out here
            # if num_iter % val_after == 0:
            #     print('Validation...')
            #     evaluate(val_labels, val_output_name, val_images_folder, net)
            #     net.train()
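The learning-rate schedule in train can be summarized in closed form: every parameter group starts from its own initial rate (base_lr, base_lr * 2, base_lr * 4 or base_lr * 8) and is multiplied by gamma once for each milestone the scheduler's epoch counter has reached. The sketch below illustrates this without PyTorch. The function name multistep_lr and the value of base_lr are assumptions for illustration, and scheduler_epoch stands for the scheduler's internal counter, which may be offset by one from epochId depending on the PyTorch version because scheduler.step() is called at the start of each epoch.

```python
from bisect import bisect_right

MILESTONES = (100, 200, 260)   # drop_after_epoch in train()
GAMMA = 0.333


def multistep_lr(initial_lr, scheduler_epoch, milestones=MILESTONES, gamma=GAMMA):
    """Multi-step schedule: multiply by gamma once per milestone already reached."""
    return initial_lr * gamma ** bisect_right(milestones, scheduler_epoch)


base_lr = 4e-5  # hypothetical value, for illustration only
for epoch in (0, 99, 100, 200, 260):
    # backbone weights use base_lr, refinement-stage weights use base_lr * 4
    print(epoch, multistep_lr(base_lr, epoch), multistep_lr(base_lr * 4, epoch))
```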

4.7 transformations

The transformations are ConvertKeypoints, Scale, Rotate, CropPad and Flip.

4.7.1 ConvertKeypoints

ConvertKeypoints converts the COCO keypoint order to the keypoint order used in the code.

class ConvertKeypoints(object):
    def __call__(self, sample):
        label = sample['label']
        h, w, _ = sample['image'].shape
        keypoints = label['keypoints']
        for keypoint in keypoints:  # keypoint[2] == 0: occluded, == 1: visible, == 2: not in image
            if keypoint[0] == keypoint[1] == 0:
                keypoint[2] = 2
            if (keypoint[0] < 0 or keypoint[0] >= w or keypoint[1] < 0 or keypoint[1] >= h):
                keypoint[2] = 2
        for other_label in label['processed_other_annotations']:
            keypoints = other_label['keypoints']
            for keypoint in keypoints:
                if keypoint[0] == keypoint[1] == 0:
                    keypoint[2] = 2
                if (keypoint[0] < 0 or keypoint[0] >= w or keypoint[1] < 0 or keypoint[1] >= h):
                    keypoint[2] = 2
        label['keypoints'] = self._convert(label['keypoints'], w, h)  # reorder to the order used in the code and add the neck

        for other_label in label['processed_other_annotations']:
            other_label['keypoints'] = self._convert(other_label['keypoints'], w, h)  # reorder and add the neck
        return sample

    def _convert(self, keypoints, w, h):
        # Nose, Neck, R hand, L hand, R leg, L leg, Eyes, Ears
        reorder_map = [1, 7, 9, 11, 6, 8, 10, 13, 15, 17, 12, 14, 16, 3, 2, 5, 4]  # 1-based COCO index for each target position
        converted_keypoints = list(keypoints[i - 1] for i in reorder_map)  # reorder to the order used in the code
        # Add neck as a mean of shoulders
        converted_keypoints.insert(1, [(keypoints[5][0] + keypoints[6][0]) / 2, (keypoints[5][1] + keypoints[6][1]) / 2, 0])
        if keypoints[5][2] == 2 and keypoints[6][2] == 2:
            converted_keypoints[1][2] = 2
        elif keypoints[5][2] == 3 and keypoints[6][2] == 3:
            converted_keypoints[1][2] = 3
        elif keypoints[5][2] == 1 and keypoints[6][2] == 1:
            converted_keypoints[1][2] = 1
        if (converted_keypoints[1][0] < 0 or converted_keypoints[1][0] >= w or converted_keypoints[1][1] < 0 or converted_keypoints[1][1] >= h):
            converted_keypoints[1][2] = 2
        return converted_keypoints

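The keypoint orders of COCO and of the code are shown in the figure of the original post. The conversion uses each value in reorder_map minus 1 as an index into the COCO keypoints and then inserts the neck.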
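Since the figure is not reproduced here, the mapping can be spelled out by applying reorder_map to the standard COCO keypoint names. This is only an illustrative sketch: the name list follows the usual COCO keypoint order, and the real code reorders [x, y, visibility] triples rather than names.

```python
COCO_NAMES = ['nose', 'left_eye', 'right_eye', 'left_ear', 'right_ear',
              'left_shoulder', 'right_shoulder', 'left_elbow', 'right_elbow',
              'left_wrist', 'right_wrist', 'left_hip', 'right_hip',
              'left_knee', 'right_knee', 'left_ankle', 'right_ankle']

reorder_map = [1, 7, 9, 11, 6, 8, 10, 13, 15, 17, 12, 14, 16, 3, 2, 5, 4]

converted = [COCO_NAMES[i - 1] for i in reorder_map]  # values are 1-based, hence i - 1
converted.insert(1, 'neck')                           # neck is not a COCO keypoint

for idx, name in enumerate(converted):
    print(idx, name)
```

The result has 18 entries: nose, neck, the right arm, the left arm, the right leg, the left leg, then the eyes and ears.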

4.7.2 Scale

Scale rescales the image and the keypoint coordinates.

import random

import cv2


class Scale(object):
    def __init__(self, prob=1, min_scale=0.5, max_scale=1.1, target_dist=0.6):
        self._prob = prob
        self._min_scale = min_scale
        self._max_scale = max_scale
        self._target_dist = target_dist

    def __call__(self, sample):
        prob = random.random()
        scale_multiplier = 1
        if prob <= self._prob:
            prob = random.random()
            scale_multiplier = (self._max_scale - self._min_scale) * prob + self._min_scale  # uniform in [min_scale, max_scale)
        label = sample['label']
        scale_abs = self._target_dist / label['scale_provided']
        scale = scale_abs * scale_multiplier
        sample['image'] = cv2.resize(sample['image'], dsize=(0, 0), fx=scale, fy=scale)
        label['img_height'], label['img_width'], _ = sample['image'].shape
        sample['mask'] = cv2.resize(sample['mask'], dsize=(0, 0), fx=scale, fy=scale)

        # scale the person center and all keypoints by the same factor
        label['objpos'][0] *= scale
        label['objpos'][1] *= scale
        for keypoint in sample['label']['keypoints']:
            keypoint[0] *= scale
            keypoint[1] *= scale
        for other_annotation in sample['label']['processed_other_annotations']:
            other_annotation['objpos'][0] *= scale
            other_annotation['objpos'][1] *= scale
            for keypoint in other_annotation['keypoints']:
                keypoint[0] *= scale
                keypoint[1] *= scale
        return sample
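The scale factor is the product of a deterministic part (target_dist / scale_provided) and a random multiplier drawn uniformly from [min_scale, max_scale). The sketch below reproduces only this arithmetic for prob=1. The function name sample_scale and the value scale_provided = 1.2 are made up for illustration; the real value comes from the label preparation script, which is not shown in this post.

```python
import random


def sample_scale(scale_provided, target_dist=0.6, min_scale=0.5, max_scale=1.1, rng=random):
    """Mirror the scale arithmetic of Scale.__call__ for prob=1."""
    scale_multiplier = (max_scale - min_scale) * rng.random() + min_scale
    scale_abs = target_dist / scale_provided
    return scale_abs * scale_multiplier


rng = random.Random(0)  # seeded generator so the run is repeatable
scales = [sample_scale(1.2, rng=rng) for _ in range(1000)]
print(min(scales), max(scales))  # both stay within [0.25, 0.55]
```

With scale_provided = 1.2 the deterministic part is 0.5, so the sampled factor lies between 0.5 * 0.5 and 0.5 * 1.1.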

4.7.3 Rotate

Rotate rotates the image and the keypoint coordinates.

import random

import cv2


class Rotate(object):
    def __init__(self, pad, max_rotate_degree=40):
        self._pad = pad
        self._max_rotate_degree = max_rotate_degree

    def __call__(self, sample):
        prob = random.random()
        degree = (prob - 0.5) * 2 * self._max_rotate_degree  # uniform in [-max_rotate_degree, max_rotate_degree)
        h, w, _ = sample['image'].shape
        img_center = (w / 2, h / 2)
        R = cv2.getRotationMatrix2D(img_center, degree, 1)

        abs_cos = abs(R[0, 0])
        abs_sin = abs(R[0, 1])

        # size of the bounding box that contains the whole rotated image
        bound_w = int(h * abs_sin + w * abs_cos)
        bound_h = int(h * abs_cos + w * abs_sin)
        dsize = (bound_w, bound_h)

        # shift so that the image center lands at the center of the new canvas
        R[0, 2] += dsize[0] / 2 - img_center[0]
        R[1, 2] += dsize[1] / 2 - img_center[1]
        sample['image'] = cv2.warpAffine(sample['image'], R, dsize=dsize, borderMode=cv2.BORDER_CONSTANT, borderValue=self._pad)
        sample['label']['img_height'], sample['label']['img_width'], _ = sample['image'].shape
        sample['mask'] = cv2.warpAffine(sample['mask'], R, dsize=dsize, borderMode=cv2.BORDER_CONSTANT, borderValue=(1, 1, 1))  # border is ok
        label = sample['label']
        label['objpos'] = self._rotate(label['objpos'], R)  # rotate the person center
        for keypoint in label['keypoints']:
            point = [keypoint[0], keypoint[1]]
            point = self._rotate(point, R)  # rotate the keypoint coordinates
            keypoint[0], keypoint[1] = point[0], point[1]
        for other_annotation in label['processed_other_annotations']:
            for keypoint in other_annotation['keypoints']:
                point = [keypoint[0], keypoint[1]]
                point = self._rotate(point, R)  # rotate the keypoint coordinates
                keypoint[0], keypoint[1] = point[0], point[1]
        return sample

    def _rotate(self, point, R):
        # apply the 2x3 affine matrix R to a 2D point
        return [R[0, 0] * point[0] + R[0, 1] * point[1] + R[0, 2], R[1, 0] * point[0] + R[1, 1] * point[1] + R[1, 2]]
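The geometry of Rotate can be checked without OpenCV. The sketch below builds the 2x3 matrix from the formula given in the OpenCV documentation for getRotationMatrix2D (with scale 1), enlarges the canvas in the same way as the code above, and confirms that the image center lands at the center of the new canvas. All function names and the 400*200 image size are chosen for this illustration.

```python
import math


def rotation_matrix(center, degree):
    """2x3 rotation matrix, following the formula documented for cv2.getRotationMatrix2D (scale = 1)."""
    cx, cy = center
    a = math.cos(math.radians(degree))
    b = math.sin(math.radians(degree))
    return [[a, b, (1 - a) * cx - b * cy],
            [-b, a, b * cx + (1 - a) * cy]]


def rotate_with_bounds(w, h, degree):
    """Return the shifted matrix and the enlarged canvas size, as in Rotate.__call__."""
    R = rotation_matrix((w / 2, h / 2), degree)
    abs_cos, abs_sin = abs(R[0][0]), abs(R[0][1])
    bound_w = int(h * abs_sin + w * abs_cos)
    bound_h = int(h * abs_cos + w * abs_sin)
    R[0][2] += bound_w / 2 - w / 2
    R[1][2] += bound_h / 2 - h / 2
    return R, (bound_w, bound_h)


def apply_affine(R, point):
    """Same arithmetic as Rotate._rotate."""
    x, y = point
    return [R[0][0] * x + R[0][1] * y + R[0][2],
            R[1][0] * x + R[1][1] * y + R[1][2]]


R, (bound_w, bound_h) = rotate_with_bounds(400, 200, 30)
center = apply_affine(R, (200, 100))
print(bound_w, bound_h, center)
```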

4.7.4 CropPad

CropPad takes a random crop of fixed size around the person center and pads the area that falls outside the image.

import random

import numpy as np


class CropPad(object):
    def __init__(self, pad, center_perterb_max=40, crop_x=368, crop_y=368):
        self._pad = pad
        self._center_perterb_max = center_perterb_max
        self._crop_x = crop_x
        self._crop_y = crop_y

    def __call__(self, sample):
        prob_x = random.random()
        prob_y = random.random()

        # random shift of the crop center, within +/- center_perterb_max pixels
        offset_x = int((prob_x - 0.5) * 2 * self._center_perterb_max)
        offset_y = int((prob_y - 0.5) * 2 * self._center_perterb_max)
        label = sample['label']
        shifted_center = (label['objpos'][0] + offset_x, label['objpos'][1] + offset_y)
        offset_left = -int(shifted_center[0] - self._crop_x / 2)
        offset_up = -int(shifted_center[1] - self._crop_y / 2)

        # start from a canvas filled with the pad color and a mask filled with 1
        cropped_image = np.empty(shape=(self._crop_y, self._crop_x, 3), dtype=np.uint8)
        for i in range(3):
            cropped_image[:, :, i].fill(self._pad[i])
        cropped_mask = np.empty(shape=(self._crop_y, self._crop_x), dtype=np.uint8)
        cropped_mask.fill(1)

        image_x_start = int(shifted_center[0] - self._crop_x / 2)
        image_y_start = int(shifted_center[1] - self._crop_y / 2)
        image_x_finish = image_x_start + self._crop_x
        image_y_finish = image_y_start + self._crop_y
        crop_x_start = 0
        crop_y_start = 0
        crop_x_finish = self._crop_x
        crop_y_finish = self._crop_y

        w, h = label['img_width'], label['img_height']
        should_crop = True
        if image_x_start < 0:  # Adjust crop area
            crop_x_start -= image_x_start
            image_x_start = 0
        if image_x_start >= w:
            should_crop = False

        if image_y_start < 0:
            crop_y_start -= image_y_start
            image_y_start = 0
        if image_y_start >= h:  # compare with the height (the listing originally compared with w)
            should_crop = False

        if image_x_finish > w:
            diff = image_x_finish - w
            image_x_finish -= diff
            crop_x_finish -= diff
        if image_x_finish < 0:
            should_crop = False

        if image_y_finish > h:
            diff = image_y_finish - h
            image_y_finish -= diff
            crop_y_finish -= diff
        if image_y_finish < 0:
            should_crop = False

        if should_crop:
            cropped_image[crop_y_start:crop_y_finish, crop_x_start:crop_x_finish, :] =\
                sample['image'][image_y_start:image_y_finish, image_x_start:image_x_finish, :]
            cropped_mask[crop_y_start:crop_y_finish, crop_x_start:crop_x_finish] =\
                sample['mask'][image_y_start:image_y_finish, image_x_start:image_x_finish]

        sample['image'] = cropped_image
        sample['mask'] = cropped_mask
        label['img_width'] = self._crop_x
        label['img_height'] = self._crop_y

        # move the person center and all keypoints into crop coordinates
        label['objpos'][0] += offset_left
        label['objpos'][1] += offset_up
        for keypoint in label['keypoints']:
            keypoint[0] += offset_left
            keypoint[1] += offset_up
        for other_annotation in label['processed_other_annotations']:
            for keypoint in other_annotation['keypoints']:
                keypoint[0] += offset_left
                keypoint[1] += offset_up

        return sample

    def _inside(self, point, width, height):
        if point[0] < 0 or point[1] < 0:
            return False
        if point[0] >= width or point[1] >= height:
            return False
        return True
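The start/finish bookkeeping in CropPad is the same for x and y, so it can be studied along a single axis. The following sketch isolates that logic in a hypothetical helper crop_window (not part of the official code) and shows which slice of the image is copied into which slice of the 368-pixel canvas.

```python
def crop_window(center, crop_size, image_size):
    """Return (image_start, image_finish, crop_start, crop_finish) along one axis,
    or None when the crop window misses the image entirely."""
    image_start = int(center - crop_size / 2)
    image_finish = image_start + crop_size
    crop_start, crop_finish = 0, crop_size
    if image_start < 0:               # window starts before the image: skip that part of the canvas
        crop_start -= image_start
        image_start = 0
    if image_finish > image_size:     # window ends after the image: shorten both slices
        diff = image_finish - image_size
        image_finish -= diff
        crop_finish -= diff
    if image_start >= image_size or image_finish < 0:
        return None
    return image_start, image_finish, crop_start, crop_finish


print(crop_window(10, 368, 100))     # (0, 100, 174, 274): a 100-pixel image is pasted at offset 174
print(crop_window(184, 368, 368))    # (0, 368, 0, 368): the window matches the image exactly
print(crop_window(1000, 368, 100))   # None: the window lies completely outside the image
```

In every valid case the image slice and the canvas slice have the same length, which is what makes the array assignment in CropPad well defined.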

4.7.5 Flip

Flip mirrors the image horizontally during training. Only the keypoint positions need to be mirrored and the left and right keypoints swapped (see right and left in _swap_left_right). The PAFs have not been generated yet at this point, so they need no special handling.

import random

import cv2


class Flip(object):
    def __init__(self, prob=0.5):
        self._prob = prob

    def __call__(self, sample):
        prob = random.random()
        do_flip = prob <= self._prob
        if not do_flip:
            return sample

        sample['image'] = cv2.flip(sample['image'], 1)  # flip code 1: horizontal flip
        sample['mask'] = cv2.flip(sample['mask'], 1)

        label = sample['label']
        w, h = label['img_width'], label['img_height']
        label['objpos'][0] = w - 1 - label['objpos'][0]
        for keypoint in label['keypoints']:
            keypoint[0] = w - 1 - keypoint[0]
        label['keypoints'] = self._swap_left_right(label['keypoints'])  # swap left and right keypoints

        for other_annotation in label['processed_other_annotations']:
            other_annotation['objpos'][0] = w - 1 - other_annotation['objpos'][0]   # horizontal mirror: only x changes
            for keypoint in other_annotation['keypoints']:
                keypoint[0] = w - 1 - keypoint[0]
            other_annotation['keypoints'] = self._swap_left_right(other_annotation['keypoints'])   # swap left and right keypoints

        return sample

    def _swap_left_right(self, keypoints):
        right = [2, 3, 4, 8, 9, 10, 14, 16]   # indices of the right-side keypoints
        left = [5, 6, 7, 11, 12, 13, 15, 17]  # indices of the matching left-side keypoints
        for r, l in zip(right, left):
            keypoints[r], keypoints[l] = keypoints[l], keypoints[r]
        return keypoints
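A small worked example makes the two steps of Flip visible: mirroring x as w - 1 - x and swapping the left/right indices. The toy data below places keypoint i at x = i, and flip_keypoints is a hypothetical stand-alone version of the keypoint part of Flip.

```python
def flip_keypoints(keypoints, width):
    """Mirror x coordinates and swap left/right keypoints, as Flip does for one person."""
    right = [2, 3, 4, 8, 9, 10, 14, 16]
    left = [5, 6, 7, 11, 12, 13, 15, 17]
    flipped = [[width - 1 - x, y, v] for x, y, v in keypoints]
    for r, l in zip(right, left):
        flipped[r], flipped[l] = flipped[l], flipped[r]
    return flipped


keypoints = [[float(i), 0.0, 1] for i in range(18)]  # toy data: keypoint i sits at x = i
flipped = flip_keypoints(keypoints, width=100)

print(flipped[0])   # [99.0, 0.0, 1]: the nose is only mirrored
print(flipped[2])   # [94.0, 0.0, 1]: slot 2 now holds the old keypoint 5, mirrored
print(flipped[5])   # [97.0, 0.0, 1]: slot 5 now holds the old keypoint 2, mirrored
```

Applying the function twice returns the original keypoints, which is a quick sanity check for the index lists.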

4.8 val

There is not much to say about the validation code. The main piece is convert_to_coco_format, which converts the detected poses back to the COCO keypoint format.

def convert_to_coco_format(pose_entries, all_keypoints):
    coco_keypoints = []
    scores = []
    for n in range(len(pose_entries)):
        if len(pose_entries[n]) == 0:
            continue
        keypoints = [0] * 17 * 3
        to_coco_map = [0, -1, 6, 8, 10, 5, 7, 9, 12, 14, 16, 11, 13, 15, 2, 1, 4, 3]  # 0-based COCO index for each position, -1 for the neck
        person_score = pose_entries[n][-2]
        position_id = -1
        # The last entry is the number of keypoints assigned to this person and the
        # second to last is the score, so both are skipped here.
        for keypoint_id in pose_entries[n][:-2]:
            position_id += 1
            if position_id == 1:  # no 'neck' in COCO; the neck has index 1 in this code, so skip it
                continue

            cx, cy, score, visibility = 0, 0, 0, 0  # keypoint not found
            if keypoint_id != -1:
                cx, cy, score = all_keypoints[int(keypoint_id), 0:3]
                cx = cx + 0.5
                cy = cy + 0.5
                visibility = 1
            keypoints[to_coco_map[position_id] * 3 + 0] = cx
            keypoints[to_coco_map[position_id] * 3 + 1] = cy
            keypoints[to_coco_map[position_id] * 3 + 2] = visibility
        coco_keypoints.append(keypoints)
        scores.append(person_score * max(0, (pose_entries[n][-1] - 1)))  # -1 for 'neck'
    return coco_keypoints, scores
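to_coco_map is the inverse direction of reorder_map from 4.7.1, which can be verified in a few lines. The sketch also isolates the score formula from the last line of the loop; pose_score is a name invented for this illustration, and the example numbers are made up.

```python
reorder_map = [1, 7, 9, 11, 6, 8, 10, 13, 15, 17, 12, 14, 16, 3, 2, 5, 4]        # from ConvertKeypoints
to_coco_map = [0, -1, 6, 8, 10, 5, 7, 9, 12, 14, 16, 11, 13, 15, 2, 1, 4, 3]     # from convert_to_coco_format

# Rebuild to_coco_map from reorder_map: 1-based COCO ids become 0-based,
# and the neck (position 1) gets -1 because it has no COCO slot.
rebuilt = [coco_id - 1 for coco_id in reorder_map]
rebuilt.insert(1, -1)
print(rebuilt == to_coco_map)  # True


def pose_score(person_score, n_found_keypoints):
    """Score reported for one person: the neck is not counted among the found keypoints."""
    return person_score * max(0, n_found_keypoints - 1)


print(pose_score(10.0, 5))     # 40.0
print(pose_score(10.0, 1))     # 0.0
```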

4.9 gt label的生成

gt label通过coco.py生成，如下。其中BODY_PARTS_KPT_IDS将4.7中openpose的关键点映射到下面的躯干。

  1 BODY_PARTS_KPT_IDS = [[1, 8], [8, 9], [9, 10], [1, 11], [11, 12], [12, 13], [1, 2], [2, 3], [3, 4], [2, 16],
  2                       [1, 5], [5, 6], [6, 7], [5, 17], [1, 0], [0, 14], [0, 15], [14, 16], [15, 17]]
  3 
  4 
  5 def get_mask(segmentations, mask):
  6     for segmentation in segmentations:
  7         rle = pycocotools.mask.frPyObjects(segmentation, mask.shape[0], mask.shape[1])
  8         mask[pycocotools.mask.decode(rle) > 0.5] = 0
  9     return mask
 10 
 11 
class CocoTrainDataset(Dataset):
    def __init__(self, labels, images_folder, stride, sigma, paf_thickness, transform=None):
        super().__init__()
        self._images_folder = images_folder
        self._stride = stride
        self._sigma = sigma
        self._paf_thickness = paf_thickness
        self._transform = transform
        with open(labels, 'rb') as f:
            self._labels = pickle.load(f)

    def __getitem__(self, idx):
        label = copy.deepcopy(self._labels[idx])  # label modified in transform
        image = cv2.imread(os.path.join(self._images_folder, label['img_paths']), cv2.IMREAD_COLOR)
        mask = np.ones(shape=(label['img_height'], label['img_width']), dtype=np.float32)
        mask = get_mask(label['segmentations'], mask)
        sample = {'label': label, 'image': image, 'mask': mask}
        if self._transform:
            sample = self._transform(sample)

        mask = cv2.resize(sample['mask'], dsize=None, fx=1/self._stride, fy=1/self._stride, interpolation=cv2.INTER_AREA)
        keypoint_maps = self._generate_keypoint_maps(sample)  # build the Gaussian heatmaps
        sample['keypoint_maps'] = keypoint_maps
        keypoint_mask = np.zeros(shape=keypoint_maps.shape, dtype=np.float32)  # mask for the heatmaps
        for idx in range(keypoint_mask.shape[0]):
            keypoint_mask[idx] = mask  # copy the real mask into every heatmap channel
        sample['keypoint_mask'] = keypoint_mask

        paf_maps = self._generate_paf_maps(sample)  # build the PAFs
        sample['paf_maps'] = paf_maps
        paf_mask = np.zeros(shape=paf_maps.shape, dtype=np.float32)
        for idx in range(paf_mask.shape[0]):
            paf_mask[idx] = mask  # copy the real mask into every PAF channel
        sample['paf_mask'] = paf_mask

        image = sample['image'].astype(np.float32)
        image = (image - 128) / 256  # normalise
        sample['image'] = image.transpose((2, 0, 1))  # HWC -> CHW (channel order stays BGR)
        return sample

    def __len__(self):
        return len(self._labels)

    def _generate_keypoint_maps(self, sample):
        n_keypoints = 18  # total number of keypoint types
        n_rows, n_cols, _ = sample['image'].shape
        keypoint_maps = np.zeros(shape=(n_keypoints + 1, n_rows // self._stride, n_cols // self._stride), dtype=np.float32)  # +1 for bg (background channel)

        label = sample['label']
        for keypoint_idx in range(n_keypoints):
            keypoint = label['keypoints'][keypoint_idx]
            if keypoint[2] <= 1:
                self._add_gaussian(keypoint_maps[keypoint_idx], keypoint[0], keypoint[1], self._stride, self._sigma)   # add a Gaussian to this keypoint's channel
            for another_annotation in label['processed_other_annotations']:
                keypoint = another_annotation['keypoints'][keypoint_idx]
                if keypoint[2] <= 1:
                    self._add_gaussian(keypoint_maps[keypoint_idx], keypoint[0], keypoint[1], self._stride, self._sigma)   # same for the other people in the image
        keypoint_maps[-1] = 1 - keypoint_maps.max(axis=0)  # background
        return keypoint_maps

    def _add_gaussian(self, keypoint_map, x, y, stride, sigma):
        n_sigma = 4
        tl = [int(x - n_sigma * sigma), int(y - n_sigma * sigma)]  # top-left corner of the 4-sigma window around (x, y)
        tl[0] = max(tl[0], 0)
        tl[1] = max(tl[1], 0)

        br = [int(x + n_sigma * sigma), int(y + n_sigma * sigma)]  # bottom-right corner of the 4-sigma window
        map_h, map_w = keypoint_map.shape  # feature map size
        br[0] = min(br[0], map_w * stride)  # clamp to the image size (feature map size * stride)
        br[1] = min(br[1], map_h * stride)

        shift = stride / 2 - 0.5
        for map_y in range(tl[1] // stride, br[1] // stride):      # y range on the feature map
            for map_x in range(tl[0] // stride, br[0] // stride):  # x range on the feature map
                d2 = (map_x * stride + shift - x) * (map_x * stride + shift - x) + (map_y * stride + shift - y) * (map_y * stride + shift - y)  # squared distance
                exponent = d2 / 2 / sigma / sigma
                if exponent > 4.6052:  # threshold, ln(100), ~0.01
                    continue
                keypoint_map[map_y, map_x] += math.exp(-exponent)   # overlapping Gaussians are summed, not merged with max as in the paper
                if keypoint_map[map_y, map_x] > 1:
                    keypoint_map[map_y, map_x] = 1

    def _generate_paf_maps(self, sample):
        n_pafs = len(BODY_PARTS_KPT_IDS)
        n_rows, n_cols, _ = sample['image'].shape
        paf_maps = np.zeros(shape=(n_pafs * 2, n_rows // self._stride, n_cols // self._stride), dtype=np.float32)

        label = sample['label']
        for paf_idx in range(n_pafs):
            keypoint_a = label['keypoints'][BODY_PARTS_KPT_IDS[paf_idx][0]]  # start keypoint of the current limb
            keypoint_b = label['keypoints'][BODY_PARTS_KPT_IDS[paf_idx][1]]  # end keypoint of the current limb
            if keypoint_a[2] <= 1 and keypoint_b[2] <= 1:  # both endpoints are inside the image: add the PAF
                self._set_paf(paf_maps[paf_idx * 2:paf_idx * 2 + 2], keypoint_a[0], keypoint_a[1], keypoint_b[0], keypoint_b[1], self._stride, self._paf_thickness)
            for another_annotation in label['processed_other_annotations']:
                keypoint_a = another_annotation['keypoints'][BODY_PARTS_KPT_IDS[paf_idx][0]]   # start keypoint of the current limb
                keypoint_b = another_annotation['keypoints'][BODY_PARTS_KPT_IDS[paf_idx][1]]   # end keypoint of the current limb
                if keypoint_a[2] <= 1 and keypoint_b[2] <= 1:   # both endpoints are inside the image: add the PAF
                    self._set_paf(paf_maps[paf_idx * 2:paf_idx * 2 + 2], keypoint_a[0], keypoint_a[1], keypoint_b[0], keypoint_b[1], self._stride, self._paf_thickness)
        return paf_maps

    def _set_paf(self, paf_map, x_a, y_a, x_b, y_b, stride, thickness):
        x_a /= stride  # map image coordinates to feature-map coordinates
        y_a /= stride
        x_b /= stride
        y_b /= stride
        x_ba = x_b - x_a  # extent along x
        y_ba = y_b - y_a  # extent along y
        _, h_map, w_map = paf_map.shape
        x_min = int(max(min(x_a, x_b) - thickness, 0))  # bounding box of the limb, grown by `thickness` pixels on every side
        x_max = int(min(max(x_a, x_b) + thickness, w_map))
        y_min = int(max(min(y_a, y_b) - thickness, 0))
        y_max = int(min(max(y_a, y_b) + thickness, h_map))
        norm_ba = (x_ba * x_ba + y_ba * y_ba) ** 0.5  # length of the start->end vector
        if norm_ba < 1e-7:  # Same points, no paf
            return
        x_ba /= norm_ba  # x component of the start->end unit vector
        y_ba /= norm_ba  # y component of the start->end unit vector

        for y in range(y_min, y_max):  # visit every cell of the bounding box
            for x in range(x_min, x_max):
                x_ca = x - x_a  # vector from the start keypoint to the current cell
                y_ca = y - y_a
                d = math.fabs(x_ca * y_ba - y_ca * x_ba)  # perpendicular distance from the cell to the line through a and b
                if d <= thickness:  # close enough to the limb: write the unit vector into this limb's PAF
                    paf_map[0, y, x] = x_ba
                    paf_map[1, y, x] = y_ba


class CocoValDataset(Dataset):
    def __init__(self, labels, images_folder):
        super().__init__()
        with open(labels, 'r') as f:
            self._labels = json.load(f)
        self._images_folder = images_folder

    def __getitem__(self, idx):
        file_name = self._labels['images'][idx]['file_name']
        img = cv2.imread(os.path.join(self._images_folder, file_name), cv2.IMREAD_COLOR)
        return {'img': img, 'file_name': file_name}

    def __len__(self):
        return len(self._labels['images'])

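To make the cell-centre arithmetic in _add_gaussian concrete, here is a minimal standalone sketch. The helper names cell_centre and gaussian_at, and the values stride=8 and sigma=7, are illustrative assumptions rather than part of the repository; only the formula is taken from the listing.

```python
import math


def cell_centre(map_index, stride):
    """Image-space coordinate of the centre of a feature-map cell."""
    shift = stride / 2 - 0.5  # same shift as in _add_gaussian
    return map_index * stride + shift


def gaussian_at(map_x, map_y, x, y, stride, sigma):
    """Gaussian response of cell (map_x, map_y) for a keypoint at image position (x, y)."""
    dx = cell_centre(map_x, stride) - x
    dy = cell_centre(map_y, stride) - y
    exponent = (dx * dx + dy * dy) / 2 / sigma / sigma
    if exponent > 4.6052:  # ln(100): responses below about 0.01 are dropped
        return 0.0
    return math.exp(-exponent)


# Keypoint at image position (20, 12) with an illustrative stride and sigma.
value_near = gaussian_at(2, 1, x=20, y=12, stride=8, sigma=7)    # cell centred at (19.5, 11.5)
value_far = gaussian_at(10, 10, x=20, y=12, stride=8, sigma=7)   # cell centred at (83.5, 83.5)
print(round(value_near, 4), value_far)
```

The cell whose centre is nearest to the keypoint gets a value close to 1, while cells far outside the window get exactly 0 because of the ln(100) cut-off.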

Note: in the last lines of _add_gaussian, overlapping Gaussian confidence maps are not merged with max as in the paper, but with min(sum(peaks), 1). This matches the official OpenPose training code; see caffe_train-master/src/caffe/cpm_data_transformer.cpp.
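The difference between the two merge rules can be sketched with two overlapping peaks. The function names and the numbers (sigma=7, distances of 5 and 6 pixels) are made up for illustration and do not come from either code base.

```python
import math


def gaussian(d2, sigma):
    """Unnormalised Gaussian response for a squared distance d2."""
    return math.exp(-d2 / (2.0 * sigma * sigma))


def merge_clipped_sum(values):
    """Rule used in _add_gaussian: add the peaks, then clip at 1."""
    return min(sum(values), 1.0)


def merge_max(values):
    """Rule described in the paper: keep the strongest peak."""
    return max(values)


sigma = 7.0
# One pixel that lies 5 px from one keypoint and 6 px from another keypoint of the same type.
responses = [gaussian(5.0 ** 2, sigma), gaussian(6.0 ** 2, sigma)]

clipped = merge_clipped_sum(responses)
strongest = merge_max(responses)
print(f"clipped sum: {clipped:.3f}, max: {strongest:.3f}")
```

With the clipped sum, the region between two nearby keypoints saturates at 1 and the two peaks blur into a plateau; with max, each peak keeps its own shape.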

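On the other hand, the last two lines of _set_paf write the current unit vector directly into the PAFs. If one person's limb occludes (or crosses) the same limb of another person, the overlapping cells keep only one of the two vectors (whichever is written last in traversal order). In practice this situation should be quite rare.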
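A small sketch of this overwrite behaviour, using plain Python lists instead of NumPy arrays. The function set_paf below is a simplified re-implementation written for illustration (coordinates are already in feature-map units), and the 9x9 grid with two crossing limbs is an invented example.

```python
import math


def set_paf(paf_x, paf_y, a, b, thickness):
    """Write the unit vector a->b into every cell within `thickness` of the limb."""
    (x_a, y_a), (x_b, y_b) = a, b
    h, w = len(paf_x), len(paf_x[0])
    x_ba, y_ba = x_b - x_a, y_b - y_a
    norm = math.hypot(x_ba, y_ba)
    if norm < 1e-7:  # identical endpoints: nothing to draw
        return
    x_ba, y_ba = x_ba / norm, y_ba / norm
    x_min = int(max(min(x_a, x_b) - thickness, 0))
    x_max = int(min(max(x_a, x_b) + thickness, w))
    y_min = int(max(min(y_a, y_b) - thickness, 0))
    y_max = int(min(max(y_a, y_b) + thickness, h))
    for y in range(y_min, y_max):
        for x in range(x_min, x_max):
            d = abs((x - x_a) * y_ba - (y - y_a) * x_ba)  # distance to the limb's line
            if d <= thickness:
                paf_x[y][x] = x_ba  # plain assignment: any earlier vector is lost
                paf_y[y][x] = y_ba


size = 9
paf_x = [[0.0] * size for _ in range(size)]
paf_y = [[0.0] * size for _ in range(size)]

set_paf(paf_x, paf_y, a=(1, 4), b=(7, 4), thickness=1)  # person 1: horizontal limb
set_paf(paf_x, paf_y, a=(4, 1), b=(4, 7), thickness=1)  # person 2: vertical limb, written later

crossing = (paf_x[4][4], paf_y[4][4])    # cell where the two limbs cross
untouched = (paf_x[4][6], paf_y[4][6])   # cell covered by person 1 only
print(crossing, untouched)
```

At the crossing cell only the vertical vector of the second limb survives, while cells covered by the first limb alone still hold its horizontal vector.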

4.10 extract_keypoints and group_keypoints

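In extract_keypoints, every detected keypoint is given an index, so that all keypoint indices are distinct. group_keypoints stores this index in the corresponding slot of pose_entries, so that no keypoint is assigned to two people, as shown in figures (a) and (b) below.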

（a）

（b）
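The index assignment can be sketched as follows. This is a simplified stand-in for extract_keypoints, written with plain lists: it keeps the 0.1 threshold and the strict four-neighbour peak test, but leaves out the suppression of peaks closer than 6 pixels. The names find_peaks and extract_all and the two tiny heatmaps are assumptions made for the example.

```python
def find_peaks(heatmap, threshold=0.1):
    """Return (x, y, score) for cells strictly greater than their four neighbours."""
    h, w = len(heatmap), len(heatmap[0])

    def value(y, x):
        # Out-of-range neighbours and sub-threshold cells count as 0 (like zero padding).
        if 0 <= y < h and 0 <= x < w and heatmap[y][x] >= threshold:
            return heatmap[y][x]
        return 0.0

    peaks = []
    for y in range(h):
        for x in range(w):
            v = value(y, x)
            neighbours = (value(y, x - 1), value(y, x + 1), value(y - 1, x), value(y + 1, x))
            if v > max(neighbours):
                peaks.append((x, y, v))
    return sorted(peaks)  # ordered by x, then y


def extract_all(heatmaps):
    """Give every peak a global index that is unique across all keypoint types."""
    all_keypoints_by_type = []
    total = 0
    for heatmap in heatmaps:
        entries = [(x, y, score, total + i) for i, (x, y, score) in enumerate(find_peaks(heatmap))]
        all_keypoints_by_type.append(entries)
        total += len(entries)
    return all_keypoints_by_type


def blank(h, w):
    return [[0.0] * w for _ in range(h)]


first_type = blank(5, 5)   # e.g. two people, so two peaks of the first keypoint type
first_type[1][1] = 0.9
first_type[3][3] = 0.8
second_type = blank(5, 5)  # only one peak found for the second keypoint type
second_type[2][2] = 0.7

result = extract_all([first_type, second_type])
print(result)
```

The last element of each tuple is the global index. Because it keeps counting across keypoint types, a pose entry that stores it identifies exactly one detected keypoint.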

keypoints.py is as follows:

# Limb (PAF) order used in this file; it is unclear why the original order from coco.py is not reused.
BODY_PARTS_KPT_IDS = [[1, 2], [1, 5], [2, 3], [3, 4], [5, 6], [6, 7], [1, 8], [8, 9], [9, 10], [1, 11],
                      [11, 12], [12, 13], [1, 0], [0, 14], [14, 16], [0, 15], [15, 17], [2, 16], [5, 17]]
# For each limb above, the indices of its x and y channels in the original PAF layout (coco.py)
BODY_PARTS_PAF_IDS = ([12, 13], [20, 21], [14, 15], [16, 17], [22, 23], [24, 25], [0, 1], [2, 3], [4, 5], [6, 7],
                      [8, 9], [10, 11], [28, 29], [30, 31], [34, 35], [32, 33], [36, 37], [18, 19], [26, 27])


def linspace2d(start, stop, n=10):
    points = 1 / (n - 1) * (stop - start)  # step between consecutive samples; n points in total, both endpoints included
    return points[:, None] * np.arange(n) + start[:, None]


def extract_keypoints(heatmap, all_keypoints, total_keypoint_num):
    heatmap[heatmap < 0.1] = 0  # zero out responses below the threshold
    heatmap_with_borders = np.pad(heatmap, [(2, 2), (2, 2)], mode='constant')  # pad 2 pixels on every side
    heatmap_center = heatmap_with_borders[1:heatmap_with_borders.shape[0]-1, 1:heatmap_with_borders.shape[1]-1]  # centre view, 1 pixel larger than the heatmap on every side
    heatmap_left = heatmap_with_borders[1:heatmap_with_borders.shape[0]-1, 2:heatmap_with_borders.shape[1]]  # despite the name, holds each pixel's right-hand neighbour
    heatmap_right = heatmap_with_borders[1:heatmap_with_borders.shape[0]-1, 0:heatmap_with_borders.shape[1]-2]  # despite the name, holds each pixel's left-hand neighbour
    heatmap_up = heatmap_with_borders[2:heatmap_with_borders.shape[0], 1:heatmap_with_borders.shape[1]-1]  # despite the name, holds the neighbour below
    heatmap_down = heatmap_with_borders[0:heatmap_with_borders.shape[0]-2, 1:heatmap_with_borders.shape[1]-1]  # despite the name, holds the neighbour above

    heatmap_peaks = (heatmap_center > heatmap_left) & (heatmap_center > heatmap_right) &\
                    (heatmap_center > heatmap_up) & (heatmap_center > heatmap_down)  # a peak is strictly greater than its four neighbours
    heatmap_peaks = heatmap_peaks[1:heatmap_center.shape[0]-1, 1:heatmap_center.shape[1]-1]  # crop back to the original heatmap size
    keypoints = list(zip(np.nonzero(heatmap_peaks)[1], np.nonzero(heatmap_peaks)[0]))  # (w, h): peak (x, y) coordinates; np.nonzero returns (row indices, column indices), so [0] is y and [1] is x
    keypoints = sorted(keypoints, key=itemgetter(0))  # sort by x, ascending

    suppressed = np.zeros(len(keypoints), np.uint8)  # flag: keypoint i has been suppressed
    keypoints_with_score_and_id = []
    keypoint_num = 0
    for i in range(len(keypoints)):
        if suppressed[i]:
            continue
        for j in range(i+1, len(keypoints)):  # compare point i with every later point j; suppress j if their Euclidean distance is below 6
            if math.sqrt((keypoints[i][0] - keypoints[j][0]) ** 2 + (keypoints[i][1] - keypoints[j][1]) ** 2) < 6:
                suppressed[j] = 1
        keypoint_with_score_and_id = (keypoints[i][0], keypoints[i][1], heatmap[keypoints[i][1], keypoints[i][0]], total_keypoint_num + keypoint_num)
        keypoints_with_score_and_id.append(keypoint_with_score_and_id)  # (x, y, heatmap score, global index among all keypoints)
        keypoint_num += 1  # one more keypoint
    all_keypoints.append(keypoints_with_score_and_id)  # append the keypoints found on this heatmap to the overall list
    return keypoint_num  # number of keypoints found on this heatmap


def group_keypoints(all_keypoints_by_type, pafs, pose_entry_size=20, min_paf_score=0.05, demo=False):
    pose_entries = []
    all_keypoints = np.array([item for sublist in all_keypoints_by_type for item in sublist])  # flatten all keypoints into an N*4 array
    for part_id in range(len(BODY_PARTS_PAF_IDS)):  # for each limb, pick the matching PAF channels
        part_pafs = pafs[:, :, BODY_PARTS_PAF_IDS[part_id]]  # 2-channel (x, y) PAF of the current limb
        kpts_a = all_keypoints_by_type[BODY_PARTS_KPT_IDS[part_id][0]]  # all candidate start keypoints; BODY_PARTS_KPT_IDS maps each limb to a pair of keypoint types
        kpts_b = all_keypoints_by_type[BODY_PARTS_KPT_IDS[part_id][1]]  # all candidate end keypoints; kpts_a and kpts_b are lists of 4-tuples and may be empty
        num_kpts_a = len(kpts_a)  # number of start candidates
        num_kpts_b = len(kpts_b)  # number of end candidates
        kpt_a_id = BODY_PARTS_KPT_IDS[part_id][0]  # keypoint type of the start
        kpt_b_id = BODY_PARTS_KPT_IDS[part_id][1]  # keypoint type of the end

        if num_kpts_a == 0 and num_kpts_b == 0:  # no keypoints for such body part
            continue
        elif num_kpts_a == 0:  # body part has just 'b' keypoints
            for i in range(num_kpts_b):  # visit every end keypoint
                num = 0
                for j in range(len(pose_entries)):  # check if already in some pose, was added by another body part
                    if pose_entries[j][kpt_b_id] == kpts_b[i][3]:  # this end keypoint already belongs to some person
                        num += 1
                        continue  # effectively a no-op: moves on to the next j, it does not leave the loop
                if num == 0:  # not assigned to anyone yet: create a new person
                    pose_entry = np.ones(pose_entry_size) * -1
                    pose_entry[kpt_b_id] = kpts_b[i][3]  # keypoint idx
                    pose_entry[-1] = 1                   # num keypoints in pose
                    pose_entry[-2] = kpts_b[i][2]        # pose score
                    pose_entries.append(pose_entry)
            continue
        elif num_kpts_b == 0:  # body part has just 'a' keypoints
            for i in range(num_kpts_a):  # visit every start keypoint
                num = 0
                for j in range(len(pose_entries)):  # compare with every person found so far
                    if pose_entries[j][kpt_a_id] == kpts_a[i][3]:  # this start keypoint already belongs to some person
                        num += 1
                        continue  # effectively a no-op: moves on to the next j, it does not leave the loop
                if num == 0:  # not assigned to anyone yet: create a new person
                    pose_entry = np.ones(pose_entry_size) * -1
                    pose_entry[kpt_a_id] = kpts_a[i][3]
                    pose_entry[-1] = 1
                    pose_entry[-2] = kpts_a[i][2]
                    pose_entries.append(pose_entry)
            continue

        connections = []                             # candidate connections; this limb has both start and end keypoints
        for i in range(num_kpts_a):                  # visit every start candidate
            kpt_a = np.array(kpts_a[i][0:2])         # coordinates of the current start candidate
            for j in range(num_kpts_b):              # visit every end candidate
                kpt_b = np.array(kpts_b[j][0:2])     # coordinates of the current end candidate
                mid_point = [(), ()]
                mid_point[0] = (int(round((kpt_a[0] + kpt_b[0]) * 0.5)), int(round((kpt_a[1] + kpt_b[1]) * 0.5)))
                mid_point[1] = mid_point[0]  # midpoint between start and end

                vec = [kpt_b[0] - kpt_a[0], kpt_b[1] - kpt_a[1]]  # vector from start to end (normalised below)
                vec_norm = math.sqrt(vec[0] ** 2 + vec[1] ** 2)
                if vec_norm == 0:
                    continue
                vec[0] /= vec_norm
                vec[1] /= vec_norm
                cur_point_score = (vec[0] * part_pafs[mid_point[0][1], mid_point[0][0], 0] +  # part_pafs is indexed [y, x, channel]; channel 0/1 is the PAF x/y component
                                   vec[1] * part_pafs[mid_point[1][1], mid_point[1][0], 1])   # nx*x + ny*y: projection of the PAF onto the unit vector

                height_n = pafs.shape[0] // 2
                success_ratio = 0
                point_num = 10  # number of points to integration over paf
                if cur_point_score > -100:
                    passed_point_score = 0
                    passed_point_num = 0
                    x, y = linspace2d(kpt_a, kpt_b)  # interpolate between start and end, giving point_num points
                    for point_idx in range(point_num):
                        if not demo:
                            px = int(round(x[point_idx]))  # round the coordinates
                            py = int(round(y[point_idx]))
                        else:
                            px = int(x[point_idx])      # truncate the coordinates
                            py = int(y[point_idx])
                        paf = part_pafs[py, px, 0:2]  # PAF (x, y) vector at the sample point
                        cur_point_score = vec[0] * paf[0] + vec[1] * paf[1]  # its projection onto the start->end unit vector
                        if cur_point_score > min_paf_score:  # projection above the threshold
                            passed_point_score += cur_point_score  # accumulate the score of passing samples
                            passed_point_num += 1                  # count passing samples
                    success_ratio = passed_point_num / point_num  # fraction of samples above the threshold
                    ratio = 0
                    if passed_point_num > 0:
                        ratio = passed_point_score / passed_point_num  # mean PAF score over the passing samples
                    ratio += min(height_n / vec_norm - 1, 0)  # penalise long limbs: the term is negative when vec_norm > height_n
                if ratio > 0 and success_ratio > 0.8:  # mean PAF score is positive and more than 80% of the samples passed
                    score_all = ratio + kpts_a[i][2] + kpts_b[j][2]  # PAF score + start heatmap score + end heatmap score
                    connections.append([i, j, ratio, score_all])  # [start index among this type's candidates, end index among this type's candidates, mean PAF score, total score]
        if len(connections) > 0:
            connections = sorted(connections, key=itemgetter(2), reverse=True)  # sort by mean PAF score, descending

        num_connections = min(num_kpts_a, num_kpts_b)  # upper bound on the number of this limb in the image
        has_kpt_a = np.zeros(num_kpts_a, dtype=np.int32)  # flags: start candidate already used
        has_kpt_b = np.zeros(num_kpts_b, dtype=np.int32)  # flags: end candidate already used
        filtered_connections = []   # kept connections: [global index of start, global index of end, mean PAF score]
        for row in range(len(connections)):
            if len(filtered_connections) == num_connections:  # upper bound reached, stop
                break
            i, j, cur_point_score = connections[row][0:3]  # local start index, local end index, mean PAF score
            if not has_kpt_a[i] and not has_kpt_b[j]:  # both unused (connections are sorted by PAF score, so a weaker one that reuses a keypoint is dropped)
                filtered_connections.append([kpts_a[i][3], kpts_b[j][3], cur_point_score])  # [global index of start, global index of end, mean PAF score]
                has_kpt_a[i] = 1  # mark the start as used
                has_kpt_b[j] = 1  # mark the end as used
        connections = filtered_connections  # keep the filtered list; score_all is never actually used
        if len(connections) == 0:  # no connection for this limb, go on to the next one
            continue

        if part_id == 0:  # first limb
            pose_entries = [np.ones(pose_entry_size) * -1 for _ in range(len(connections))]  # first 18 entries: global index of each keypoint of this person; last two: total score and number of assigned keypoints
            for i in range(len(connections)):  # visit every connection found for this limb
                pose_entries[i][BODY_PARTS_KPT_IDS[0][0]] = connections[i][0]  # global index of the start
                pose_entries[i][BODY_PARTS_KPT_IDS[0][1]] = connections[i][1]  # global index of the end
                pose_entries[i][-1] = 2  # number of keypoints of this person
                pose_entries[i][-2] = np.sum(all_keypoints[connections[i][0:2], 2]) + connections[i][2]  # two heatmap scores + mean PAF score
        elif part_id == 17 or part_id == 18:  # last two limbs
            kpt_a_id = BODY_PARTS_KPT_IDS[part_id][0]   # keypoint type of the start
            kpt_b_id = BODY_PARTS_KPT_IDS[part_id][1]   # keypoint type of the end
            for i in range(len(connections)):  # compare each connection of this limb ...
                for j in range(len(pose_entries)):   # ... with every person found so far
                    if pose_entries[j][kpt_a_id] == connections[i][0] and pose_entries[j][kpt_b_id] == -1:  # start matches this person and the person's end slot is empty
                        pose_entries[j][kpt_b_id] = connections[i][1]  # assign the end to this person
                    elif pose_entries[j][kpt_b_id] == connections[i][1] and pose_entries[j][kpt_a_id] == -1:  # end matches this person and the person's start slot is empty
                        pose_entries[j][kpt_a_id] = connections[i][0]  # assign the start to this person
            continue
        else:
            kpt_a_id = BODY_PARTS_KPT_IDS[part_id][0]  # keypoint type of the start
            kpt_b_id = BODY_PARTS_KPT_IDS[part_id][1]  # keypoint type of the end
            for i in range(len(connections)):  # compare each connection of this limb ...
                num = 0
                for j in range(len(pose_entries)):   # ... with every person found so far
                    if pose_entries[j][kpt_a_id] == connections[i][0]:  # start matches this person
                        pose_entries[j][kpt_b_id] = connections[i][1]  # assign the end to this person
                        num += 1  # one more matching person
                        pose_entries[j][-1] += 1  # one more keypoint for this person
                        pose_entries[j][-2] += all_keypoints[connections[i][1], 2] + connections[i][2]  # add the end heatmap score and the PAF score
                if num == 0:  # no matching person: create a new one
                    pose_entry = np.ones(pose_entry_size) * -1
                    pose_entry[kpt_a_id] = connections[i][0]
                    pose_entry[kpt_b_id] = connections[i][1]
                    pose_entry[-1] = 2
                    pose_entry[-2] = np.sum(all_keypoints[connections[i][0:2], 2]) + connections[i][2]
                    pose_entries.append(pose_entry)

    filtered_entries = []
    for i in range(len(pose_entries)):  # visit every person
        if pose_entries[i][-1] < 3 or (pose_entries[i][-2] / pose_entries[i][-1] < 0.2):  # drop people with fewer than 3 keypoints or a mean score below 0.2
            continue
        filtered_entries.append(pose_entries[i])
    pose_entries = np.asarray(filtered_entries)
    return pose_entries, all_keypoints  # people (first 18 entries: global keypoint indices; last two: score and keypoint count) and all keypoints

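The core of group_keypoints is scoring a candidate start/end pair by sampling the PAF along the segment between them, then keeping the best non-conflicting pairs. Below is a reduced sketch of those two steps with plain lists. The names connection_score and greedy_match and the toy 10x10 field are assumptions; the length penalty and the demo-mode truncation from the listing are left out.

```python
import math


def connection_score(paf_x, paf_y, kpt_a, kpt_b, point_num=10, min_paf_score=0.05):
    """Return (mean PAF agreement of passing samples, fraction of samples that pass)."""
    vx, vy = kpt_b[0] - kpt_a[0], kpt_b[1] - kpt_a[1]
    norm = math.hypot(vx, vy)
    if norm == 0:
        return 0.0, 0.0
    vx, vy = vx / norm, vy / norm
    passed_score, passed_num = 0.0, 0
    for k in range(point_num):
        t = k / (point_num - 1)  # 0 at the start keypoint, 1 at the end keypoint
        px = int(round(kpt_a[0] + t * (kpt_b[0] - kpt_a[0])))
        py = int(round(kpt_a[1] + t * (kpt_b[1] - kpt_a[1])))
        score = vx * paf_x[py][px] + vy * paf_y[py][px]  # projection of the PAF onto the limb direction
        if score > min_paf_score:
            passed_score += score
            passed_num += 1
    mean = passed_score / passed_num if passed_num else 0.0
    return mean, passed_num / point_num


def greedy_match(connections, num_a, num_b):
    """Keep the best-scoring (i, j, score) pairs without reusing any endpoint."""
    used_a, used_b, kept = set(), set(), []
    for i, j, score in sorted(connections, key=lambda c: c[2], reverse=True):
        if len(kept) == min(num_a, num_b):
            break
        if i not in used_a and j not in used_b:
            kept.append((i, j, score))
            used_a.add(i)
            used_b.add(j)
    return kept


size = 10
# Toy field: a horizontal limb along row 5, PAF pointing in +x.
paf_x = [[1.0 if y == 5 else 0.0 for _ in range(size)] for y in range(size)]
paf_y = [[0.0] * size for _ in range(size)]

good = connection_score(paf_x, paf_y, (1, 5), (8, 5))  # both keypoints lie on the limb
bad = connection_score(paf_x, paf_y, (1, 5), (8, 1))   # end keypoint belongs elsewhere
kept = greedy_match([(0, 0, 0.9), (0, 1, 0.8), (1, 1, 0.7), (1, 0, 0.3)], num_a=2, num_b=2)
print(good, bad, kept)
```

The correct pair passes on all ten samples, whereas the wrong pair passes on only two of ten and would be rejected by the success_ratio > 0.8 check in the listing.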

4.11 demo

The two functions used in the demo are as follows:

def infer_fast(net, img, net_input_height_size, stride, upsample_ratio, cpu,
               pad_value=(0, 0, 0), img_mean=(128, 128, 128), img_scale=1/256):
    height, width, _ = img.shape   # actual height and width
    scale = net_input_height_size / height   # factor that scales the actual height to the desired input height

    scaled_img = cv2.resize(img, (0, 0), fx=scale, fy=scale, interpolation=cv2.INTER_CUBIC)  # scaled image
    scaled_img = normalize(scaled_img, img_mean, img_scale)  # normalised image
    min_dims = [net_input_height_size, max(scaled_img.shape[1], net_input_height_size)]
    padded_img, pad = pad_width(scaled_img, stride, pad_value, min_dims)  # pad so that height and width are multiples of stride

    tensor_img = torch.from_numpy(padded_img).permute(2, 0, 1).unsqueeze(0).float()   # HWC -> CHW (still BGR)
    if not cpu:
        tensor_img = tensor_img.cuda()

    stages_output = net(tensor_img)  # network output

    stage2_heatmaps = stages_output[-2]  # heatmaps of the last stage
    heatmaps = np.transpose(stage2_heatmaps.squeeze().cpu().data.numpy(), (1, 2, 0))  # used as the final heatmaps
    heatmaps = cv2.resize(heatmaps, (0, 0), fx=upsample_ratio, fy=upsample_ratio, interpolation=cv2.INTER_CUBIC)  # upsample the heatmaps by upsample_ratio

    stage2_pafs = stages_output[-1]  # PAFs of the last stage
    pafs = np.transpose(stage2_pafs.squeeze().cpu().data.numpy(), (1, 2, 0))   # used as the final PAFs
    pafs = cv2.resize(pafs, (0, 0), fx=upsample_ratio, fy=upsample_ratio, interpolation=cv2.INTER_CUBIC)  # upsample the PAFs by upsample_ratio

    return heatmaps, pafs, scale, pad  # heatmaps, PAFs, scale of the network input relative to the original image, padding of the network input


def run_demo(net, image_provider, height_size, cpu):
    net = net.eval()
    if not cpu:
        net = net.cuda()

    stride = 8
    upsample_ratio = 4
    color = [0, 224, 255]
    for img in image_provider:
        orig_img = img.copy()
        heatmaps, pafs, scale, pad = infer_fast(net, img, height_size, stride, upsample_ratio, cpu)  # heatmaps, PAFs, input scale, input padding

        total_keypoints_num = 0
        all_keypoints_by_type = []  # 18 lists, one per keypoint type, each holding N_i tuples (x, y, heatmap score, global index)
        for kpt_idx in range(18):  # 19th for bg: only the first 18 keypoint channels are used
            total_keypoints_num += extract_keypoints(heatmaps[:, :, kpt_idx], all_keypoints_by_type, total_keypoints_num)

        pose_entries, all_keypoints = group_keypoints(all_keypoints_by_type, pafs, demo=True)  # people (first 18 entries: global keypoint indices; last two: score and keypoint count) and all keypoints
        for kpt_id in range(all_keypoints.shape[0]):  # map every keypoint back to the original image
            all_keypoints[kpt_id, 0] = (all_keypoints[kpt_id, 0] * stride / upsample_ratio - pad[1]) / scale
            all_keypoints[kpt_id, 1] = (all_keypoints[kpt_id, 1] * stride / upsample_ratio - pad[0]) / scale
        for n in range(len(pose_entries)):  # visit every detected person
            if len(pose_entries[n]) == 0:
                continue
            for part_id in range(len(BODY_PARTS_PAF_IDS) - 2):  # visit every limb except the last two
                kpt_a_id = BODY_PARTS_KPT_IDS[part_id][0]   # keypoint type of the limb's start
                global_kpt_a_id = pose_entries[n][kpt_a_id]  # global index of that keypoint
                if global_kpt_a_id != -1:  # the keypoint was assigned
                    x_a, y_a = all_keypoints[int(global_kpt_a_id), 0:2]  # its coordinates in the original image
                    cv2.circle(img, (int(x_a), int(y_a)), 3, color, -1)  # draw a dot
                kpt_b_id = BODY_PARTS_KPT_IDS[part_id][1]   # keypoint type of the limb's end
                global_kpt_b_id = pose_entries[n][kpt_b_id]  # global index of that keypoint
                if global_kpt_b_id != -1:  # the keypoint was assigned
                    x_b, y_b = all_keypoints[int(global_kpt_b_id), 0:2]  # its coordinates in the original image
                    cv2.circle(img, (int(x_b), int(y_b)), 3, color, -1)  # draw a dot
                if global_kpt_a_id != -1 and global_kpt_b_id != -1:  # both start and end were assigned
                    cv2.line(img, (int(x_a), int(y_a)), (int(x_b), int(y_b)), color, 2)  # draw the line joining start and end

        img = cv2.addWeighted(orig_img, 0.6, img, 0.4, 0)  # 0.6 * orig_img + 0.4 * img
        cv2.imwrite('res.jpg', img)

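The coordinate mapping in run_demo is easy to get wrong, so here is the same arithmetic as a standalone sketch. The function name to_original_coords and the numbers are illustrative; the assumption that pad is ordered (top, left, bottom, right) is inferred from the listing using pad[1] for x and pad[0] for y.

```python
def to_original_coords(x, y, stride, upsample_ratio, pad, scale):
    """Map a point on the upsampled heatmap back to the original image."""
    # The network output is 1/stride of the network input and is then upsampled by
    # upsample_ratio, so one heatmap pixel covers stride / upsample_ratio input pixels.
    input_x = x * stride / upsample_ratio
    input_y = y * stride / upsample_ratio
    # Remove the padding (assumed order: top, left, bottom, right), then undo the resize.
    return (input_x - pad[1]) / scale, (input_y - pad[0]) / scale


# Illustrative values: the original image was halved and padded by 4 px left and right.
mapped = to_original_coords(100, 60, stride=8, upsample_ratio=4, pad=(0, 4, 0, 4), scale=0.5)
print(mapped)
```

With stride 8 and upsample_ratio 4, each heatmap pixel corresponds to 2 pixels of the padded network input, which is why keypoint positions are only accurate to a couple of input pixels.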

4.12 Left-right mirroring

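The mirroring here is test-time left-right mirroring; do not confuse it with the training-time Flip in 4.7.5. At test time the keypoint heatmaps and PAFs have already been computed, so if the image is mirrored, the heatmap and PAF channels must be remapped, as listed in the mapping table of the original post. In addition, the x component of each PAF must be negated, because a PAF is a vector pointing from the start keypoint to the end keypoint: after mirroring, the y component of that vector is unchanged while the x component changes sign.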
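The remapping can be sketched on a toy example with plain lists. Everything here is hypothetical: the two-channel layout, the swap pair (0, 1) and the function names flip_heatmaps and flip_pafs are invented for illustration, and the real channel mapping has to be taken from the table in the original post.

```python
def flip_heatmaps(heatmaps, swap_pairs):
    """Mirror every channel horizontally, then swap left/right channels."""
    flipped = [[row[::-1] for row in channel] for channel in heatmaps]
    for left, right in swap_pairs:
        flipped[left], flipped[right] = flipped[right], flipped[left]
    return flipped


def flip_pafs(pafs, swap_pairs):
    """pafs is a list of (x_channel, y_channel) per limb.

    Mirror both channels, negate the x component, then swap left/right limbs.
    """
    flipped = []
    for x_channel, y_channel in pafs:
        new_x = [[-v for v in row[::-1]] for row in x_channel]  # x component changes sign
        new_y = [row[::-1] for row in y_channel]                # y component is unchanged
        flipped.append((new_x, new_y))
    for left, right in swap_pairs:
        flipped[left], flipped[right] = flipped[right], flipped[left]
    return flipped


# Network outputs computed on the mirrored image (1 row, 3 columns per channel).
mirrored_heatmaps = [
    [[0.9, 0.0, 0.0]],  # channel 0: hypothetical "left wrist"
    [[0.0, 0.0, 0.8]],  # channel 1: hypothetical "right wrist"
]
mirrored_pafs = [
    ([[1.0, 0.0, 0.0]], [[0.5, 0.0, 0.0]]),  # limb 0: hypothetical "left forearm" (x, y)
    ([[0.0, 0.0, 0.0]], [[0.0, 0.0, 0.0]]),  # limb 1: hypothetical "right forearm"
]

heatmaps = flip_heatmaps(mirrored_heatmaps, swap_pairs=[(0, 1)])
pafs = flip_pafs(mirrored_pafs, swap_pairs=[(0, 1)])
```

After the remapping, the outputs are aligned with the unmirrored image and can, for example, be averaged with the outputs of the original image.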

【Kafka十二】关于Kafka是一个Commit Log Service bit1129 service
Kafka is a distributed, partitioned, replicated commit log service.这里的commit log如何理解？ A message is considered "committed" when all in sync replicas for that partition have applied i
NGINX + LUA实现复杂的控制 ronin47 lua nginx 控制
安装lua_nginx_module 模块 lua_nginx_module 可以一步步的安装，也可以直接用淘宝的OpenResty Centos和debian的安装就简单了。。这里说下freebsd的安装： fetch http://www.lua.org/ftp/lua-5.1.4.tar.gz tar zxvf lua-5.1.4.tar.gz cd lua-5.1.4 ma
java-14.输入一个已经按升序排序过的数组和一个数字，在数组中查找两个数，使得它们的和正好是输入的那个数字 bylijinnan java
public class TwoElementEqualSum { /** * 第 14 题：题目：输入一个已经按升序排序过的数组和一个数字，在数组中查找两个数，使得它们的和正好是输入的那个数字。要求时间复杂度是 O(n) 。如果有多对数字的和等于输入的数字，输出任意一对即可。例如输入数组 1 、 2 、 4 、 7 、 11 、 15 和数字 15 。由于
Netty源码学习-HttpChunkAggregator-HttpRequestEncoder-HttpResponseDecoder bylijinnan java netty
今天看Netty如何实现一个Http Server org.jboss.netty.example.http.file.HttpStaticFileServerPipelineFactory： pipeline.addLast("decoder", new HttpRequestDecoder()); pipeline.addLast(&quo
java敏感词过虑-基于多叉树原理 cngolon 违禁词过虑替换违禁词敏感词过虑多叉树
基于多叉树的敏感词、关键词过滤的工具包，用于java中的敏感词过滤 1、工具包自带敏感词词库，第一次调用时读入词库，故第一次调用时间可能较长，在类加载后普通pc机上html过滤5000字在80毫秒左右，纯文本35毫秒左右。 2、如需自定义词库，将jar包考入WEB-INF工程的lib目录，在WEB-INF/classes目录下建一个 utf-8的words.dict文本文件，
多线程知识 cuishikuan 多线程
T1，T2，T3三个线程工作顺序，按照T1，T2，T3依次进行 public class T1 implements Runnable{ @Override
spring整合activemq dalan_123 java spring jms
整合spring和activemq需要搞清楚如下的东东1、ConnectionFactory分： a、spring管理连接到activemq服务器的管理ConnectionFactory也即是所谓产生到jms服务器的链接 b、真正产生到JMS服务器链接的ConnectionFactory还得
MySQL时间字段究竟使用INT还是DateTime？ dcj3sjt126com mysql
环境：Windows XPPHP Version 5.2.9MySQL Server 5.1 第一步、创建一个表date_test（非定长、int时间） CREATE TABLE `test`.`date_test` (`id` INT NOT NULL AUTO_INCREMENT ,`start_time` INT NOT NULL ,`some_content`
Parcel: unable to marshal value dcj3sjt126com marshal
在两个activity直接传递List<xxInfo>时，出现Parcel: unable to marshal value异常。在MainActivity页面（MainActivity页面向NextActivity页面传递一个List<xxInfo>）： Intent intent = new Intent(this, Next
linux进程的查看上（ps） eksliang linux ps linux ps -l linux ps aux
ps:将某个时间点的进程运行情况选取下来转载请出自出处：http://eksliang.iteye.com/admin/blogs/2119469 http://eksliang.iteye.com ps 这个命令的man page 不是很好查阅，因为很多不同的Unix都使用这儿ps来查阅进程的状态，为了要符合不同版本的需求，所以这个
为什么第三方应用能早于System的app启动 gqdy365 System
Android应用的启动顺序网上有一大堆资料可以查阅了，这里就不细述了，这里不阐述ROM启动还有bootloader，软件启动的大致流程应该是启动kernel -> 运行servicemanager 把一些native的服务用命令启动起来（包括wifi, power, rild, surfaceflinger, mediaserver等等）-> 启动Dalivk中的第一个进程Zygot
App Framework发送JSONP请求(3) hw1287789687 jsonp 跨域请求发送jsonp ajax请求越狱请求
App Framework 中如何发送JSONP请求呢? 使用jsonp,详情请参考:http://json-p.org/ 如何发送Ajax请求呢? (1)登录 /*** * 会员登录 * @param username * @param password */ var user_login=function(username,password){ // aler
发福利，整理了一份关于“资源汇总”的汇总 justjavac 资源
觉得有用的话，可以去github关注：https://github.com/justjavac/awesome-awesomeness-zh_CN 通用 free-programming-books-zh_CN 免费的计算机编程类中文书籍精彩博客集合 hacke2/hacke2.github.io#2 ResumeSample 程序员简历
用 Java 技术创建 RESTful Web 服务 macroli java 编程 Web REST
转载：http://www.ibm.com/developerworks/cn/web/wa-jaxrs/ JAX-RS (JSR-311) 【 Java API for RESTful Web Services 】是一种 Java™ API，可使 Java Restful 服务的开发变得迅速而轻松。这个 API 提供了一种基于注释的模型来描述分布式资源。注释被用来提供资源的位
CentOS6.5-x86_64位下oracle11g的安装详细步骤及注意事项超声波 oracle linux
前言：这两天项目要上线了，由我负责往服务器部署整个项目，因此首先要往服务器安装oracle，服务器本身是CentOS6.5的64位系统，安装的数据库版本是11g，在整个的安装过程中碰到很多的坑，不过最后还是通过各种途径解决并成功装上了。转别写篇博客来记录完整的安装过程以及在整个过程中的注意事项。希望对以后那些刚刚接触的菜鸟们能起到一定的帮助作用。安装过程中可能遇到的问题（注
HttpClient 4.3 设置keeplive 和 timeout 的方法 supben httpclient
ConnectionKeepAliveStrategy kaStrategy = new DefaultConnectionKeepAliveStrategy() { @Override public long getKeepAliveDuration(HttpResponse response, HttpContext context) { long keepAlive
Spring 4.2新特性-@Import注解的升级 wiselyman spring 4
3.1 @Import @Import注解在4.2之前只支持导入配置类在4.2,@Import注解支持导入普通的java类,并将其声明成一个bean 3.2 示例演示java类 package com.wisely.spring4_2.imp; public class DemoService { public void doSomethin