This is the overall network of SA-SSD, which is composed of the following parts.
Each part will be analyzed in detail later; first, let's look at the network as a whole (just which functions it has, with the function bodies omitted for now):
class SingleStageDetector(BaseDetector, RPNTestMixin, BBoxTestMixin,
                          MaskTestMixin):

    def __init__(self,
                 backbone,
                 neck=None,
                 bbox_head=None,
                 extra_head=None,
                 train_cfg=None,
                 test_cfg=None,
                 pretrained=None):
        super(SingleStageDetector, self).__init__()
        self.backbone = builder.build_backbone(backbone)
        if neck is not None:
            self.neck = builder.build_neck(neck)
        else:
            raise NotImplementedError
        if bbox_head is not None:
            self.rpn_head = builder.build_single_stage_head(bbox_head)
        if extra_head is not None:
            self.extra_head = builder.build_single_stage_head(extra_head)
        self.train_cfg = train_cfg
        self.test_cfg = test_cfg
        self.init_weights(pretrained)

    @property
    def with_rpn(self):
        return hasattr(self, 'rpn_head') and self.rpn_head is not None

    def init_weights(self, pretrained=None):
        if isinstance(pretrained, str):
            logger = logging.getLogger()
            load_checkpoint(self, pretrained, strict=False, logger=logger)

    def merge_second_batch(self, batch_args):
        return ret

    def forward_train(self, img, img_meta, **kwargs):
        return losses

    def forward_test(self, img, img_meta, **kwargs):
        return results
Code analysis:
def __init__(self,
             backbone,
             neck=None,
             bbox_head=None,
             extra_head=None,
             train_cfg=None,
             test_cfg=None,
             pretrained=None):
    super(SingleStageDetector, self).__init__()
    # Initialize the backbone
    self.backbone = builder.build_backbone(backbone)
    # Initialize the neck
    if neck is not None:
        self.neck = builder.build_neck(neck)
    else:
        raise NotImplementedError
    # Initialize the head
    if bbox_head is not None:
        self.rpn_head = builder.build_single_stage_head(bbox_head)
    # Initialize the extra-head
    if extra_head is not None:
        self.extra_head = builder.build_single_stage_head(extra_head)
    # Store the parameters from the cfg
    self.train_cfg = train_cfg
    self.test_cfg = test_cfg
    # Initialize the weights
    self.init_weights(pretrained)
The initialization of each part works the same way: if you step into these builder functions, you will find that each component is built from the settings in the cfg file, and they all eventually land in this obj_from_dict function.
# Initializes an object of a class found under `parent` according to the dict `info`.
# Put simply, the dict stores the class's constructor arguments; the core call is getattr.
# In short, obj_from_dict is a utility function for building a specified object.
def obj_from_dict(info, parent=None, default_args=None):
    """Initialize an object from dict.

    The dict must contain the key "type", which indicates the object type, it
    can be either a string or type, such as "list" or ``list``. Remaining
    fields are treated as the arguments for constructing the object.

    Args:
        info (dict): Object types and arguments.
        parent (:class:`module`): Module which may contain expected object
            classes.
        default_args (dict, optional): Default arguments for initializing the
            object.

    Returns:
        any type: Object built from the dict.
    """
    # First, check that info is a dict and that it contains the key 'type';
    # default_args must also be a dict or None
    assert isinstance(info, dict) and 'type' in info
    assert isinstance(default_args, dict) or default_args is None
    args = info.copy()
    obj_type = args.pop('type')
    if mmcv.is_str(obj_type):
        if parent is not None:
            obj_type = getattr(parent, obj_type)
        else:
            obj_type = sys.modules[obj_type]
    elif not isinstance(obj_type, type):
        raise TypeError('type must be a str or valid type, but '
                        f'got {type(obj_type)}')
    if default_args is not None:
        for name, value in default_args.items():
            args.setdefault(name, value)
    return obj_type(**args)  # instantiate the class with the arguments in args
At first I couldn't make sense of this function, but on a closer look it simply locates the class to be initialized according to the cfg settings and then passes in the parameters from the cfg. For example:
neck=dict(
    type='SpMiddleFHD',
    output_shape=[40, 1600, 1408],
    num_input_features=4,
    num_hidden_features=64 * 5,
),
This is the neck configuration in the cfg file. First the class SpMiddleFHD is located via type='SpMiddleFHD', and then the class is instantiated with the remaining parameters from the cfg. At this point
return obj_type(**args)
is equivalent to:
return SpMiddleFHD(output_shape=[40, 1600, 1408], num_input_features=4, num_hidden_features=64 * 5)
OK, the initialization of the other parts follows the same pattern. The codebase is presumably built on top of mmdetection, and this is exactly how mmdetection does it. Once you understand it, you can use the same pattern in your own code; it is simple and clean.
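To make the pattern concrete, here is a minimal, self-contained toy version (the Dummy class and the config dict are hypothetical, purely for illustration):

import sys

class Dummy:
    def __init__(self, a, b=0):
        self.a, self.b = a, b

def obj_from_dict_sketch(info, parent=None, default_args=None):
    # Same idea as obj_from_dict: pop 'type', resolve the class, pass the rest
    args = dict(info)
    obj_type = args.pop('type')
    if isinstance(obj_type, str):
        obj_type = getattr(parent, obj_type) if parent is not None else sys.modules[obj_type]
    if default_args is not None:
        for name, value in default_args.items():
            args.setdefault(name, value)
    return obj_type(**args)

cfg = dict(type='Dummy', a=1)
obj = obj_from_dict_sketch(cfg, parent=sys.modules[__name__])
print(type(obj).__name__, obj.a, obj.b)  # Dummy 1 0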
Now let's look at the forward-pass function; the annotations are inline in the code:
# img.shape [B, 3, 384, 1248]
# img_meta: dict
# img_meta[0]:
# img_shape : tuple (375, 1242, 3)
# sample_idx
# calib
# kwargs:
# 1. anchors list: len(anchors) = B
# 2. voxels list: len(voxels) = B
# 3. coordinates list: len(coordinates) = B
# 4. num_points list: len(num_points) = B
# 5. anchor_mask list: len(anchor_mask) = B
# 6. gt_labels list: len(gt_labels) = B
# 7. gt_bboxes list: len(gt_bboxes) = B
def forward_train(self, img, img_meta, **kwargs):
    # --------------------------------------------------------------------------
    # from mmdet.datasets.kitti_utils import draw_lidar
    # f = draw_lidar(kwargs["voxels"][0].cpu().numpy(), show=True)  # visualize the whole point cloud
    # --------------------------------------------------------------------------
    batch_size = len(img_meta)  # B

    ret = self.merge_second_batch(kwargs)

    # vx is computed from ret['voxels']
    vx = self.backbone(ret['voxels'], ret['num_points'])

    # x.shape = [2, 256, 200, 176]
    # conv6.shape = [2, 256, 200, 176]
    # point_misc : tuple of 3
    #   1. point_mean : shape [N, 4], [:, 0] is the batch index
    #   2. point_cls  : shape [N, 1]
    #   3. point_reg  : shape [N, 3]
    (x, conv6), point_misc = self.neck(vx, ret['coordinates'], batch_size)

    losses = dict()

    aux_loss = self.neck.aux_loss(*point_misc, gt_bboxes=ret['gt_bboxes'])
    losses.update(aux_loss)

    # RPN forward and loss
    if self.with_rpn:
        # rpn_outs : tuple of 3
        #   1. box_preds     : shape [N, 200, 176, 14]
        #   2. cls_preds     : shape [N, 200, 176, 2]
        #   3. dir_cls_preds : shape [N, 200, 176, 4]
        rpn_outs = self.rpn_head(x)

        # rpn_loss_inputs : tuple of 8
        rpn_loss_inputs = rpn_outs + (ret['gt_bboxes'], ret['gt_labels'], ret['anchors'], ret['anchors_mask'], self.train_cfg.rpn)
        rpn_losses = self.rpn_head.loss(*rpn_loss_inputs)
        losses.update(rpn_losses)

        # guided_anchors.shape :
        #     [num_of_guided_anchors, 7]
        #   + [num_of_gt_bboxes, 7]
        #   ----------------------------
        #   = [all_num, 7]
        guided_anchors = self.rpn_head.get_guided_anchors(*rpn_outs, ret['anchors'], ret['anchors_mask'], ret['gt_bboxes'], thr=0.1)
    else:
        raise NotImplementedError

    # bbox head forward and loss
    if self.extra_head:
        bbox_score = self.extra_head(conv6, guided_anchors)
        refine_loss_inputs = (bbox_score, ret['gt_bboxes'], ret['gt_labels'], guided_anchors, self.train_cfg.extra)
        refine_losses = self.extra_head.loss(*refine_loss_inputs)
        losses.update(refine_losses)

    return losses
First, the incoming keyword arguments go through the merge_second_batch() function; let's take a look:
def merge_second_batch(self, batch_args):
    ret = {}
    for key, elems in batch_args.items():
        if key in [
                'voxels', 'num_points',
        ]:
            ret[key] = torch.cat(elems, dim=0)
        elif key == 'coordinates':
            coors = []
            for i, coor in enumerate(elems):  # coor.shape : torch.Size([19480, 3])
                coor_pad = F.pad(
                    coor, [1, 0, 0, 0],
                    mode='constant',
                    value=i)  # for F.pad see https://blog.csdn.net/jorg_zhao/article/details/105295686
                coors.append(coor_pad)
            ret[key] = torch.cat(coors, dim=0)
        elif key in [
                'img_meta', 'gt_labels', 'gt_bboxes',
        ]:
            ret[key] = elems
        else:
            ret[key] = torch.stack(elems, dim=0)
    return ret
It mainly merges the batch according to each key, which is straightforward. Note this one step:
coor_pad = F.pad(
    coor, [1, 0, 0, 0],
    mode='constant',
    value=i)
coors.append(coor_pad)
For the usage of F.pad, see the link in the comment above. The purpose is to prepend one extra column to coordinates (filled with i = 0, 1, …) that records which sample of the batch each voxel belongs to, as the small demo below shows.
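A minimal standalone demo of what this padding does (with a tiny made-up coordinate tensor):

import torch
import torch.nn.functional as F

coor = torch.tensor([[10, 20, 30],
                     [11, 21, 31]])  # [num_voxels, 3]
# [1, 0, 0, 0] pads one column on the left of the last dim and nothing elsewhere
coor_pad = F.pad(coor, [1, 0, 0, 0], mode='constant', value=2)  # batch index i = 2
print(coor_pad)
# tensor([[ 2, 10, 20, 30],
#         [ 2, 11, 21, 31]])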
Next comes building the loss, which consists of three parts in total:
loss_all = aux_loss + rpn_loss + extra_head_loss
(aux_loss is the auxiliary-network loss from the code above.) The detailed composition of each loss term is analyzed later.
In the Auxiliary Network, foreground points and background points need to be segmented, so the first step is to generate foreground/background labels for the points:
def pts_in_boxes3d(pts, boxes3d):
    N = len(pts)
    M = len(boxes3d)
    pts_in_flag = torch.IntTensor(M, N).fill_(0)
    reg_target = torch.FloatTensor(N, 3).fill_(0)
    points_op_cpu.pts_in_boxes3d(pts.contiguous(), boxes3d.contiguous(), pts_in_flag, reg_target)
    return pts_in_flag, reg_target
Where:
pts_in_flag : [M, N]; the mask is 1 where a point lies inside a bbox.
Question: reg_target : [N, 3]; what are its values, and how are they obtained?
To resolve this question we need to understand points_op_cpu.pts_in_boxes3d. This function lives in mmdet/ops/points_op/src/points_op.cpp; let's take a look:
int pts_in_boxes3d_cpu(at::Tensor pts, at::Tensor boxes3d, at::Tensor pts_flag, at::Tensor reg_target){
    // param pts: (N, 3)
    // param boxes3d: (M, 7) [x, y, z, h, w, l, ry]
    // param pts_flag: (M, N)
    // param reg_target: (N, 3), center offsets

    CHECK_CONTIGUOUS(pts_flag);
    CHECK_CONTIGUOUS(pts);
    CHECK_CONTIGUOUS(boxes3d);
    CHECK_CONTIGUOUS(reg_target);

    long boxes_num = boxes3d.size(0);
    long pts_num = pts.size(0);

    int * pts_flag_flat = pts_flag.data<int>();
    float * pts_flat = pts.data<float>();
    float * boxes3d_flat = boxes3d.data<float>();
    float * reg_target_flat = reg_target.data<float>();

    // memset(assign_idx_flat, -1, boxes_num * pts_num * sizeof(int));
    // memset(reg_target_flat, 0, pts_num * sizeof(float));

    // The tensors are traversed here as flattened (1-D) arrays
    int i, j, cur_in_flag;
    for (i = 0; i < boxes_num; i++){
        for (j = 0; j < pts_num; j++){
            cur_in_flag = pt_in_box3d_cpu(pts_flat[j * 3], pts_flat[j * 3 + 1], pts_flat[j * 3 + 2], boxes3d_flat[i * 7],
                                          boxes3d_flat[i * 7 + 1], boxes3d_flat[i * 7 + 2], boxes3d_flat[i * 7 + 3],
                                          boxes3d_flat[i * 7 + 4], boxes3d_flat[i * 7 + 5], boxes3d_flat[i * 7 + 6]);
            pts_flag_flat[i * pts_num + j] = cur_in_flag;
            if(cur_in_flag==1){
                reg_target_flat[j*3] = pts_flat[j*3] - boxes3d_flat[i*7];
                reg_target_flat[j*3+1] = pts_flat[j*3+1] - boxes3d_flat[i*7+1];
                reg_target_flat[j*3+2] = pts_flat[j*3+2] - (boxes3d_flat[i*7+2] + boxes3d_flat[i*7+3] / 2.0);
            }
        }
    }
    return 1;
}
It is now roughly clear what this function does: a double loop checks whether each point of the cloud lies inside any of the given bboxes; for a point inside a bbox, the point's coordinates minus the bbox center's coordinates give reg_target (for z, half the box height h is added first, since the stored z is the bottom center). As a formula:
reg_target = P_i(x, y, z) − P_center(x, y, z)
OK, the question above is resolved.
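As a cross-check, here is a pure-PyTorch sketch of the same logic (my own reimplementation, not the project's code; the rotation convention of pt_in_box3d_cpu is not shown above, so this assumes a yaw rotation about the z-axis and the (x, y, z, h, w, l, ry) layout with z at the bottom center):

import torch

def pts_in_boxes3d_torch(pts, boxes3d):
    # pts: [N, 3]; boxes3d: [M, 7] as (x, y, z, h, w, l, ry)
    N, M = pts.shape[0], boxes3d.shape[0]
    pts_in_flag = torch.zeros(M, N, dtype=torch.int32)
    reg_target = torch.zeros(N, 3)
    for i in range(M):
        x, y, z, h, w, l, ry = boxes3d[i]
        # rotate the points into the box's local frame
        c, s = torch.cos(-ry), torch.sin(-ry)
        local_x = (pts[:, 0] - x) * c - (pts[:, 1] - y) * s
        local_y = (pts[:, 0] - x) * s + (pts[:, 1] - y) * c
        in_box = ((local_x.abs() < l / 2) & (local_y.abs() < w / 2)
                  & (pts[:, 2] > z) & (pts[:, 2] < z + h))
        pts_in_flag[i] = in_box.int()
        # offset to the geometric center (bottom-center z shifted up by h / 2)
        center = torch.stack([x, y, z + h / 2])
        reg_target[in_box] = pts[in_box] - center
    return pts_in_flag, reg_target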
This part is the head of the whole network. First a brief outline, then a detailed analysis.
class SSDRotateHead(nn.Module):

    def __init__(self,
                 num_class=1,
                 num_output_filters=768,
                 num_anchor_per_loc=2,
                 use_sigmoid_cls=True,
                 encode_rad_error_by_sin=True,
                 use_direction_classifier=True,
                 box_coder='GroundBox3dCoder',
                 box_code_size=7,
                 ):
        super(SSDRotateHead, self).__init__()
        self._num_class = num_class
        self._num_anchor_per_loc = num_anchor_per_loc
        self._use_direction_classifier = use_direction_classifier
        self._use_sigmoid_cls = use_sigmoid_cls
        self._encode_rad_error_by_sin = encode_rad_error_by_sin
        self._use_direction_classifier = use_direction_classifier
        self._box_coder = getattr(boxCoders, box_coder)()
        self._box_code_size = box_code_size
        self._num_output_filters = num_output_filters
        if use_sigmoid_cls:  # True
            num_cls = num_anchor_per_loc * num_class  # 2 * 1
        else:
            num_cls = num_anchor_per_loc * (num_class + 1)
        self.conv_cls = nn.Conv2d(num_output_filters, num_cls, 1)
        self.conv_box = nn.Conv2d(
            num_output_filters, num_anchor_per_loc * box_code_size, 1)
        if use_direction_classifier:
            self.conv_dir_cls = nn.Conv2d(
                num_output_filters, num_anchor_per_loc * 2, 1)

    def add_sin_difference(self, boxes1, boxes2):

    def get_direction_target(self, anchors, reg_targets, use_one_hot=True):

    def prepare_loss_weights(self, labels,
                             pos_cls_weight=1.0,
                             neg_cls_weight=1.0,
                             loss_norm_type='NormByNumPositives',
                             dtype=torch.float32):

    def create_loss(self,
                    box_preds,     # torch.Size([2, 200, 176, 14])
                    cls_preds,     # torch.Size([2, 200, 176, 2])
                    cls_targets,   # torch.Size([2, 70400])
                    cls_weights,   # torch.Size([2, 70400])
                    reg_targets,   # torch.Size([2, 70400, 7])
                    reg_weights,   # torch.Size([2, 70400])
                    num_class,     # 1
                    use_sigmoid_cls=True,          # True
                    encode_rad_error_by_sin=True,  # True
                    box_code_size=7):              # 7

    def forward(self, x):

    def get_guided_anchors(self, box_preds, cls_preds, dir_cls_preds, anchors, anchors_mask, gt_bboxes, thr=.1):
First, look at the forward function:
def forward(self, x):  # torch.Size([2, 256, 200, 176])
    box_preds = self.conv_box(x)
    cls_preds = self.conv_cls(x)
    # [N, C, y(H), x(W)]
    box_preds = box_preds.permute(0, 2, 3, 1).contiguous()  # torch.Size([2, 200, 176, 14])
    cls_preds = cls_preds.permute(0, 2, 3, 1).contiguous()  # torch.Size([2, 200, 176, 2])
    if self._use_direction_classifier:
        dir_cls_preds = self.conv_dir_cls(x)
        dir_cls_preds = dir_cls_preds.permute(0, 2, 3, 1).contiguous()  # torch.Size([2, 200, 176, 4])
    return box_preds, cls_preds, dir_cls_preds
The input is the feature map obtained from the backbone/neck; it is then split into branches that predict the bboxes and the object classes, respectively (plus the optional direction classifier).
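A quick standalone sanity check of the tensor shapes (the 256 input channels and per-branch channel counts are taken from the shape annotations above):

import torch
import torch.nn as nn

x = torch.randn(2, 256, 200, 176)    # neck output [B, C, H, W]
conv_box = nn.Conv2d(256, 2 * 7, 1)  # num_anchor_per_loc * box_code_size
conv_cls = nn.Conv2d(256, 2 * 1, 1)  # num_anchor_per_loc * num_class
conv_dir = nn.Conv2d(256, 2 * 2, 1)  # num_anchor_per_loc * 2 directions

box_preds = conv_box(x).permute(0, 2, 3, 1).contiguous()
cls_preds = conv_cls(x).permute(0, 2, 3, 1).contiguous()
dir_preds = conv_dir(x).permute(0, 2, 3, 1).contiguous()
print(box_preds.shape, cls_preds.shape, dir_preds.shape)
# torch.Size([2, 200, 176, 14]) torch.Size([2, 200, 176, 2]) torch.Size([2, 200, 176, 4])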
Now let's see how the loss is constructed:
# input
# box_preds : torch.Size([2, 200, 176, 14])
# cls_preds : torch.Size([2, 200, 176, 2])
# gt_bboxes : list:len(gt_bboxes) = B , gt_bboxes[0].shape = torch.Size([num_of_gt_bboxes, 7])
# anchor : torch.Size([2, 70400, 7])
# anchor_mask : torch.Size([2, 70400])
# cfg : from car_cfg.py / train_cfg
def loss(self, box_preds, cls_preds, dir_cls_preds, gt_bboxes, gt_labels, anchors, anchors_mask, cfg):
    batch_size = box_preds.shape[0]

    # ADD----------------------------------------------------------------------------------------------
    add_for_test = False
    add_for_pkl = False
    # for showing gt_bboxes
    if add_for_test == True:
        bbox3d_for_test = gt_bboxes[0].cpu().numpy()
        draw_gt_boxes3d_for_test(center_to_corner_box3d(bbox3d_for_test), draw_text=True, show=True)
    # for visualizing the anchors
    if add_for_pkl == True:
        pkl_data = {}
        pkl_data['anchors'] = anchors
        pkl_data['anchors_mask'] = anchors_mask
        import pickle
        with open("/home/seivl/pkl_data.pkl", 'wb') as fo:
            pickle.dump(pkl_data, fo)
    # -----------------------------------------------------------------------------------------------

    # The first argument, create_target_torch, is the function to apply;
    # the remaining variables are passed into it as arguments.
    # targets are the regression targets.
    labels, targets, ious = multi_apply(create_target_torch,
                                        anchors, gt_bboxes,
                                        anchors_mask, gt_labels,
                                        similarity_fn=getattr(iou3d_utils, cfg.assigner.similarity_fn)(),
                                        box_encoding_fn=second_box_encode,
                                        matched_threshold=cfg.assigner.pos_iou_thr,
                                        unmatched_threshold=cfg.assigner.neg_iou_thr,
                                        box_code_size=self._box_code_size)

    labels = torch.stack(labels)
    targets = torch.stack(targets)

    # Generate the cls and reg weights
    cls_weights, reg_weights, cared = self.prepare_loss_weights(labels)
    # Generate the cls targets
    cls_targets = labels * cared.type_as(labels)

    # Build the loss (detailed analysis below)
    loc_loss, cls_loss = self.create_loss(
        box_preds=box_preds,
        cls_preds=cls_preds,
        cls_targets=cls_targets,
        cls_weights=cls_weights,
        reg_targets=targets,
        reg_weights=reg_weights,
        num_class=self._num_class,
        encode_rad_error_by_sin=self._encode_rad_error_by_sin,
        use_sigmoid_cls=self._use_sigmoid_cls,
        box_code_size=self._box_code_size,
    )

    loc_loss_reduced = loc_loss / batch_size
    loc_loss_reduced *= 2  # weight of loc_loss

    cls_loss_reduced = cls_loss / batch_size
    cls_loss_reduced *= 1

    loss = loc_loss_reduced + cls_loss_reduced

    if self._use_direction_classifier:
        # Generate the ground-truth dir_labels corresponding to dir_cls_preds
        dir_labels = self.get_direction_target(anchors, targets, use_one_hot=False).view(-1)
        dir_logits = dir_cls_preds.view(-1, 2)
        # The weights are set so that only targets with labels > 0 (i.e. the car class) count
        weights = (labels > 0).type_as(dir_logits)
        weights /= torch.clamp(weights.sum(-1, keepdim=True), min=1.0)
        # Use cross-entropy as the loss for the direction prediction
        dir_loss = weighted_cross_entropy(dir_logits, dir_labels,
                                          weight=weights.view(-1),
                                          avg_factor=1.)
        dir_loss_reduced = dir_loss / batch_size
        dir_loss_reduced *= .2
        loss += dir_loss_reduced

    return dict(rpn_loc_loss=loc_loss_reduced, rpn_cls_loss=cls_loss_reduced, rpn_dir_loss=dir_loss_reduced)
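prepare_loss_weights and get_direction_target appear above only as signatures. For reference, here is a hedged sketch of what SECOND-style implementations of these two typically do (the default 'NormByNumPositives' scheme; SA-SSD's exact code may differ in detail):

import torch

def prepare_loss_weights_sketch(labels, pos_cls_weight=1.0, neg_cls_weight=1.0):
    # labels: [B, num_anchors]; 1 = positive, 0 = negative, -1 = ignore
    cared = labels >= 0
    positives = labels > 0
    negatives = labels == 0
    cls_weights = neg_cls_weight * negatives.float() + pos_cls_weight * positives.float()
    reg_weights = positives.float()
    # 'NormByNumPositives': normalize both by the positives count per sample
    pos_normalizer = positives.sum(1, keepdim=True).float()
    reg_weights /= torch.clamp(pos_normalizer, min=1.0)
    cls_weights /= torch.clamp(pos_normalizer, min=1.0)
    return cls_weights, reg_weights, cared

def get_direction_target_sketch(anchors, reg_targets):
    # The direction label is whether the true yaw (anchor yaw + regression
    # residual) is positive; this resolves the 180-degree ambiguity that the
    # sin-based angle encoding introduces.
    rot_gt = reg_targets[..., -1] + anchors[..., -1]
    return (rot_gt > 0).long()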
A very important function here is create_target_torch, which generates the labels; its detailed analysis comes later.
The concrete loss-construction function:
def create_loss(self,
                box_preds,     # torch.Size([2, 200, 176, 14])
                cls_preds,     # torch.Size([2, 200, 176, 2])
                cls_targets,   # torch.Size([2, 70400])
                cls_weights,   # torch.Size([2, 70400])
                reg_targets,   # torch.Size([2, 70400, 7])
                reg_weights,   # torch.Size([2, 70400])
                num_class,     # 1
                use_sigmoid_cls=True,          # True
                encode_rad_error_by_sin=True,  # True
                box_code_size=7):              # 7
    batch_size = int(box_preds.shape[0])  # B = 2
    box_preds = box_preds.view(batch_size, -1, box_code_size)  # torch.Size([2, 70400, 7])
    if use_sigmoid_cls:
        cls_preds = cls_preds.view(batch_size, -1, num_class)  # torch.Size([2, 70400, 1])
    else:
        cls_preds = cls_preds.view(batch_size, -1, num_class + 1)
    one_hot_targets = one_hot(
        cls_targets, depth=num_class + 1, dtype=box_preds.dtype)  # torch.Size([2, 70400, 2])
    if use_sigmoid_cls:
        one_hot_targets = one_hot_targets[..., 1:]  # torch.Size([2, 70400, 1])
    if encode_rad_error_by_sin:
        # both outputs: torch.Size([2, 70400, 7])
        box_preds, reg_targets = self.add_sin_difference(box_preds, reg_targets)
    loc_losses = weighted_smoothl1(box_preds, reg_targets, beta=1 / 9.,
                                   weight=reg_weights[..., None], avg_factor=1.)
    cls_losses = weighted_sigmoid_focal_loss(cls_preds, one_hot_targets,
                                             weight=cls_weights[..., None], avg_factor=1.)
    return loc_losses, cls_losses
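add_sin_difference is what encode_rad_error_by_sin relies on: rather than regressing the raw angle difference, the yaw channels of prediction and target are replaced so that the smooth-L1 loss effectively penalizes sin(θp − θt) = sin θp cos θt − cos θp sin θt. A sketch of the SECOND-style implementation (SA-SSD's version may differ in detail):

import torch

def add_sin_difference_sketch(boxes1, boxes2):
    # Replace the last (yaw) channel: sin(a)·cos(b) on the prediction side and
    # cos(a)·sin(b) on the target side, so their difference equals sin(a - b).
    rad_pred = torch.sin(boxes1[..., -1:]) * torch.cos(boxes2[..., -1:])
    rad_tg = torch.cos(boxes1[..., -1:]) * torch.sin(boxes2[..., -1:])
    boxes1 = torch.cat([boxes1[..., :-1], rad_pred], dim=-1)
    boxes2 = torch.cat([boxes2[..., :-1], rad_tg], dim=-1)
    return boxes1, boxes2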
The core is the create_target_torch function; the annotations and analysis follow. The main purpose of this code is to generate:
the labels for the anchors
the targets for bbox regression
# all_anchors : torch.Size([70400, 7])
# gt_boxes : torch.Size([num_of_gt_bbox, 7])
# anchor_mask : torch.Size([70400])
# gt_classes : length num_of_gt_bbox, e.g. 14
# similarity_fn :
# box_encoding_fn :
# matched_threshold : 0.6
# unmatched_threshold : 0.45
# positive_fraction : None
# norm_by_num_examples : False
# box_code_size : 7
def create_target_torch(all_anchors,
                        gt_boxes,
                        anchor_mask,
                        gt_classes,
                        similarity_fn,
                        box_encoding_fn,
                        matched_threshold=0.6,
                        unmatched_threshold=0.45,
                        positive_fraction=None,
                        rpn_batch_size=300,
                        norm_by_num_examples=False,
                        box_code_size=7):
    # torch.set_printoptions(threshold=np.inf)

    # This helper maps results computed on the masked anchors back to the full anchor set
    def _unmap(data, count, inds, fill=0):
        # ----------------------------
        # data : labels
        # count : number of anchors
        # inds : mask
        # ----------------------------
        """ Unmap a subset of items (data) back to the original set of items (of
        size count) """
        if data.dim() == 1:
            ret = data.new_full((count,), fill)
            ret[inds] = data
        else:
            new_size = (count,) + data.size()[1:]
            ret = data.new_full(new_size, fill)
            ret[inds, :] = data
        return ret
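    # Example of _unmap (hypothetical numbers): with count = 5,
    # inds = [True, False, True, False, True] and data = [1, 0, 1],
    # it returns [1, -1, 0, -1, 1] for fill = -1, i.e. the labels computed on
    # the masked anchors are scattered back to all 70400 positions.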
    # value: 70400
    total_anchors = all_anchors.shape[0]
    if anchor_mask is not None:
        # inds_inside = np.where(anchors_mask)[0]  # prune_anchor_fn(all_anchors)
        # value: 22007
        anchors = all_anchors[anchor_mask, :]
        if not isinstance(matched_threshold, float):
            matched_threshold = matched_threshold[anchor_mask]
        if not isinstance(unmatched_threshold, float):
            unmatched_threshold = unmatched_threshold[anchor_mask]
    else:
        anchors = all_anchors
        # inds_inside = None

    # 22007
    num_inside = len(torch.nonzero(anchor_mask)) if anchor_mask is not None else total_anchors

    if gt_classes is None:
        gt_classes = torch.ones([gt_boxes.shape[0]], dtype=torch.int64, device=gt_boxes.device)

    # Compute anchor labels:
    # label=1 is positive, 0 is negative, -1 is don't care (ignore)
    # shape = [22007], initial value = -1
    labels = torch.empty((num_inside,), dtype=torch.int64, device=gt_boxes.device).fill_(-1)
    gt_ids = torch.empty((num_inside,), dtype=torch.int64, device=gt_boxes.device).fill_(-1)
    if len(gt_boxes) > 0 and anchors.shape[0] > 0:
        # Compute overlaps between the anchors and the gt boxes,
        # i.e. the IoU between every anchor and every gt_bbox
        anchor_by_gt_overlap = similarity_fn(anchors, gt_boxes)  # torch.Size([22007, 14])
        # add for test
        # for_test_anchor_by_gt_overlap = similarity_fn(anchors[9300:9303, :], gt_boxes)

        # Map from anchor to the gt box that has the highest overlap.
        # shape: 22007
        # For each anchor, the index of the gt_bbox with the largest IoU
        # (dim=1 reduces over the gt dimension)
        anchor_to_gt_argmax = anchor_by_gt_overlap.argmax(dim=1)

        # For each anchor, amount of overlap with the most overlapping gt box,
        # i.e. the largest IoU between each anchor and the gt_bboxes
        anchor_to_gt_max = anchor_by_gt_overlap[torch.arange(num_inside),
                                                anchor_to_gt_argmax]

        # Map from gt box to an anchor that has the highest overlap.
        # For each gt_bbox, the index of the anchor with the largest IoU
        # (dim=0 reduces over the anchor dimension)
        # shape: 14
        gt_to_anchor_argmax = anchor_by_gt_overlap.argmax(dim=0)

        # For each gt box, amount of overlap with the most overlapping anchor,
        # i.e. the largest IoU between each gt_bbox and the anchors
        gt_to_anchor_max = anchor_by_gt_overlap[
            gt_to_anchor_argmax,
            torch.arange(anchor_by_gt_overlap.shape[1])]

        # must remove gt which doesn't match any anchor.
        empty_gt_mask = gt_to_anchor_max == 0
        gt_to_anchor_max[empty_gt_mask] = -1

        # Find all anchors that share the max overlap amount
        # (this includes many ties); these are the anchors with the largest
        # IoU against some gt_bbox, e.g.:
        # tensor([ 6287,  7063,  9302,  9530,  9571, 10225, 11481, 13080, 14509, 15080,
        #         15082, 15293, 18273, 18740, 21316], device='cuda:0')
        anchors_with_max_overlap = torch.nonzero(
            anchor_by_gt_overlap == gt_to_anchor_max)[:, 0]
        # for test
        # for_test_anchors_with_max_overlap = torch.nonzero(
        #     for_test_anchor_by_gt_overlap == gt_to_anchor_max)[:, 0]

        # Fg label: for each gt use anchors with highest overlap
        # (including ties); gt_inds_force records which gt_bbox each of these
        # anchors corresponds to, e.g. 15 entries:
        # tensor([ 6, 10, 12, 11, 13,  7,  9,  5,  3,  2,  2,  8,  1,  0,  4],
        #        device='cuda:0')
        gt_inds_force = anchor_to_gt_argmax[anchors_with_max_overlap]
        labels[anchors_with_max_overlap] = gt_classes[gt_inds_force]  # set the labels of the max-IoU anchors to 1
        gt_ids[anchors_with_max_overlap] = gt_inds_force  # store the index of the matched gt

        # Fg label: above threshold IOU
        pos_inds = anchor_to_gt_max >= matched_threshold  # anchors whose IoU with a gt exceeds the threshold
        gt_inds = anchor_to_gt_argmax[pos_inds]  # the gt_bbox index each such anchor matches
        # here 67 anchors have IoU with a gt_bbox above the threshold, e.g.:
        # tensor([ 6,  6,  6,  6,  6,  6, 10, 10, 10, 10, 10, 10, 12, 12, 12, 12, 11, 11,
        #         12, 11, 11, 13, 13, 11, 13, 13,  7,  7,  7,  7,  7,  9,  9,  9,  9,  5,
        #          5,  5,  5,  5,  3,  3,  3,  3,  2,  2,  2,  2,  8,  8,  8,  8,  1,  1,
        #          1,  1,  1,  0,  0,  0,  0,  0,  4,  4,  4,  4,  4], device='cuda:0')
        labels[pos_inds] = gt_classes[gt_inds]  # set the corresponding labels to 1
        gt_ids[pos_inds] = gt_inds  # store the index of the matched gt

        # bg_inds = np.where(anchor_to_gt_max < unmatched_threshold)[0]
        bg_inds = torch.nonzero(anchor_to_gt_max < unmatched_threshold)[:, 0]
        # indices of the anchors below the unmatched threshold
    else:
        bg_inds = torch.arange(num_inside)

    # fg_inds = np.where(labels > 0)[0]
    fg_inds = torch.nonzero(labels > 0)[:, 0]
    # indices of all foreground anchors, e.g.:
    # tensor([ 6283,  6285,  6287,  6289,  6291,  6498,  6852,  6854,  7061,  7063,
    #          7268,  7270,  8883,  9094,  9300,  9302,  9324,  9326,  9508,  9530,
    #          9532,  9571,  9573,  9736,  9777,  9779,  9827, 10028, 10225, 10227,
    #         10424, 11481, 11483, 11757, 11759, 13078, 13080, 13082, 13084, 13366,
    #         14267, 14509, 14511, 14750, 15078, 15080, 15082, 15084, 15291, 15293,
    #         15295, 15553, 18009, 18269, 18271, 18273, 18275, 18493, 18495, 18738,
    #         18740, 18742, 21312, 21314, 21316, 21318, 21389], device='cuda:0')
    # subsample positive labels if we have too many
    if positive_fraction is not None:
        num_fg = int(positive_fraction * rpn_batch_size)
        if len(fg_inds) > num_fg:
            disable_inds = npr.choice(
                fg_inds, size=(len(fg_inds) - num_fg), replace=False)
            labels[disable_inds] = -1
            # fg_inds = np.where(labels > 0)[0]
            fg_inds = torch.nonzero(labels > 0)[:, 0]

        # subsample negative labels if we have too many
        # (samples with replacement, but since the set of bg inds is large most
        # samples will not have repeats)
        num_bg = rpn_batch_size - np.sum(labels > 0)
        # print(num_fg, num_bg, len(bg_inds))
        if len(bg_inds) > num_bg:
            enable_inds = bg_inds[npr.randint(len(bg_inds), size=num_bg)]
            labels[enable_inds] = 0
    else:
        if len(gt_boxes) == 0 or anchors.shape[0] == 0:
            labels[:] = 0
        else:
            labels[bg_inds] = 0  # background anchors get label 0
            # re-enable anchors_with_max_overlap
            labels[anchors_with_max_overlap] = gt_classes[gt_inds_force]
    # Generate the regression targets
    bbox_targets = torch.zeros(
        (num_inside, box_code_size), dtype=all_anchors.dtype, device=gt_boxes.device)  # torch.Size([22007, 7])

    # Encode the gt boxes against the foreground anchors
    if len(gt_boxes) > 0 and anchors.shape[0] > 0:
        bbox_targets[fg_inds, :] = box_encoding_fn(
            gt_boxes[anchor_to_gt_argmax[fg_inds], :], anchors[fg_inds, :])
        # bbox_targets[fg_inds, :].shape : torch.Size([67, 7])

    bbox_outside_weights = torch.zeros((num_inside,), dtype=all_anchors.dtype, device=gt_boxes.device)
    # uniform weighting of examples (given non-uniform sampling)
    if norm_by_num_examples:
        num_examples = torch.sum(labels >= 0)  # neg + pos
        num_examples = np.maximum(1.0, num_examples)
        bbox_outside_weights[labels > 0] = 1.0 / num_examples
    else:
        bbox_outside_weights[labels > 0] = 1.0

    # Map up to original set of anchors
    if anchor_mask is not None:
        labels = _unmap(labels, total_anchors, anchor_mask, fill=-1)
        bbox_targets = _unmap(bbox_targets, total_anchors, anchor_mask, fill=0)

    return (labels, bbox_targets, anchor_to_gt_max)
# Returned values:
#   labels.shape : torch.Size([70400])
#   bbox_targets.shape : torch.Size([70400, 7])
#   anchor_to_gt_max : length 22007 (over the masked anchors)
# Label convention:
#   foreground = 1
#   background = 0
#   don't care (ignored) = -1
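box_encoding_fn here is second_box_encode (passed in from loss() above). For completeness, a sketch of the SECOND-style residual encoding it performs (channel order and details assumed from the SECOND codebase, so treat it as illustrative):

import torch

def second_box_encode_sketch(gt_boxes, anchors):
    # gt_boxes, anchors: [N, 7] as (x, y, z, w, l, h, ry)
    xa, ya, za, wa, la, ha, ra = torch.split(anchors, 1, dim=-1)
    xg, yg, zg, wg, lg, hg, rg = torch.split(gt_boxes, 1, dim=-1)
    diag = torch.sqrt(la ** 2 + wa ** 2)  # anchor diagonal normalizes the x/y offsets
    xt = (xg - xa) / diag
    yt = (yg - ya) / diag
    zt = (zg - za) / ha                   # z offset normalized by the anchor height
    wt = torch.log(wg / wa)               # sizes encoded as log ratios
    lt = torch.log(lg / la)
    ht = torch.log(hg / ha)
    rt = rg - ra                          # raw yaw residual (sin-encoded later)
    return torch.cat([xt, yt, zt, wt, lt, ht, rt], dim=-1)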
OK, to be continued.