This part mainly consists of the following:
Generating the input data for the RPN: the per-class clustered anchor size information, and the anchor information generated for each individual sample.
base_dir = avod/
config = avod/avod/configs/mb_preprocessing/rpn_cars(cyclists,pedestrians,people).config
Main related module call chain:
scripts/preprocessing/gen_min_batches.py->avod/builders/dataset_builder.py(build_kitti_dataset)->avod/datasets/kitti/kitti_dataset.py(KittiDataset)->avod/datasets/kitti/kitti_utils.py(KittiUtils)->avod/core/mini_batch_utils.py(MiniBatchUtils.preprocess_rpn_mini_batches)->avod/core/mini_batch_preprocessor.py(MiniBatchPreprocessor.preprocess)->avod/core/anchor_generator/grid_anchor_3d_generator.py(GridAnchor3dGenerator.generate)
Data preprocessing: mini_batch anchor generation
The AVOD preprocessing step gen_minbatch consists of two parts: clustering the box sizes of each class, and using the clustering results to generate per-class anchor information that serves as the RPN input data.
The anchor info is laid out as [max_gt_2d_iou, max_gt_3d_iou, (6 x offsets), class_index]: the anchor's IoU with its best-matching ground truth (in 2D and 3D), the anchor-to-ground-truth offsets, and the index of the matched class.
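As a minimal illustration, a row of the saved anchors_info array could be unpacked as follows, assuming the 9-column layout described above (the exact column order is defined in mini_batch_utils; the file name is hypothetical):
import numpy as np

# Hypothetical sketch: unpack one row of a saved anchors_info array
anchors_info = np.load('000123.npy')           # assumed per-sample output file
row = anchors_info[0]
max_gt_iou_2d, max_gt_iou_3d = row[0], row[1]  # IoU with the best-matching gt
offsets = row[2:8]                             # 6 offsets to the matched gt
class_index = row[8]                           # index of the matched class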
The specific steps are shown in the core code below:
# mini_batch_preprocessor.py:49
def preprocess(self, indices):
"""Preprocesses anchor info and saves info to files
Args:
indices (int array): sample indices to process.
If None, processes all samples
"""
# Get anchor stride for class (default 0.5)
anchor_strides = self._anchor_strides
dataset = self._dataset
dataset_utils = self._dataset.kitti_utils
classes_name = dataset.classes_name
# Make folder if it doesn't exist yet
output_dir = self.mini_batch_utils.get_file_path(classes_name,
anchor_strides,
sample_name=None)
os.makedirs(output_dir, exist_ok=True)
# Get clusters for class
# The generated cluster sizes are used as the anchor sizes
all_clusters_sizes, _ = dataset.get_cluster_info()
# Initialize the 3D anchor generator
anchor_generator = grid_anchor_3d_generator.GridAnchor3dGenerator()
# Load indices of data_split
all_samples = dataset.sample_list
if indices is None:
indices = np.arange(len(all_samples))
num_samples = len(indices)
# For each image in the dataset, save info on the anchors
for sample_idx in indices:
# Get image name for given cluster
sample_name = all_samples[sample_idx].name
img_idx = int(sample_name)
# Check for existing files and skip to the next
if self._check_for_existing(classes_name, anchor_strides,
sample_name):
print("{} / {}: Sample already preprocessed".format(
sample_idx + 1, num_samples, sample_name))
continue
# Get ground truth and filter based on difficulty
ground_truth_list = obj_utils.read_labels(dataset.label_dir,
img_idx)
# Filter objects to dataset classes
filtered_gt_list = dataset_utils.filter_labels(ground_truth_list)
filtered_gt_list = np.asarray(filtered_gt_list)
# Filtering by class has no valid ground truth, skip this image
if len(filtered_gt_list) == 0:
print("{} / {} No {}s for sample {} "
"(Ground Truth Filter)".format(
sample_idx + 1, num_samples,
classes_name, sample_name))
# Output an empty file and move on to the next image.
self._save_to_file(classes_name, anchor_strides, sample_name)
continue
# Get ground plane
ground_plane = obj_utils.get_road_plane(img_idx,
dataset.planes_dir)
image = Image.open(dataset.get_rgb_image_path(sample_name))
image_shape = [image.size[1], image.size[0]]
# Generate sliced 2D voxel grid for filtering
# Generate the 2D voxel grid; only the BEV information inside the image view is kept
vx_grid_2d = dataset_utils.create_sliced_voxel_grid_2d(
sample_name,
source=dataset.bev_source,
image_shape=image_shape)
# List for merging all anchors
all_anchor_boxes_3d = []
# Create anchors for each class
for class_idx in range(len(dataset.classes)):
# Generate anchors for all classes
# Generate 3D anchors from each class's anchor sizes, the stride and the ground plane
grid_anchor_boxes_3d = anchor_generator.generate(
area_3d=self._area_extents,
anchor_3d_sizes=all_clusters_sizes[class_idx],
anchor_stride=self._anchor_strides[class_idx],
ground_plane=ground_plane)
all_anchor_boxes_3d.extend(grid_anchor_boxes_3d)
# Filter empty anchors
all_anchor_boxes_3d = np.asarray(all_anchor_boxes_3d)
anchors = box_3d_encoder.box_3d_to_anchor(all_anchor_boxes_3d)
empty_anchor_filter = anchor_filter.get_empty_anchor_filter_2d(
anchors, vx_grid_2d, self._density_threshold)
# Calculate anchor info
# Compute the IoU between every anchor and the ground truth to find each anchor's matched target
anchors_info = self._calculate_anchors_info(
all_anchor_boxes_3d, empty_anchor_filter, filtered_gt_list)
anchor_ious = anchors_info[:, self.mini_batch_utils.col_ious]
valid_iou_indices = np.where(anchor_ious > 0.0)[0]
print("{} / {}:"
"{:>6} anchors, "
"{:>6} iou > 0.0, "
"for {:>3} {}(s) for sample {}".format(
sample_idx + 1, num_samples,
len(anchors_info),
len(valid_iou_indices),
len(filtered_gt_list), classes_name, sample_name
))
# Save anchors info
self._save_to_file(classes_name, anchor_strides,
sample_name, anchors_info)
The steps for generating the 3D anchors are:
Determine the anchor generation range (area_extents).
Generate the anchor center points according to the stride.
Generate the size and rotation distributions and combine everything into the anchor matrix.
def tile_anchors_3d(area_extents,
anchor_3d_sizes,
anchor_stride,
ground_plane):
"""
Tiles anchors over the area extents by using meshgrids to
generate combinations of (x, y, z), (l, w, h) and ry.
Args:
area_extents: [[min_x, max_x], [min_y, max_y], [min_z, max_z]]
anchor_3d_sizes: list of 3d anchor sizes N x (l, w, h)
anchor_stride: stride lengths (x_stride, z_stride)
ground_plane: coefficients of the ground plane e.g. [0, -1, 0, 0]
Returns:
boxes: list of 3D anchors in box_3d format N x [x, y, z, l, w, h, ry]
"""
# Convert sizes to ndarray
# Because of the KITTI camera coordinate system: x and z span the ground plane, while y is the height axis
anchor_3d_sizes = np.asarray(anchor_3d_sizes)
anchor_stride_x = anchor_stride[0]
anchor_stride_z = anchor_stride[1]
anchor_rotations = np.asarray([0, np.pi / 2.0])
x_start = area_extents[0][0] + anchor_stride[0] / 2.0
x_end = area_extents[0][1]
x_centers = np.array(np.arange(x_start, x_end, step=anchor_stride_x),
dtype=np.float32)
z_start = area_extents[2][1] - anchor_stride[1] / 2.0
z_end = area_extents[2][0]
z_centers = np.array(np.arange(z_start, z_end, step=-anchor_stride_z),
dtype=np.float32)
# Use ranges for substitution
size_indices = np.arange(0, len(anchor_3d_sizes))
rotation_indices = np.arange(0, len(anchor_rotations))
# Generate matrix for substitution
# e.g. for two sizes and two rotations
# [[x0, z0, 0, 0], [x0, z0, 0, 1], [x0, z0, 1, 0], [x0, z0, 1, 1],
# [x1, z0, 0, 0], [x1, z0, 0, 1], [x1, z0, 1, 0], [x1, z0, 1, 1], ...]
before_sub = np.stack(np.meshgrid(x_centers,
z_centers,
size_indices,
rotation_indices),
axis=4).reshape(-1, 4)
# Place anchors on the ground plane
# Use the meshgrid above to place the anchor center points on the ground plane
a, b, c, d = ground_plane
all_x = before_sub[:, 0]
all_z = before_sub[:, 1]
all_y = -(a * all_x + c * all_z + d) / b
# Create empty matrix to return
num_anchors = len(before_sub)
all_anchor_boxes_3d = np.zeros((num_anchors, 7))
# Fill in x, y, z
all_anchor_boxes_3d[:, 0:3] = np.stack((all_x, all_y, all_z), axis=1)
# Fill in shapes
sizes = anchor_3d_sizes[np.asarray(before_sub[:, 2], np.int32)]
all_anchor_boxes_3d[:, 3:6] = sizes
# Fill in rotations
rotations = anchor_rotations[np.asarray(before_sub[:, 3], np.int32)]
all_anchor_boxes_3d[:, 6] = rotations
return all_anchor_boxes_3d
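For intuition, here is the anchor count that tile_anchors_3d would produce under assumed values (roughly 80 m x 70 m ground-plane extents, a 0.5 m stride, two clustered sizes and two rotations); the actual numbers depend on the config and clusters:
# Illustrative anchor count for assumed extents/stride/cluster values
area_extents = [[-40.0, 40.0], [-5.0, 3.0], [0.0, 70.0]]  # assumed [x, y, z] ranges
anchor_stride = (0.5, 0.5)                                 # assumed x/z stride
num_sizes = 2       # e.g. two clustered (l, w, h) sizes for the class
num_rotations = 2   # 0 and pi/2
num_x = int((area_extents[0][1] - area_extents[0][0]) / anchor_stride[0])  # 160
num_z = int((area_extents[2][1] - area_extents[2][0]) / anchor_stride[1])  # 140
print(num_x * num_z * num_sizes * num_rotations)  # 89600 anchors before filtering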
The overall AVOD model consists of three parts: backbone + RPN + AVOD network; see avod_paperreading for details.
The backbone is a VGG+FPN style structure extended with a BEV feature branch (the 3D lidar data is converted into 2D BEV features) that is later fused with the image features. The RPN generates region proposals, and the AVOD network performs the final object classification and box regression.
base_dir = avod/
Main related module call chain:
config = avod/config/pyramid_cars_with_aug_example.config
scripts/run_training.py->avod/avod/core/trainer.py(builds the model, input data, losses, ops, etc.)->avod/avod/core/models/avod_model.py->avod/avod/core/models/rpn_model.py
Data preprocessing
Unlike the pre-generated data described above, the preprocessing at training time operates on the raw input data. It consists of the following parts:
Reading and filtering the 3D point cloud:
After the point cloud is read in, points outside the image view have to be removed. Two filters are involved: a ground_plane_filter and an image_filter. The former is mainly used for generating the BEV features (voxel spaces are built per height slice and point features are encoded per voxel; see the BEV generation below), while the latter removes the points that fall outside the camera view; a hedged sketch of this image-view filtering is given below.
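A minimal sketch of the image-view filtering idea (not the repository's exact API): project the points into the image with the calibration matrix and keep only those that land inside the image bounds:
import numpy as np

def image_view_filter(points, p2, image_shape):
    # points: (N, 3) in camera coordinates, p2: 3x4 camera projection matrix,
    # image_shape: (height, width). Illustrative sketch only.
    pts_h = np.hstack([points, np.ones((points.shape[0], 1))])  # homogeneous (N, 4)
    proj = pts_h @ p2.T                                          # (N, 3)
    u = proj[:, 0] / proj[:, 2]
    v = proj[:, 1] / proj[:, 2]
    mask = (proj[:, 2] > 0) & \
           (u >= 0) & (u < image_shape[1]) & \
           (v >= 0) & (v < image_shape[0])
    return points[mask]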
BEV map generation
The BEV maps are built from the filtered point cloud: within the height range [height_lo, height_hi] (relative to the ground_plane), num_slices slices are taken along the y axis; each slice is divided into cells of voxel_size, and the maximum point height inside each cell is used as the feature. This yields a feature map of size (bev_width/voxel_size) x (bev_height/voxel_size) x (num_slices+1), where the extra channel stores the density information. The code is as follows:
#avod/avod/datasets/kitti/kitti_utils.py:109
def generate_bev(self,
source,
point_cloud,
ground_plane,
area_extents,
voxel_size):
"""Generates the BEV maps dictionary. One height map is created for
each slice of the point cloud. One density map is created for
the whole point cloud.
Args:
source: point cloud source
point_cloud: point cloud (3, N)
ground_plane: ground plane coefficients
area_extents: 3D area extents
[[min_x, max_x], [min_y, max_y], [min_z, max_z]]
voxel_size: voxel size in m
Returns:
BEV maps dictionary
height_maps: list of height maps
density_map: density map
"""
# Get the point cloud points
all_points = np.transpose(point_cloud)
height_maps = []
for slice_idx in range(self.num_slices):
height_lo = self.height_lo + slice_idx * self.height_per_division
height_hi = height_lo + self.height_per_division
# slice_filter keeps the points of each slice by their height relative to the ground_plane
slice_filter = self.kitti_utils.create_slice_filter(
point_cloud,
area_extents,
ground_plane,
height_lo,
height_hi)
# Apply slice filter
slice_points = all_points[slice_filter]
if len(slice_points) > 1:
# Create Voxel Grid 2D
voxel_grid_2d = VoxelGrid2D()
voxel_grid_2d.voxelize_2d(
slice_points, voxel_size,
extents=area_extents,
ground_plane=ground_plane,
create_leaf_layout=False)
# Remove y values (all 0)
voxel_indices = voxel_grid_2d.voxel_indices[:, [0, 2]]
# Create empty BEV images
height_map = np.zeros((voxel_grid_2d.num_divisions[0],
voxel_grid_2d.num_divisions[2]))
# Only update pixels where voxels have max height values,
# and normalize by height of slices
# Build the height_map holding the maximum height per voxel
voxel_grid_2d.heights = voxel_grid_2d.heights - height_lo
height_map[voxel_indices[:, 0], voxel_indices[:, 1]] = \
np.asarray(voxel_grid_2d.heights) / self.height_per_division
height_maps.append(height_map)
# Rotate height maps 90 degrees
# (transpose and flip) is faster than np.rot90
# Presumably due to the different coordinate conventions of the image and BEV maps
height_maps_out = [np.flip(height_maps[map_idx].transpose(), axis=0)
for map_idx in range(len(height_maps))]
# Density filter, computed over the full height range
density_slice_filter = self.kitti_utils.create_slice_filter(
point_cloud,
area_extents,
ground_plane,
self.height_lo,
self.height_hi)
density_points = all_points[density_slice_filter]
# Create Voxel Grid 2D
density_voxel_grid_2d = VoxelGrid2D()
density_voxel_grid_2d.voxelize_2d(
density_points,
voxel_size,
extents=area_extents,
ground_plane=ground_plane,
create_leaf_layout=False)
# Generate density map
density_voxel_indices_2d = \
density_voxel_grid_2d.voxel_indices[:, [0, 2]]
density_map = self._create_density_map(
num_divisions=density_voxel_grid_2d.num_divisions,
voxel_indices_2d=density_voxel_indices_2d,
num_pts_per_voxel=density_voxel_grid_2d.num_pts_in_voxel,
norm_value=self.NORM_VALUES[source])
bev_maps = dict()
bev_maps['height_maps'] = height_maps_out
bev_maps['density_map'] = density_map
return bev_maps
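Under assumed (typical KITTI-style) settings the resulting BEV input size can be worked out directly; the actual values come from the config:
# Illustrative BEV feature map size, assuming 80 m x 70 m BEV extents,
# voxel_size = 0.1 and num_slices = 5
bev_extents_x, bev_extents_z = 80.0, 70.0
voxel_size = 0.1
num_slices = 5
bev_width = int(bev_extents_x / voxel_size)   # 800
bev_height = int(bev_extents_z / voxel_size)  # 700
num_channels = num_slices + 1                 # 5 height maps + 1 density map
print(bev_width, bev_height, num_channels)    # 800 700 6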
Data augmentation
Augmentation is applied while the input data is read. For cars the default augmentations are flipping and pca_jitter; a hedged sketch of the flipping operation is given below.
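As an illustration of the flipping augmentation (not the repository's exact implementation), horizontally flipping a sample mirrors the image, the point x coordinates and the box centers/headings:
import numpy as np

def flip_sample(image, points, boxes_3d):
    # Illustrative horizontal flip. points: (N, 3) in camera coordinates,
    # boxes_3d: (M, 7) rows of [x, y, z, l, w, h, ry].
    image_flipped = image[:, ::-1]                      # mirror image columns
    points_flipped = points.copy()
    points_flipped[:, 0] = -points_flipped[:, 0]        # mirror x
    boxes_flipped = boxes_3d.copy()
    boxes_flipped[:, 0] = -boxes_flipped[:, 0]          # mirror box center x
    boxes_flipped[:, 6] = np.pi - boxes_flipped[:, 6]   # mirror heading angle
    return image_flipped, points_flipped, boxes_flipped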
Backbone
The backbone (feature extractor) has two branches, BEV and image, with essentially the same structure (see the code below). It can be summarized as conv1*2->pool1->conv2*2->pool2->conv3*2->pool3->conv4->(upconv3+concat3+fusion3)->(upconv2+concat2+fusion2)->(upconv1+concat1+fusion1).
#avod/core/feature_extractors/bev_vgg_pyramid.py:30
def build(self,
inputs,
input_pixel_size,
is_training,
scope='bev_vgg_pyr'):
""" Modified VGG for BEV feature extraction with pyramid features
Args:
inputs: a tensor of size [batch_size, height, width, channels].
input_pixel_size: size of the input (H x W)
is_training: True for training, False for validation/testing.
scope: Optional scope for the variables.
Returns:
The last op containing the log predictions and end_points dict.
"""
vgg_config = self.config
with slim.arg_scope(self.vgg_arg_scope(
weight_decay=vgg_config.l2_weight_decay)):
with tf.variable_scope(scope, 'bev_vgg_pyr', [inputs]) as sc:
end_points_collection = sc.name + '_end_points'
# Collect outputs for conv2d, fully_connected and max_pool2d.
with slim.arg_scope([slim.conv2d, slim.max_pool2d],
outputs_collections=end_points_collection):
# Pad 700 to 704 to allow even divisions for max pooling
padded = tf.pad(inputs, [[0, 0], [4, 0], [0, 0], [0, 0]])
# Encoder
conv1 = slim.repeat(padded,
vgg_config.vgg_conv1[0],
slim.conv2d,
vgg_config.vgg_conv1[1],
[3, 3],
normalizer_fn=slim.batch_norm,
normalizer_params={
'is_training': is_training},
scope='conv1')
pool1 = slim.max_pool2d(conv1, [2, 2], scope='pool1')
conv2 = slim.repeat(pool1,
vgg_config.vgg_conv2[0],
slim.conv2d,
vgg_config.vgg_conv2[1],
[3, 3],
normalizer_fn=slim.batch_norm,
normalizer_params={
'is_training': is_training},
scope='conv2')
pool2 = slim.max_pool2d(conv2, [2, 2], scope='pool2')
conv3 = slim.repeat(pool2,
vgg_config.vgg_conv3[0],
slim.conv2d,
vgg_config.vgg_conv3[1],
[3, 3],
normalizer_fn=slim.batch_norm,
normalizer_params={
'is_training': is_training},
scope='conv3')
pool3 = slim.max_pool2d(conv3, [2, 2], scope='pool3')
conv4 = slim.repeat(pool3,
vgg_config.vgg_conv4[0],
slim.conv2d,
vgg_config.vgg_conv4[1],
[3, 3],
normalizer_fn=slim.batch_norm,
normalizer_params={
'is_training': is_training},
scope='conv4')
# Decoder (upsample and fuse features)
upconv3 = slim.conv2d_transpose(
conv4,
vgg_config.vgg_conv3[1],
[3, 3],
stride=2,
normalizer_fn=slim.batch_norm,
normalizer_params={
'is_training': is_training},
scope='upconv3')
concat3 = tf.concat(
(conv3, upconv3), axis=3, name='concat3')
pyramid_fusion3 = slim.conv2d(
concat3,
vgg_config.vgg_conv2[1],
[3, 3],
normalizer_fn=slim.batch_norm,
normalizer_params={
'is_training': is_training},
scope='pyramid_fusion3')
upconv2 = slim.conv2d_transpose(
pyramid_fusion3,
vgg_config.vgg_conv2[1],
[3, 3],
stride=2,
normalizer_fn=slim.batch_norm,
normalizer_params={
'is_training': is_training},
scope='upconv2')
concat2 = tf.concat(
(conv2, upconv2), axis=3, name='concat2')
pyramid_fusion_2 = slim.conv2d(
concat2,
vgg_config.vgg_conv1[1],
[3, 3],
normalizer_fn=slim.batch_norm,
normalizer_params={
'is_training': is_training},
scope='pyramid_fusion2')
upconv1 = slim.conv2d_transpose(
pyramid_fusion_2,
vgg_config.vgg_conv1[1],
[3, 3],
stride=2,
normalizer_fn=slim.batch_norm,
normalizer_params={
'is_training': is_training},
scope='upconv1')
concat1 = tf.concat(
(conv1, upconv1), axis=3, name='concat1')
pyramid_fusion1 = slim.conv2d(
concat1,
vgg_config.vgg_conv1[1],
[3, 3],
normalizer_fn=slim.batch_norm,
normalizer_params={
'is_training': is_training},
scope='pyramid_fusion1')
# Slice off padded area
sliced = pyramid_fusion1[:, 4:]
feature_maps_out = sliced
# Convert end_points_collection into a end_point dict.
end_points = slim.utils.convert_collection_to_dict(
end_points_collection)
return feature_maps_out, end_points
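For reference, a rough shape walkthrough under an assumed 700 x 800 BEV input (padded to 704 as in the code above): the three poolings reduce the resolution to 88 x 100 at conv4, and the three stride-2 upconv/fusion steps bring it back up before the padding is sliced off, so the output feature map keeps the input's spatial resolution:
# Illustrative spatial sizes for an assumed 700 x 800 BEV input
h, w = 700, 800
h = h + 4                      # pad 700 -> 704 for even pooling
for _ in range(3):             # pool1, pool2, pool3
    h, w = h // 2, w // 2
print(h, w)                    # 88 100 (conv4 resolution)
for _ in range(3):             # upconv3, upconv2, upconv1 (stride 2)
    h, w = h * 2, w * 2
print(h - 4, w)                # 700 800 after slicing off the padding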
RPN Model
The features from the backbone (feature extraction) each pass through a 1x1 convolution (bottleneck) to produce the input features of the proposal network. The default config enables path_drop: with some probability the image or the BEV path gets no input, similar to dropout (see avod/avod/core/models/rpn_model.py:create_path_drop_masks). The 3D anchors are then projected onto the BEV map and the image: the former by projecting directly onto the ground_plane, the latter through the mapping between lidar and image coordinates (taking the enclosing 2D box). The resulting proposal features are crop_and_resized to the same size given by roi_crop_size in the config, and then fused (mean fusion by default). The fused feature goes through two branches, each made of three convolution layers (fully connected layers in the paper, conv2d in the actual code), predicting objectness and offsets; this forms the first-stage proposals. The proposals are then used in two ways: on one hand, top-k NMS (note that the NMS is performed jointly over all classes) selects the proposals fed to the second stage; on the other hand, gen_mini_batch builds a mini-batch (512 samples by default, half positive and half negative) used to compute the objectness and regression losses (smooth L1). Note that the mini-batch is built by random shuffling: the positives are shuffled and up to half of the batch (256) is taken from them, and if there are not enough positives the remainder is filled with negatives. Class imbalance is not taken into account, so classes with few samples may converge slowly or not at all (a numpy sketch of this sampling is given after the code). The build method is as follows:
#rpn_model.py:280, deleted some code for summary
def build(self):
# Setup input placeholders
self._set_up_input_pls()
# Setup feature extractors
self._set_up_feature_extractors()
bev_proposal_input = self.bev_bottleneck
img_proposal_input = self.img_bottleneck
fusion_mean_div_factor = 2.0
# If both img and bev probabilites are set to 1.0, don't do
# path drop.
if not (self._path_drop_probabilities[0] ==
self._path_drop_probabilities[1] == 1.0):
with tf.variable_scope('rpn_path_drop'):
random_values = tf.random_uniform(shape=[3],
minval=0.0,
maxval=1.0)
img_mask, bev_mask = self.create_path_drop_masks(
self._path_drop_probabilities[0],
self._path_drop_probabilities[1],
random_values)
img_proposal_input = tf.multiply(img_proposal_input,
img_mask)
bev_proposal_input = tf.multiply(bev_proposal_input,
bev_mask)
self.img_path_drop_mask = img_mask
self.bev_path_drop_mask = bev_mask
# Overwrite the division factor
fusion_mean_div_factor = img_mask + bev_mask
with tf.variable_scope('proposal_roi_pooling'):
with tf.variable_scope('box_indices'):
def get_box_indices(boxes):
proposals_shape = boxes.get_shape().as_list()
if any(dim is None for dim in proposals_shape):
proposals_shape = tf.shape(boxes)
ones_mat = tf.ones(proposals_shape[:2], dtype=tf.int32)
multiplier = tf.expand_dims(
tf.range(start=0, limit=proposals_shape[0]), 1)
return tf.reshape(ones_mat * multiplier, [-1])
bev_boxes_norm_batches = tf.expand_dims(
self._bev_anchors_norm_pl, axis=0)
# These should be all 0's since there is only 1 image
tf_box_indices = get_box_indices(bev_boxes_norm_batches)
# Do ROI Pooling on BEV
bev_proposal_rois = tf.image.crop_and_resize(
bev_proposal_input,
self._bev_anchors_norm_pl,
tf_box_indices,
self._proposal_roi_crop_size)
# Do ROI Pooling on image
img_proposal_rois = tf.image.crop_and_resize(
img_proposal_input,
self._img_anchors_norm_pl,
tf_box_indices,
self._proposal_roi_crop_size)
with tf.variable_scope('proposal_roi_fusion'):
rpn_fusion_out = None
if self._fusion_method == 'mean':
tf_features_sum = tf.add(bev_proposal_rois, img_proposal_rois)
rpn_fusion_out = tf.divide(tf_features_sum,
fusion_mean_div_factor)
elif self._fusion_method == 'concat':
rpn_fusion_out = tf.concat(
[bev_proposal_rois, img_proposal_rois], axis=3)
else:
raise ValueError('Invalid fusion method', self._fusion_method)
# TODO: move this section into an separate AnchorPredictor class
with tf.variable_scope('anchor_predictor', 'ap', [rpn_fusion_out]):
tensor_in = rpn_fusion_out
# Parse rpn layers config
layers_config = self._config.layers_config.rpn_config
l2_weight_decay = layers_config.l2_weight_decay
if l2_weight_decay > 0:
weights_regularizer = slim.l2_regularizer(l2_weight_decay)
else:
weights_regularizer = None
with slim.arg_scope([slim.conv2d],
weights_regularizer=weights_regularizer):
# Use conv2d instead of fully_connected layers.
cls_fc6 = slim.conv2d(tensor_in,
layers_config.cls_fc6,
self._proposal_roi_crop_size,
padding='VALID',
scope='cls_fc6')
cls_fc6_drop = slim.dropout(cls_fc6,
layers_config.keep_prob,
is_training=self._is_training,
scope='cls_fc6_drop')
cls_fc7 = slim.conv2d(cls_fc6_drop,
layers_config.cls_fc7,
[1, 1],
scope='cls_fc7')
cls_fc7_drop = slim.dropout(cls_fc7,
layers_config.keep_prob,
is_training=self._is_training,
scope='cls_fc7_drop')
cls_fc8 = slim.conv2d(cls_fc7_drop,
2,
[1, 1],
activation_fn=None,
scope='cls_fc8')
objectness = tf.squeeze(
cls_fc8, [1, 2],
name='cls_fc8/squeezed')
# Use conv2d instead of fully_connected layers.
reg_fc6 = slim.conv2d(tensor_in,
layers_config.reg_fc6,
self._proposal_roi_crop_size,
padding='VALID',
scope='reg_fc6')
reg_fc6_drop = slim.dropout(reg_fc6,
layers_config.keep_prob,
is_training=self._is_training,
scope='reg_fc6_drop')
reg_fc7 = slim.conv2d(reg_fc6_drop,
layers_config.reg_fc7,
[1, 1],
scope='reg_fc7')
reg_fc7_drop = slim.dropout(reg_fc7,
layers_config.keep_prob,
is_training=self._is_training,
scope='reg_fc7_drop')
reg_fc8 = slim.conv2d(reg_fc7_drop,
6,
[1, 1],
activation_fn=None,
scope='reg_fc8')
offsets = tf.squeeze(
reg_fc8, [1, 2],
name='reg_fc8/squeezed')
# Return the proposals
with tf.variable_scope('proposals'):
anchors = self.placeholders[self.PL_ANCHORS]
# Decode anchor regression offsets
with tf.variable_scope('decoding'):
regressed_anchors = anchor_encoder.offset_to_anchor(
anchors, offsets)
with tf.variable_scope('bev_projection'):
_, bev_proposal_boxes_norm = anchor_projector.project_to_bev(
regressed_anchors, self._bev_extents)
with tf.variable_scope('softmax'):
objectness_softmax = tf.nn.softmax(objectness)
with tf.variable_scope('nms'):
objectness_scores = objectness_softmax[:, 1]
# Do NMS on regressed anchors
top_indices = tf.image.non_max_suppression(
bev_proposal_boxes_norm, objectness_scores,
max_output_size=self._nms_size,
iou_threshold=self._nms_iou_thresh)
top_anchors = tf.gather(regressed_anchors, top_indices)
top_objectness_softmax = tf.gather(objectness_scores,
top_indices)
# top_offsets = tf.gather(offsets, top_indices)
# top_objectness = tf.gather(objectness, top_indices)
# Get mini batch
all_ious_gt = self.placeholders[self.PL_ANCHOR_IOUS]
all_offsets_gt = self.placeholders[self.PL_ANCHOR_OFFSETS]
all_classes_gt = self.placeholders[self.PL_ANCHOR_CLASSES]
with tf.variable_scope('mini_batch'):
mini_batch_utils = self.dataset.kitti_utils.mini_batch_utils
mini_batch_mask, _ = \
mini_batch_utils.sample_rpn_mini_batch(all_ious_gt)
# Ground Truth Tensors
with tf.variable_scope('one_hot_classes'):
# Anchor classification ground truth
# Object / Not Object
min_pos_iou = \
self.dataset.kitti_utils.mini_batch_utils.rpn_pos_iou_range[0]
objectness_classes_gt = tf.cast(
tf.greater_equal(all_ious_gt, min_pos_iou),
dtype=tf.int32)
objectness_gt = tf.one_hot(
objectness_classes_gt, depth=2,
on_value=1.0 - self._config.label_smoothing_epsilon,
off_value=self._config.label_smoothing_epsilon)
# Mask predictions for mini batch
with tf.variable_scope('prediction_mini_batch'):
objectness_masked = tf.boolean_mask(objectness, mini_batch_mask)
offsets_masked = tf.boolean_mask(offsets, mini_batch_mask)
with tf.variable_scope('ground_truth_mini_batch'):
objectness_gt_masked = tf.boolean_mask(
objectness_gt, mini_batch_mask)
offsets_gt_masked = tf.boolean_mask(all_offsets_gt,
mini_batch_mask)
# Specify the tensors to evaluate
predictions = dict()
# Temporary predictions for debugging
# predictions['anchor_ious'] = anchor_ious
# predictions['anchor_offsets'] = all_offsets_gt
if self._train_val_test in ['train', 'val']:
# All anchors
predictions[self.PRED_ANCHORS] = anchors
# Mini-batch masks
predictions[self.PRED_MB_MASK] = mini_batch_mask
# Mini-batch predictions
predictions[self.PRED_MB_OBJECTNESS] = objectness_masked
predictions[self.PRED_MB_OFFSETS] = offsets_masked
# Mini batch ground truth
predictions[self.PRED_MB_OFFSETS_GT] = offsets_gt_masked
predictions[self.PRED_MB_OBJECTNESS_GT] = objectness_gt_masked
# Proposals after nms
predictions[self.PRED_TOP_INDICES] = top_indices
predictions[self.PRED_TOP_ANCHORS] = top_anchors
predictions[
self.PRED_TOP_OBJECTNESS_SOFTMAX] = top_objectness_softmax
else:
# self._train_val_test == 'test'
predictions[self.PRED_TOP_ANCHORS] = top_anchors
predictions[
self.PRED_TOP_OBJECTNESS_SOFTMAX] = top_objectness_softmax
return predictions
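A minimal numpy sketch of the mini-batch sampling behaviour described above (the actual implementation is mini_batch_utils.sample_rpn_mini_batch and is written in TensorFlow); parameter names are illustrative:
import numpy as np

def sample_rpn_mini_batch_np(ious, pos_iou_range, neg_iou_range,
                             mini_batch_size=512):
    # Take up to mini_batch_size // 2 random positives and fill the
    # remainder with random negatives; no per-class balancing is done.
    pos_indices = np.where((ious >= pos_iou_range[0]) &
                           (ious <= pos_iou_range[1]))[0]
    neg_indices = np.where((ious >= neg_iou_range[0]) &
                           (ious < neg_iou_range[1]))[0]
    np.random.shuffle(pos_indices)
    pos_indices = pos_indices[:mini_batch_size // 2]
    np.random.shuffle(neg_indices)
    neg_indices = neg_indices[:mini_batch_size - len(pos_indices)]
    mask = np.zeros(len(ious), dtype=bool)
    mask[pos_indices] = True
    mask[neg_indices] = True
    return mask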
AVOD Model
The AVOD network takes the top-k anchor proposals from the first stage, computes the corresponding anchor projections onto the BEV and image feature maps, and applies the same crop_and_resize operation. The ROI features then go through fusion followed by n*(fc+fc_drop) layers to predict the class, the offsets and the angle vector (early fusion by default: the features are fused first and then fed through the following layers). After the predictions are produced, the ground truth is projected onto the BEV map, and the same strategy is used to build a mini-batch and to select the top anchors (NMS in BEV), producing the corresponding objectness, offset and angle losses. The mini-batch losses are used to train the model; the top anchors form the final predictions, and their loss does not appear to be used. Note that the offset loss is computed in a 3D box representation (the box_4c encoding proposed in the paper; a rough sketch of this encoding is given at the end of this section). The relevant code is as follows:
#avod_model.py:123 deleted code for summary
def build(self):
rpn_model = self._rpn_model
# Share the same prediction dict as RPN
prediction_dict = rpn_model.build()
top_anchors = prediction_dict[RpnModel.PRED_TOP_ANCHORS]
ground_plane = rpn_model.placeholders[RpnModel.PL_GROUND_PLANE]
class_labels = rpn_model.placeholders[RpnModel.PL_LABEL_CLASSES]
with tf.variable_scope('avod_projection'):
if self._config.expand_proposals_xz > 0.0:
expand_length = self._config.expand_proposals_xz
# Expand anchors along x and z
with tf.variable_scope('expand_xz'):
expanded_dim_x = top_anchors[:, 3] + expand_length
expanded_dim_z = top_anchors[:, 5] + expand_length
expanded_anchors = tf.stack([
top_anchors[:, 0],
top_anchors[:, 1],
top_anchors[:, 2],
expanded_dim_x,
top_anchors[:, 4],
expanded_dim_z
], axis=1)
avod_projection_in = expanded_anchors
else:
avod_projection_in = top_anchors
with tf.variable_scope('bev'):
# Project top anchors into bev and image spaces
bev_proposal_boxes, bev_proposal_boxes_norm = \
anchor_projector.project_to_bev(
avod_projection_in,
self.dataset.kitti_utils.bev_extents)
# Reorder projected boxes into [y1, x1, y2, x2]
bev_proposal_boxes_tf_order = \
anchor_projector.reorder_projected_boxes(
bev_proposal_boxes)
bev_proposal_boxes_norm_tf_order = \
anchor_projector.reorder_projected_boxes(
bev_proposal_boxes_norm)
with tf.variable_scope('img'):
image_shape = tf.cast(tf.shape(
rpn_model.placeholders[RpnModel.PL_IMG_INPUT])[0:2],
tf.float32)
img_proposal_boxes, img_proposal_boxes_norm = \
anchor_projector.tf_project_to_image_space(
avod_projection_in,
rpn_model.placeholders[RpnModel.PL_CALIB_P2],
image_shape)
# Only reorder the normalized img
img_proposal_boxes_norm_tf_order = \
anchor_projector.reorder_projected_boxes(
img_proposal_boxes_norm)
bev_feature_maps = rpn_model.bev_feature_maps
img_feature_maps = rpn_model.img_feature_maps
if not (self._path_drop_probabilities[0] ==
self._path_drop_probabilities[1] == 1.0):
with tf.variable_scope('avod_path_drop'):
img_mask = rpn_model.img_path_drop_mask
bev_mask = rpn_model.bev_path_drop_mask
img_feature_maps = tf.multiply(img_feature_maps,
img_mask)
bev_feature_maps = tf.multiply(bev_feature_maps,
bev_mask)
else:
bev_mask = tf.constant(1.0)
img_mask = tf.constant(1.0)
# ROI Pooling
with tf.variable_scope('avod_roi_pooling'):
def get_box_indices(boxes):
proposals_shape = boxes.get_shape().as_list()
if any(dim is None for dim in proposals_shape):
proposals_shape = tf.shape(boxes)
ones_mat = tf.ones(proposals_shape[:2], dtype=tf.int32)
multiplier = tf.expand_dims(
tf.range(start=0, limit=proposals_shape[0]), 1)
return tf.reshape(ones_mat * multiplier, [-1])
bev_boxes_norm_batches = tf.expand_dims(
bev_proposal_boxes_norm, axis=0)
# These should be all 0's since there is only 1 image
tf_box_indices = get_box_indices(bev_boxes_norm_batches)
# Do ROI Pooling on BEV
bev_rois = tf.image.crop_and_resize(
bev_feature_maps,
bev_proposal_boxes_norm_tf_order,
tf_box_indices,
self._proposal_roi_crop_size,
name='bev_rois')
# Do ROI Pooling on image
img_rois = tf.image.crop_and_resize(
img_feature_maps,
img_proposal_boxes_norm_tf_order,
tf_box_indices,
self._proposal_roi_crop_size,
name='img_rois')
# Fully connected layers (Box Predictor)
avod_layers_config = self.model_config.layers_config.avod_config
fc_output_layers = \
avod_fc_layers_builder.build(
layers_config=avod_layers_config,
input_rois=[bev_rois, img_rois],
input_weights=[bev_mask, img_mask],
num_final_classes=self._num_final_classes,
box_rep=self._box_rep,
top_anchors=top_anchors,
ground_plane=ground_plane,
is_training=self._is_training)
all_cls_logits = \
fc_output_layers[avod_fc_layers_builder.KEY_CLS_LOGITS]
all_offsets = fc_output_layers[avod_fc_layers_builder.KEY_OFFSETS]
# This may be None
all_angle_vectors = \
fc_output_layers.get(avod_fc_layers_builder.KEY_ANGLE_VECTORS)
with tf.variable_scope('softmax'):
all_cls_softmax = tf.nn.softmax(
all_cls_logits)
######################################################
# Subsample mini_batch for the loss function
######################################################
# Get the ground truth tensors
anchors_gt = rpn_model.placeholders[RpnModel.PL_LABEL_ANCHORS]
if self._box_rep in ['box_3d', 'box_4ca']:
boxes_3d_gt = rpn_model.placeholders[RpnModel.PL_LABEL_BOXES_3D]
orientations_gt = boxes_3d_gt[:, 6]
elif self._box_rep in ['box_8c', 'box_8co', 'box_4c']:
boxes_3d_gt = rpn_model.placeholders[RpnModel.PL_LABEL_BOXES_3D]
else:
raise NotImplementedError('Ground truth tensors not implemented')
# Project anchor_gts to 2D bev
with tf.variable_scope('avod_gt_projection'):
bev_anchor_boxes_gt, _ = anchor_projector.project_to_bev(
anchors_gt, self.dataset.kitti_utils.bev_extents)
bev_anchor_boxes_gt_tf_order = \
anchor_projector.reorder_projected_boxes(bev_anchor_boxes_gt)
with tf.variable_scope('avod_box_list'):
# Convert to box_list format
anchor_box_list_gt = box_list.BoxList(bev_anchor_boxes_gt_tf_order)
anchor_box_list = box_list.BoxList(bev_proposal_boxes_tf_order)
# Get the mini-batch mask, the class label indices and the indices of the matched ground truths
mb_mask, mb_class_label_indices, mb_gt_indices = \
self.sample_mini_batch(
anchor_box_list_gt=anchor_box_list_gt,
anchor_box_list=anchor_box_list,
class_labels=class_labels)
# Create classification one_hot vector
with tf.variable_scope('avod_one_hot_classes'):
mb_classification_gt = tf.one_hot(
mb_class_label_indices,
depth=self._num_final_classes,
on_value=1.0 - self._config.label_smoothing_epsilon,
off_value=(self._config.label_smoothing_epsilon /
self.dataset.num_classes))
# TODO: Don't create a mini batch in test mode
# Mask predictions
with tf.variable_scope('avod_apply_mb_mask'):
# Classification
mb_classifications_logits = tf.boolean_mask(
all_cls_logits, mb_mask)
mb_classifications_softmax = tf.boolean_mask(
all_cls_softmax, mb_mask)
# Offsets
mb_offsets = tf.boolean_mask(all_offsets, mb_mask)
# Angle Vectors
if all_angle_vectors is not None:
mb_angle_vectors = tf.boolean_mask(all_angle_vectors, mb_mask)
else:
mb_angle_vectors = None
# Encode anchor offsets
with tf.variable_scope('avod_encode_mb_anchors'):
mb_anchors = tf.boolean_mask(top_anchors, mb_mask)
if self._box_rep == 'box_3d':
# Gather corresponding ground truth anchors for each mb sample
mb_anchors_gt = tf.gather(anchors_gt, mb_gt_indices)
mb_offsets_gt = anchor_encoder.tf_anchor_to_offset(
mb_anchors, mb_anchors_gt)
# Gather corresponding ground truth orientation for each
# mb sample
mb_orientations_gt = tf.gather(orientations_gt,
mb_gt_indices)
elif self._box_rep in ['box_8c', 'box_8co']:
# Get boxes_3d ground truth mini-batch and convert to box_8c
mb_boxes_3d_gt = tf.gather(boxes_3d_gt, mb_gt_indices)
if self._box_rep == 'box_8c':
mb_boxes_8c_gt = \
box_8c_encoder.tf_box_3d_to_box_8c(mb_boxes_3d_gt)
elif self._box_rep == 'box_8co':
mb_boxes_8c_gt = \
box_8c_encoder.tf_box_3d_to_box_8co(mb_boxes_3d_gt)
# Convert proposals: anchors -> box_3d -> box8c
proposal_boxes_3d = \
box_3d_encoder.anchors_to_box_3d(top_anchors, fix_lw=True)
proposal_boxes_8c = \
box_8c_encoder.tf_box_3d_to_box_8c(proposal_boxes_3d)
# Get mini batch offsets
mb_boxes_8c = tf.boolean_mask(proposal_boxes_8c, mb_mask)
mb_offsets_gt = box_8c_encoder.tf_box_8c_to_offsets(
mb_boxes_8c, mb_boxes_8c_gt)
# Flatten the offsets to a (N x 24) vector
mb_offsets_gt = tf.reshape(mb_offsets_gt, [-1, 24])
elif self._box_rep in ['box_4c', 'box_4ca']:
# Get ground plane for box_4c conversion
ground_plane = self._rpn_model.placeholders[
self._rpn_model.PL_GROUND_PLANE]
# Convert gt boxes_3d -> box_4c
mb_boxes_3d_gt = tf.gather(boxes_3d_gt, mb_gt_indices)
mb_boxes_4c_gt = box_4c_encoder.tf_box_3d_to_box_4c(
mb_boxes_3d_gt, ground_plane)
# Convert proposals: anchors -> box_3d -> box_4c
proposal_boxes_3d = \
box_3d_encoder.anchors_to_box_3d(top_anchors, fix_lw=True)
proposal_boxes_4c = \
box_4c_encoder.tf_box_3d_to_box_4c(proposal_boxes_3d,
ground_plane)
# Get mini batch
mb_boxes_4c = tf.boolean_mask(proposal_boxes_4c, mb_mask)
mb_offsets_gt = box_4c_encoder.tf_box_4c_to_offsets(
mb_boxes_4c, mb_boxes_4c_gt)
if self._box_rep == 'box_4ca':
# Gather corresponding ground truth orientation for each
# mb sample
mb_orientations_gt = tf.gather(orientations_gt,
mb_gt_indices)
else:
raise NotImplementedError(
'Anchor encoding not implemented for', self._box_rep)
######################################################
# Final Predictions
######################################################
# Get orientations from angle vectors
if all_angle_vectors is not None:
with tf.variable_scope('avod_orientation'):
all_orientations = \
orientation_encoder.tf_angle_vector_to_orientation(
all_angle_vectors)
# Apply offsets to regress proposals
with tf.variable_scope('avod_regression'):
if self._box_rep == 'box_3d':
prediction_anchors = \
anchor_encoder.offset_to_anchor(top_anchors,
all_offsets)
elif self._box_rep in ['box_8c', 'box_8co']:
# Reshape the 24-dim regressed offsets to (N x 3 x 8)
reshaped_offsets = tf.reshape(all_offsets,
[-1, 3, 8])
# Given the offsets, get the boxes_8c
prediction_boxes_8c = \
box_8c_encoder.tf_offsets_to_box_8c(proposal_boxes_8c,
reshaped_offsets)
# Convert corners back to box3D
prediction_boxes_3d = \
box_8c_encoder.box_8c_to_box_3d(prediction_boxes_8c)
# Convert the box_3d to anchor format for nms
prediction_anchors = \
box_3d_encoder.tf_box_3d_to_anchor(prediction_boxes_3d)
elif self._box_rep in ['box_4c', 'box_4ca']:
# Convert predictions box_4c -> box_3d
prediction_boxes_4c = \
box_4c_encoder.tf_offsets_to_box_4c(proposal_boxes_4c,
all_offsets)
prediction_boxes_3d = \
box_4c_encoder.tf_box_4c_to_box_3d(prediction_boxes_4c,
ground_plane)
# Convert to anchor format for nms
prediction_anchors = \
box_3d_encoder.tf_box_3d_to_anchor(prediction_boxes_3d)
else:
raise NotImplementedError('Regression not implemented for',
self._box_rep)
# Apply Non-oriented NMS in BEV
with tf.variable_scope('avod_nms'):
bev_extents = self.dataset.kitti_utils.bev_extents
with tf.variable_scope('bev_projection'):
# Project predictions into BEV
avod_bev_boxes, _ = anchor_projector.project_to_bev(
prediction_anchors, bev_extents)
avod_bev_boxes_tf_order = \
anchor_projector.reorder_projected_boxes(
avod_bev_boxes)
# Get top score from second column onward
all_top_scores = tf.reduce_max(all_cls_logits[:, 1:], axis=1)
# Apply NMS in BEV
nms_indices = tf.image.non_max_suppression(
avod_bev_boxes_tf_order,
all_top_scores,
max_output_size=self._nms_size,
iou_threshold=self._nms_iou_threshold)
# Gather predictions from NMS indices
top_classification_logits = tf.gather(all_cls_logits,
nms_indices)
top_classification_softmax = tf.gather(all_cls_softmax,
nms_indices)
top_prediction_anchors = tf.gather(prediction_anchors,
nms_indices)
if self._box_rep == 'box_3d':
top_orientations = tf.gather(
all_orientations, nms_indices)
elif self._box_rep in ['box_8c', 'box_8co']:
top_prediction_boxes_3d = tf.gather(
prediction_boxes_3d, nms_indices)
top_prediction_boxes_8c = tf.gather(
prediction_boxes_8c, nms_indices)
elif self._box_rep == 'box_4c':
top_prediction_boxes_3d = tf.gather(
prediction_boxes_3d, nms_indices)
top_prediction_boxes_4c = tf.gather(
prediction_boxes_4c, nms_indices)
elif self._box_rep == 'box_4ca':
top_prediction_boxes_3d = tf.gather(
prediction_boxes_3d, nms_indices)
top_prediction_boxes_4c = tf.gather(
prediction_boxes_4c, nms_indices)
top_orientations = tf.gather(
all_orientations, nms_indices)
else:
raise NotImplementedError('NMS gather not implemented for',
self._box_rep)
if self._train_val_test in ['train', 'val']:
# Additional entries are added to the shared prediction_dict
# Mini batch predictions
prediction_dict[self.PRED_MB_CLASSIFICATION_LOGITS] = \
mb_classifications_logits
prediction_dict[self.PRED_MB_CLASSIFICATION_SOFTMAX] = \
mb_classifications_softmax
prediction_dict[self.PRED_MB_OFFSETS] = mb_offsets
# Mini batch ground truth
prediction_dict[self.PRED_MB_CLASSIFICATIONS_GT] = \
mb_classification_gt
prediction_dict[self.PRED_MB_OFFSETS_GT] = mb_offsets_gt
# Top NMS predictions
prediction_dict[self.PRED_TOP_CLASSIFICATION_LOGITS] = \
top_classification_logits
prediction_dict[self.PRED_TOP_CLASSIFICATION_SOFTMAX] = \
top_classification_softmax
prediction_dict[self.PRED_TOP_PREDICTION_ANCHORS] = \
top_prediction_anchors
# Mini batch predictions (for debugging)
prediction_dict[self.PRED_MB_MASK] = mb_mask
# prediction_dict[self.PRED_MB_POS_MASK] = mb_pos_mask
prediction_dict[self.PRED_MB_CLASS_INDICES_GT] = \
mb_class_label_indices
# All predictions (for debugging)
prediction_dict[self.PRED_ALL_CLASSIFICATIONS] = \
all_cls_logits
prediction_dict[self.PRED_ALL_OFFSETS] = all_offsets
# Path drop masks (for debugging)
prediction_dict['bev_mask'] = bev_mask
prediction_dict['img_mask'] = img_mask
else:
# self._train_val_test == 'test'
prediction_dict[self.PRED_TOP_CLASSIFICATION_SOFTMAX] = \
top_classification_softmax
prediction_dict[self.PRED_TOP_PREDICTION_ANCHORS] = \
top_prediction_anchors
if self._box_rep == 'box_3d':
prediction_dict[self.PRED_MB_ANCHORS_GT] = mb_anchors_gt
prediction_dict[self.PRED_MB_ORIENTATIONS_GT] = mb_orientations_gt
prediction_dict[self.PRED_MB_ANGLE_VECTORS] = mb_angle_vectors
prediction_dict[self.PRED_TOP_ORIENTATIONS] = top_orientations
# For debugging
prediction_dict[self.PRED_ALL_ANGLE_VECTORS] = all_angle_vectors
elif self._box_rep in ['box_8c', 'box_8co']:
prediction_dict[self.PRED_TOP_PREDICTION_BOXES_3D] = \
top_prediction_boxes_3d
# Store the corners before converting for visualization purposes
prediction_dict[self.PRED_TOP_BOXES_8C] = top_prediction_boxes_8c
elif self._box_rep == 'box_4c':
prediction_dict[self.PRED_TOP_PREDICTION_BOXES_3D] = \
top_prediction_boxes_3d
prediction_dict[self.PRED_TOP_BOXES_4C] = top_prediction_boxes_4c
elif self._box_rep == 'box_4ca':
if self._train_val_test in ['train', 'val']:
prediction_dict[self.PRED_MB_ORIENTATIONS_GT] = \
mb_orientations_gt
prediction_dict[self.PRED_MB_ANGLE_VECTORS] = mb_angle_vectors
prediction_dict[self.PRED_TOP_PREDICTION_BOXES_3D] = \
top_prediction_boxes_3d
prediction_dict[self.PRED_TOP_BOXES_4C] = top_prediction_boxes_4c
prediction_dict[self.PRED_TOP_ORIENTATIONS] = top_orientations
else:
raise NotImplementedError('Prediction dict not implemented for',
self._box_rep)
# prediction_dict[self.PRED_MAX_IOUS] = max_ious
# prediction_dict[self.PRED_ALL_IOUS] = all_ious
return prediction_dict
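As a rough illustration of the box_4c representation mentioned above (see the paper and box_4c_encoder for the exact corner ordering, sign conventions and offset definition), a 3D box [x, y, z, l, w, h, ry] can be encoded as its four BEV corner points plus two heights relative to the ground plane:
import numpy as np

def box_3d_to_box_4c_sketch(box_3d, ground_plane):
    # Illustrative only: encode [x, y, z, l, w, h, ry] as
    # [x1..x4, z1..z4, h1, h2], with h1/h2 measured from the ground plane.
    x, y, z, l, w, h, ry = box_3d
    a, b, c, d = ground_plane
    # Four corners of the box footprint in the x-z (BEV) plane
    corners = np.array([[ l / 2.0,  w / 2.0],
                        [ l / 2.0, -w / 2.0],
                        [-l / 2.0, -w / 2.0],
                        [-l / 2.0,  w / 2.0]])
    rot = np.array([[np.cos(ry), -np.sin(ry)],
                    [np.sin(ry),  np.cos(ry)]])   # rotation about the y axis
    corners = corners @ rot + np.array([x, z])
    # Ground height below the box center, then the two face heights above it
    ground_y = -(a * x + c * z + d) / b
    h1 = ground_y - y      # bottom face height above the ground plane
    h2 = h1 + h            # top face height above the ground plane
    return np.concatenate([corners[:, 0], corners[:, 1], [h1, h2]])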