忘情摆渡

object_detection source code walkthrough: box_list

A walkthrough of the models.research.object_detection source code: `core.box_list`

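box_list is the utility library in the Object Detection project for managing bounding boxes as a whole. The walkthrough below is split into two main parts.

box_list.BoxList: the class that manages boxes. Each box must hold four values, [y_min, x_min, y_max, x_max], which are the coordinates of the top-left and bottom-right corners. The coordinates may be either relative (normalized) or absolute.
box_list_ops.py: all the ops that operate on boxes, including merging, splitting, pruning, computing IOU, and so on.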

BoxList

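The most important attribute of this class is data, a dict that can hold many fields. The one field it must always have is boxes, which stores the box coordinates as a rank-2 tensor whose last dimension has 4 entries, i.e. shape [N, 4].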

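The key methods are introduced below.

Before going through the methods themselves, here is a snippet that appears again and again in this code.

y_min, x_min, y_max, x_max = tf.split(
    value=boxlist.get(), num_or_size_splits=4, axis=1)

For details on tf.split, see the official TensorFlow documentation. Here it splits the [N, 4] boxes tensor along axis 1 into four [N, 1] columns.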

1. Initialization

def __init__(self, boxes):
    """Constructs box collection.

    Args:
      boxes: a tensor of shape [N, 4] representing box corners

    Raises:
      ValueError: if invalid dimensions for bbox data or if bbox data is not in float32 format.
    """
    if len(boxes.get_shape()) != 2 or boxes.get_shape()[-1] != 4:
        raise ValueError('Invalid dimensions for box data.')
    if boxes.dtype != tf.float32:
        raise ValueError('Invalid tensor type: should be tf.float32')
    self.data = {'boxes': boxes}

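This part is fairly easy to follow: it first checks the shape and dtype of the input boxes, then builds the data attribute.

2. get_all_fields

This method returns the names of all fields in data. Besides the boxes field, a BoxList typically also maintains a scores field, which holds the confidence that each box contains an object.

Because data is a dict, the implementation is trivial: it simply returns self.data.keys().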

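3. get, set

The class has two kinds of getters and setters.

get() and set(): read and write the boxes data of the current object.
get_field() and set_field(): read and write the data of one specific field. The former pair is just a special case of the latter.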
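The relationship between the two pairs can be sketched in plain Python. MiniBoxList below is a hypothetical, heavily simplified stand-in (lists instead of tensors, most validation omitted), not the real class; the method names follow the ones used in this article.

```python
class MiniBoxList:
    """Hypothetical plain-Python stand-in for BoxList (no TensorFlow)."""

    def __init__(self, boxes):
        # Every box must have exactly 4 values: [y_min, x_min, y_max, x_max].
        if any(len(box) != 4 for box in boxes):
            raise ValueError('Invalid dimensions for box data.')
        self.data = {'boxes': boxes}

    def has_field(self, field):
        return field in self.data

    def add_field(self, field, value):
        self.data[field] = value

    def get_field(self, field):
        return self.data[field]

    def get(self):
        # get() is get_field() specialised to the 'boxes' field.
        return self.get_field('boxes')

    def set(self, boxes):
        self.data['boxes'] = boxes

    def get_all_fields(self):
        return list(self.data.keys())


boxlist = MiniBoxList([[0.0, 0.0, 1.0, 1.0], [0.2, 0.2, 0.8, 0.6]])
boxlist.add_field('scores', [0.9, 0.4])
fields = boxlist.get_all_fields()  # ['boxes', 'scores']
```

The point of the sketch is only that boxes is an ordinary entry of data that happens to get dedicated accessors.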

4. get_center_coordinates_and_sizes

def get_center_coordinates_and_sizes(self, scope=None):
    """Computes the center coordinates, height and width of the boxes.

    Args:
      scope: name scope of the function.

    Returns:
      a list of 4 1-D tensors [ycenter, xcenter, height, width].
    """
    with tf.name_scope(scope, 'get_center_coordinates_and_sizes'):
        box_corners = self.get()
        ymin, xmin, ymax, xmax = tf.unstack(tf.transpose(box_corners))
        width = xmax - xmin
        height = ymax - ymin
        ycenter = ymin + height / 2.
        xcenter = xmin + width / 2.
        return [ycenter, xcenter, height, width]

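This method returns the center and the height and width of each box. Boxes are stored in corner form, [y_min, x_min, y_max, x_max], so height = y_max - y_min, width = x_max - x_min, and the center is the min corner plus half the size.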
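As a quick numeric check, here is the same arithmetic for a single box in plain Python. The function name center_and_size is made up for this sketch.

```python
def center_and_size(box):
    """box is [y_min, x_min, y_max, x_max]; returns [ycenter, xcenter, height, width]."""
    y_min, x_min, y_max, x_max = box
    height = y_max - y_min
    width = x_max - x_min
    return [y_min + height / 2.0, x_min + width / 2.0, height, width]


result = center_and_size([10.0, 20.0, 30.0, 60.0])
# result == [20.0, 40.0, 20.0, 40.0]
```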

5. transpose_coordinates

def transpose_coordinates(self, scope=None):
    """Transpose the coordinate representation in a boxlist.

    Args:
      scope: name scope of the function.
    """
    with tf.name_scope(scope, 'transpose_coordinates'):
        y_min, x_min, y_max, x_max = tf.split(
            value=self.get(), num_or_size_splits=4, axis=1)
        self.set(tf.concat([x_min, y_min, x_max, y_max], 1))

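This swaps the coordinate order of each corner from (y, x) to (x, y), so [y_min, x_min, y_max, x_max] becomes [x_min, y_min, x_max, y_max]. The tf.split here plays the same role as the tf.unstack(tf.transpose(...)) above, separating the four coordinates. The only difference is the shape of the pieces: tf.split yields four [N, 1] columns, while tf.unstack(tf.transpose(...)) yields four rank-1 tensors of shape [N].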
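The two ways of separating the coordinates can be mimicked with nested lists. This is only a plain-Python analogy for the TensorFlow ops, with made-up variable names; it shows that the pieces carry the same numbers but differ in shape ([N, 1] columns versus length-N vectors).

```python
boxes = [[0.0, 1.0, 2.0, 3.0],
         [4.0, 5.0, 6.0, 7.0]]  # shape [N=2, 4]

# Like tf.split(boxes, 4, axis=1): four columns, each of shape [N, 1].
split_cols = [[[row[i]] for row in boxes] for i in range(4)]

# Like tf.unstack(tf.transpose(boxes)): four vectors, each of shape [N].
unstacked = [list(col) for col in zip(*boxes)]

y_min, x_min, y_max, x_max = split_cols
# Like tf.concat([x_min, y_min, x_max, y_max], 1): reorder to (x, y).
swapped = [a + b + c + d for a, b, c, d in zip(x_min, y_min, x_max, y_max)]
# swapped == [[1.0, 0.0, 3.0, 2.0], [5.0, 4.0, 7.0, 6.0]]
```

Keeping the [N, 1] shape is what lets the columns be concatenated back along axis 1 directly.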

6. as_tensor_dict

def as_tensor_dict(self, fields=None):
    """Retrieves specified fields as a dictionary of tensors.

    Args:
      fields: (optional) list of fields to return in the dictionary.
        If None (default), all fields are returned.

    Returns:
      tensor_dict: A dictionary of tensors specified by fields.

    Raises:
      ValueError: if specified field is not contained in boxlist.
    """
    tensor_dict = {}
    if fields is None:
        fields = self.get_all_fields()
    for field in fields:
        if not self.has_field(field):
            raise ValueError('boxlist must contain all specified fields')
        tensor_dict[field] = self.get_field(field)
    return tensor_dict

This returns the data of the specified fields as a dict. If fields=None, the result is equivalent to returning data itself.

box_list_ops.py

This module contains the commonly used ops for processing boxes. They are analyzed one by one below.

1. area: compute the area of each box.

def area(boxlist, scope=None):
    """Computes area of boxes.

    Args:
      boxlist: BoxList holding N boxes
      scope: name scope.

    Returns:
      a tensor with shape [N] representing box areas.
    """
    with tf.name_scope(scope, 'Area'):
        y_min, x_min, y_max, x_max = tf.split(
            value=boxlist.get(), num_or_size_splits=4, axis=1)
        return tf.squeeze((y_max - y_min) * (x_max - x_min), [1])

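This one is easy to follow: the area is height × width, where each side is the maximum coordinate minus the minimum coordinate along that axis. The same subtraction is what the height_width() function, used later by prune_small_boxes, returns.

2. scale: rescale the boxes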
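For one box at a time the computation reduces to the following plain-Python sketch (box_area is a name chosen for this example). The second box is degenerate, which matters later when zero-area boxes are filtered out.

```python
def box_area(box):
    """box is [y_min, x_min, y_max, x_max]."""
    y_min, x_min, y_max, x_max = box
    return (y_max - y_min) * (x_max - x_min)


areas = [box_area(b) for b in [[0.0, 0.0, 2.0, 3.0], [1.0, 1.0, 1.0, 5.0]]]
# areas == [6.0, 0.0]; the second box has zero height
```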

def scale(boxlist, y_scale, x_scale, scope=None):
    """scale box coordinates in x and y dimensions.

    Args:
      boxlist: BoxList holding N boxes
      y_scale: (float) scalar tensor
      x_scale: (float) scalar tensor
      scope: name scope.

    Returns:
      boxlist: BoxList holding N boxes
    """
    with tf.name_scope(scope, 'Scale'):
        y_scale = tf.cast(y_scale, tf.float32)
        x_scale = tf.cast(x_scale, tf.float32)
        y_min, x_min, y_max, x_max = tf.split(
            value=boxlist.get(), num_or_size_splits=4, axis=1)
        y_min = y_scale * y_min
        y_max = y_scale * y_max
        x_min = x_scale * x_min
        x_max = x_scale * x_max
        scaled_boxlist = box_list.BoxList(
            tf.concat([y_min, x_min, y_max, x_max], 1))
        return _copy_extra_fields(scaled_boxlist, boxlist)

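This rescales the y and x coordinates separately and builds a new BoxList from the result. At the end, _copy_extra_fields copies any other fields over from the original boxlist.

3. clip_to_window: clip the boxes to a given window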
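A typical use is converting normalized coordinates to pixel coordinates. The sketch below does this for one box in plain Python; the 480 x 640 image size and the function name scale_box are assumptions made for illustration.

```python
def scale_box(box, y_scale, x_scale):
    """Scale [y_min, x_min, y_max, x_max] separately along y and x."""
    y_min, x_min, y_max, x_max = box
    return [y_scale * y_min, x_scale * x_min, y_scale * y_max, x_scale * x_max]


# Normalized box on a hypothetical image of height 480 and width 640.
absolute = scale_box([0.25, 0.5, 0.75, 1.0], 480, 640)
# absolute == [120.0, 320.0, 360.0, 640.0]
```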

def clip_to_window(boxlist, window, filter_nonoverlapping=True, scope=None):
    """Clip bounding boxes to a window.

    This op clips any input bounding boxes (represented by bounding box
    corners) to a window, optionally filtering out boxes that do not
    overlap at all with the window.

    Args:
      boxlist: BoxList holding M_in boxes
      window: a tensor of shape [4] representing the [y_min, x_min, y_max, x_max]
        window to which the op should clip boxes.
      filter_nonoverlapping: whether to filter out boxes that do not overlap at
        all with the window.
      scope: name scope.

    Returns:
      a BoxList holding M_out boxes where M_out <= M_in
    """
    with tf.name_scope(scope, 'ClipToWindow'):
        y_min, x_min, y_max, x_max = tf.split(
            value=boxlist.get(), num_or_size_splits=4, axis=1)
        win_y_min, win_x_min, win_y_max, win_x_max = tf.unstack(window)
        y_min_clipped = tf.maximum(tf.minimum(y_min, win_y_max), win_y_min)
        y_max_clipped = tf.maximum(tf.minimum(y_max, win_y_max), win_y_min)
        x_min_clipped = tf.maximum(tf.minimum(x_min, win_x_max), win_x_min)
        x_max_clipped = tf.maximum(tf.minimum(x_max, win_x_max), win_x_min)
        clipped = box_list.BoxList(
            tf.concat([y_min_clipped, x_min_clipped, y_max_clipped, x_max_clipped],
                      1))
        clipped = _copy_extra_fields(clipped, boxlist)
        if filter_nonoverlapping:
            areas = area(clipped)
            nonzero_area_indices = tf.cast(
                tf.reshape(tf.where(tf.greater(areas, 0.0)), [-1]), tf.int32)
            clipped = gather(clipped, nonzero_area_indices)
        return clipped

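The clipping itself is simple and needs no further explanation. The interesting part is the filter_nonoverlapping parameter, which controls whether boxes that do not overlap the window at all are dropped. The implementation follows directly from the clipping: such boxes end up with zero area, so they are removed in three steps:

First, compute the area of each clipped box.

Get the indices of the boxes whose area is greater than 0.

gather the boxes at those indices into a new BoxList.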
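The clip-then-filter logic can be traced on three boxes in plain Python. The helper names and the sample window are made up for this sketch; the min/max nesting mirrors the tf.maximum(tf.minimum(...)) calls in the listing.

```python
def clip(box, window):
    y_min, x_min, y_max, x_max = box
    win_y_min, win_x_min, win_y_max, win_x_max = window
    return [max(min(y_min, win_y_max), win_y_min),
            max(min(x_min, win_x_max), win_x_min),
            max(min(y_max, win_y_max), win_y_min),
            max(min(x_max, win_x_max), win_x_min)]


def box_area(box):
    return (box[2] - box[0]) * (box[3] - box[1])


window = [0.0, 0.0, 10.0, 10.0]
boxes = [[2.0, 2.0, 5.0, 5.0],      # fully inside
         [8.0, 8.0, 12.0, 15.0],    # partially outside
         [11.0, 11.0, 13.0, 13.0]]  # completely outside

clipped = [clip(b, window) for b in boxes]
# Step 1 and 2: areas, then indices of boxes with area > 0.
keep = [i for i, b in enumerate(clipped) if box_area(b) > 0.0]
# Step 3: gather the surviving boxes.
filtered = [clipped[i] for i in keep]
# keep == [0, 1]; the third box collapsed to [10.0, 10.0, 10.0, 10.0]
```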

4. prune_outside_window, prune_completely_outside_window: remove boxes that fall (completely) outside the window

def prune_outside_window(boxlist, window, scope=None):
    """Prunes bounding boxes that fall outside a given window.

    This function prunes bounding boxes that even partially fall outside the given
    window. See also clip_to_window which only prunes bounding boxes that fall
    completely outside the window, and clips any bounding boxes that partially
    overflow.

    Args:
      boxlist: a BoxList holding M_in boxes.
      window: a float tensor of shape [4] representing [ymin, xmin, ymax, xmax]
        of the window
      scope: name scope.

    Returns:
      pruned_corners: a tensor with shape [M_out, 4] where M_out <= M_in
      valid_indices: a tensor with shape [M_out] indexing the valid bounding boxes
       in the input tensor.
    """
    with tf.name_scope(scope, 'PruneOutsideWindow'):
        y_min, x_min, y_max, x_max = tf.split(
            value=boxlist.get(), num_or_size_splits=4, axis=1)
        win_y_min, win_x_min, win_y_max, win_x_max = tf.unstack(window)
        ...
        valid_indices = tf.reshape(
            tf.where(tf.logical_not(tf.reduce_any(coordinate_violations, 1))), [-1])
        return gather(boxlist, valid_indices), valid_indices

The two functions are alike; the only difference is the condition checked at the end.

prune_outside_window

coordinate_violations = tf.concat([
    tf.less(y_min, win_y_min), tf.less(x_min, win_x_min),
    tf.greater(y_max, win_y_max), tf.greater(x_max, win_x_max)
], 1)

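A box counts as outside the window as soon as either of its two corner points is not within the window's two corners. A violation on any single coordinate therefore disqualifies the box, hence reduce_any.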

prune_completely_outside_window

coordinate_violations = tf.concat([
    tf.greater_equal(y_min, win_y_max), tf.greater_equal(x_min, win_x_max),
    tf.less_equal(y_max, win_y_min), tf.less_equal(x_max, win_x_min)
], 1)

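A box is completely outside the window as soon as any one of its coordinates lies beyond the opposite window edge, for example y_min >= win_y_max. Again a single violation is enough, so the same reduce_any applies.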
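The difference between the two conditions shows up on a box that only partially overlaps the window. The following plain-Python sketch applies both tests to three sample boxes; the function names and the data are hypothetical.

```python
def sticks_out(box, window):
    """Condition used by prune_outside_window."""
    y_min, x_min, y_max, x_max = box
    wy_min, wx_min, wy_max, wx_max = window
    return any([y_min < wy_min, x_min < wx_min,
                y_max > wy_max, x_max > wx_max])


def completely_outside(box, window):
    """Condition used by prune_completely_outside_window."""
    y_min, x_min, y_max, x_max = box
    wy_min, wx_min, wy_max, wx_max = window
    return any([y_min >= wy_max, x_min >= wx_max,
                y_max <= wy_min, x_max <= wx_min])


window = [0.0, 0.0, 10.0, 10.0]
boxes = [[2.0, 2.0, 5.0, 5.0],      # fully inside
         [8.0, 8.0, 12.0, 15.0],    # partially outside
         [11.0, 11.0, 13.0, 13.0]]  # completely outside

kept_strict = [i for i, b in enumerate(boxes) if not sticks_out(b, window)]
kept_loose = [i for i, b in enumerate(boxes) if not completely_outside(b, window)]
# kept_strict == [0], kept_loose == [0, 1]
```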

5. intersection, matched_intersection: intersection area of two boxlists

def intersection(boxlist1, boxlist2, scope=None):
    """Compute pairwise intersection areas between boxes.

    Args:
      boxlist1: BoxList holding N boxes
      boxlist2: BoxList holding M boxes
      scope: name scope.

    Returns:
      a tensor with shape [N, M] representing pairwise intersections
    """
    with tf.name_scope(scope, 'Intersection'):
        y_min1, x_min1, y_max1, x_max1 = tf.split(
            value=boxlist1.get(), num_or_size_splits=4, axis=1)
        y_min2, x_min2, y_max2, x_max2 = tf.split(
            value=boxlist2.get(), num_or_size_splits=4, axis=1)
        all_pairs_min_ymax = tf.minimum(y_max1, tf.transpose(y_max2))
        all_pairs_max_ymin = tf.maximum(y_min1, tf.transpose(y_min2))
        intersect_heights = tf.maximum(0.0, all_pairs_min_ymax - all_pairs_max_ymin)
        all_pairs_min_xmax = tf.minimum(x_max1, tf.transpose(x_max2))
        all_pairs_max_xmin = tf.maximum(x_min1, tf.transpose(x_min2))
        intersect_widths = tf.maximum(0.0, all_pairs_min_xmax - all_pairs_max_xmin)
        return intersect_heights * intersect_widths

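This intersection is Cartesian-product-like: it computes the intersection area for every pair of boxes, one from each list. One point worth spelling out concerns tf.minimum and tf.maximum: their two arguments do not need to have identical shapes, because both ops support broadcasting. Without knowing about broadcasting, this code can be puzzling.

In tf.minimum(y_max1, tf.transpose(y_max2)), y_max2 is transposed, so the two arguments have shapes [N, 1] and [1, M] and are no longer the same. With broadcasting, every element of the first is compared against every element of the second, and the result has shape [N, M].

By comparison, matched_intersection is simpler: it computes the intersection area only between boxes at corresponding positions.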
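What broadcasting produces here is the same as a nested loop over both lists. The plain-Python sketch below spells that out and contrasts it with the matched (position-by-position) variant; intersect_area and the sample boxes are made up for this example.

```python
def intersect_area(box_a, box_b):
    """Intersection area of two [y_min, x_min, y_max, x_max] boxes."""
    height = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    width = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    return height * width


boxes1 = [[0.0, 0.0, 2.0, 2.0], [1.0, 1.0, 3.0, 3.0]]            # N = 2
boxes2 = [[0.0, 0.0, 1.0, 1.0], [1.0, 1.0, 2.0, 2.0],
          [5.0, 5.0, 6.0, 6.0]]                                  # M = 3

# Pairwise: an [N, M] table, one entry per (box1, box2) pair.
pairwise = [[intersect_area(a, b) for b in boxes2] for a in boxes1]
# pairwise == [[1.0, 1.0, 0.0], [0.0, 1.0, 0.0]]

# Matched: both lists must have the same length; only the "diagonal" is computed.
matched = [intersect_area(a, b) for a, b in zip(boxes1, boxes2[:2])]
# matched == [1.0, 1.0]
```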

6. iou, matched_iou

def iou(boxlist1, boxlist2, scope=None):
    """Computes pairwise intersection-over-union between box collections.

    Args:
      boxlist1: BoxList holding N boxes
      boxlist2: BoxList holding M boxes
      scope: name scope.

    Returns:
      a tensor with shape [N, M] representing pairwise iou scores.
    """
    with tf.name_scope(scope, 'IOU'):
        intersections = intersection(boxlist1, boxlist2)
        areas1 = area(boxlist1)
        areas2 = area(boxlist2)
        unions = (
                tf.expand_dims(areas1, 1) + tf.expand_dims(areas2, 0) - intersections)
        return tf.where(
            tf.equal(intersections, 0.0),
            tf.zeros_like(intersections), tf.truediv(intersections, unions))

This computes the pairwise IOU of two boxlists. The key step is computing the union area of each pair of boxes.

unions = (tf.expand_dims(areas1, 1) + tf.expand_dims(areas2, 0) - intersections)

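That is, areas1 and areas2 each gain one axis, giving an [N, 1] and a [1, M] matrix. The broadcast addition then produces an [N, M] matrix of summed areas, and subtracting the intersection area yields the union area.

matched_iou is simpler: it just calls the corresponding matched_intersection.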
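For a single pair of boxes the formula is intersection / (area1 + area2 - intersection). The plain-Python sketch below, with hypothetical helper names, also mirrors the tf.where guard that returns 0 when the boxes do not intersect, which avoids a 0 / 0 division for degenerate boxes.

```python
def box_area(box):
    return (box[2] - box[0]) * (box[3] - box[1])


def intersect_area(box_a, box_b):
    height = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    width = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    return height * width


def pair_iou(box_a, box_b):
    inter = intersect_area(box_a, box_b)
    if inter == 0.0:
        return 0.0  # same role as the tf.where(tf.equal(intersections, 0.0), ...) guard
    union = box_area(box_a) + box_area(box_b) - inter
    return inter / union


box_a = [0.0, 0.0, 2.0, 2.0]
box_b = [1.0, 1.0, 3.0, 3.0]
overlap = pair_iou(box_a, box_b)  # 1 / (4 + 4 - 1) = 1/7
```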

7. ioa: compute the ratio of the intersection area to the area of box2.

def ioa(boxlist1, boxlist2, scope=None):
    """Computes pairwise intersection-over-area between box collections.

    intersection-over-area (IOA) between two boxes box1 and box2 is defined as
    their intersection area over box2's area. Note that ioa is not symmetric,
    that is, ioa(box1, box2) != ioa(box2, box1).

    Args:
      boxlist1: BoxList holding N boxes
      boxlist2: BoxList holding M boxes
      scope: name scope.

    Returns:
      a tensor with shape [N, M] representing pairwise ioa scores.
    """
    with tf.name_scope(scope, 'IOA'):
        intersections = intersection(boxlist1, boxlist2)
        areas = tf.expand_dims(area(boxlist2), 0)
        return tf.truediv(intersections, areas)

8. prune_non_overlapping_boxes: prune boxes that do not overlap

def prune_non_overlapping_boxes(
        boxlist1, boxlist2, min_overlap=0.0, scope=None):
    """Prunes the boxes in boxlist1 that overlap less than thresh with boxlist2.

    For each box in boxlist1, we want its IOA to be more than minoverlap with
    at least one of the boxes in boxlist2. If it does not, we remove it.

    Args:
      boxlist1: BoxList holding N boxes.
      boxlist2: BoxList holding M boxes.
      min_overlap: Minimum required overlap between boxes, to count them as
                  overlapping.
      scope: name scope.

    Returns:
      new_boxlist1: A pruned boxlist with size [N', 4].
      keep_inds: A tensor with shape [N'] indexing kept bounding boxes in the
        first input BoxList `boxlist1`.
    """
    with tf.name_scope(scope, 'PruneNonOverlappingBoxes'):
        ioa_ = ioa(boxlist2, boxlist1)  # [M, N] tensor
        ioa_ = tf.reduce_max(ioa_, reduction_indices=[0])  # [N] tensor
        keep_bool = tf.greater_equal(ioa_, tf.constant(min_overlap))
        keep_inds = tf.squeeze(tf.where(keep_bool), squeeze_dims=[1])
        new_boxlist1 = gather(boxlist1, keep_inds)
        return new_boxlist1, keep_inds

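The min_overlap parameter controls the required overlap ratio. Look closely at the ioa call: because the arguments are passed as ioa(boxlist2, boxlist1), it computes the intersection area as a fraction of the area of the boxes in boxlist1.

9. prune_small_boxes: remove boxes with a short side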
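Both the asymmetry of IOA and the argument order used for pruning can be checked on a few boxes. This is a plain-Python sketch with hypothetical names; it assumes every box has nonzero area, so the division is always defined.

```python
def box_area(box):
    return (box[2] - box[0]) * (box[3] - box[1])


def intersect_area(box_a, box_b):
    height = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    width = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    return height * width


def pair_ioa(box_a, box_b):
    """Intersection over the area of the SECOND box."""
    return intersect_area(box_a, box_b) / box_area(box_b)


def prune_non_overlapping(boxes1, boxes2, min_overlap=0.0):
    """Indices of boxes1 whose best IOA against boxes2 reaches min_overlap."""
    keep = []
    for i, box1 in enumerate(boxes1):
        # ioa(boxlist2, boxlist1): normalise by the area of the boxes1 entry.
        best = max(pair_ioa(box2, box1) for box2 in boxes2)
        if best >= min_overlap:
            keep.append(i)
    return keep


big, small = [0.0, 0.0, 4.0, 4.0], [1.0, 1.0, 2.0, 2.0]
ioa_big_small = pair_ioa(big, small)   # 1 / 1  = 1.0
ioa_small_big = pair_ioa(small, big)   # 1 / 16 = 0.0625

kept = prune_non_overlapping(
    [[0.0, 0.0, 2.0, 2.0], [10.0, 10.0, 12.0, 12.0]],
    [[1.0, 1.0, 3.0, 3.0]],
    min_overlap=0.2)
# kept == [0]
```

Note that the comparison is >=, so in this sketch the default min_overlap=0.0 keeps every box; a positive threshold is needed to actually prune.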

def prune_small_boxes(boxlist, min_side, scope=None):
    """Prunes small boxes in the boxlist which have a side smaller than min_side.

    Args:
      boxlist: BoxList holding N boxes.
      min_side: Minimum width AND height of box to survive pruning.
      scope: name scope.

    Returns:
      A pruned boxlist.
    """
    with tf.name_scope(scope, 'PruneSmallBoxes'):
        height, width = height_width(boxlist)
        is_valid = tf.logical_and(tf.greater_equal(width, min_side),
                                  tf.greater_equal(height, min_side))
        return gather(boxlist, tf.reshape(tf.where(is_valid), [-1]))

A box is kept only if both of its sides are at least min_side long.

10. change_coordinate_frame: normalize box coordinates to be relative to a window
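In plain Python the test looks like this (prune_small and the sample boxes are made up for the sketch). The logical AND means one short side is enough to remove a box.

```python
def prune_small(boxes, min_side):
    """Indices of boxes whose width AND height are both >= min_side."""
    keep = []
    for i, (y_min, x_min, y_max, x_max) in enumerate(boxes):
        if (x_max - x_min) >= min_side and (y_max - y_min) >= min_side:
            keep.append(i)
    return keep


kept = prune_small([[0.0, 0.0, 10.0, 10.0],   # 10 x 10: kept
                    [0.0, 0.0, 10.0, 2.0],    # width 2: removed
                    [0.0, 0.0, 3.0, 3.0]],    # exactly min_side: kept
                   min_side=3.0)
# kept == [0, 2]
```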

def change_coordinate_frame(boxlist, window, scope=None):
    """Change coordinate frame of the boxlist to be relative to window's frame.

    Given a window of the form [ymin, xmin, ymax, xmax],
    changes bounding box coordinates from boxlist to be relative to this window
    (e.g., the min corner maps to (0,0) and the max corner maps to (1,1)).

    An example use case is data augmentation: where we are given groundtruth
    boxes (boxlist) and would like to randomly crop the image to some
    window (window). In this case we need to change the coordinate frame of
    each groundtruth box to be relative to this new window.

    Args:
      boxlist: A BoxList object holding N boxes.
      window: A rank 1 tensor [4].
      scope: name scope.

    Returns:
      Returns a BoxList object with N boxes.
    """
    with tf.name_scope(scope, 'ChangeCoordinateFrame'):
        win_height = window[2] - window[0]
        win_width = window[3] - window[1]
        boxlist_new = scale(box_list.BoxList(
            boxlist.get() - [window[0], window[1], window[0], window[1]]),
            1.0 / win_height, 1.0 / win_width)
        boxlist_new = _copy_extra_fields(boxlist_new, boxlist)
        return boxlist_new
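
The listing above subtracts the window's min corner and then scales by 1 / window size. A single-box version in plain Python makes the mapping concrete; to_window_frame and the sample numbers are hypothetical.

```python
def to_window_frame(box, window):
    """Express [y_min, x_min, y_max, x_max] relative to window."""
    win_height = window[2] - window[0]
    win_width = window[3] - window[1]
    return [(box[0] - window[0]) / win_height,
            (box[1] - window[1]) / win_width,
            (box[2] - window[0]) / win_height,
            (box[3] - window[1]) / win_width]


window = [0.25, 0.25, 0.75, 0.75]
relative = to_window_frame([0.25, 0.5, 0.75, 0.75], window)
# relative == [0.0, 0.5, 1.0, 1.0]
```

As the docstring says, the window itself maps to [0, 0, 1, 1] in the new frame.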

11. boolean_mask: select the boxes whose indicator value is True

def boolean_mask(boxlist, indicator, fields=None, scope=None,
                 use_static_shapes=False, indicator_sum=None):
    """Select boxes from BoxList according to indicator and return new BoxList.

    `boolean_mask` returns the subset of boxes that are marked as "True" by the
    indicator tensor. By default, `boolean_mask` returns boxes corresponding to
    the input index list, as well as all additional fields stored in the boxlist
    (indexing into the first dimension).  However one can optionally only draw
    from a subset of fields.

    Args:
      boxlist: BoxList holding N boxes
      indicator: a rank-1 boolean tensor
      fields: (optional) list of fields to also gather from.  If None (default),
        all fields are gathered from.  Pass an empty fields list to only gather
        the box coordinates.
      scope: name scope.
      use_static_shapes: Whether to use an implementation with static shape
        gurantees.
      indicator_sum: An integer containing the sum of `indicator` vector. Only
        required if `use_static_shape` is True.

    Returns:
      subboxlist: a BoxList corresponding to the subset of the input BoxList
        specified by indicator
    Raises:
      ValueError: if `indicator` is not a rank-1 boolean tensor.
    """
    with tf.name_scope(scope, 'BooleanMask'):
        if indicator.shape.ndims != 1:
            raise ValueError('indicator should have rank 1')
        if indicator.dtype != tf.bool:
            raise ValueError('indicator should be a boolean tensor')
        if use_static_shapes:
            if not (indicator_sum and isinstance(indicator_sum, int)):
                raise ValueError('`indicator_sum` must be a of type int')
            selected_positions = tf.to_float(indicator)
            indexed_positions = tf.cast(
                tf.multiply(
                    tf.cumsum(selected_positions), selected_positions),
                dtype=tf.int32)
            one_hot_selector = tf.one_hot(
                indexed_positions - 1, indicator_sum, dtype=tf.float32)
            sampled_indices = tf.cast(
                tf.tensordot(
                    tf.to_float(tf.range(tf.shape(indicator)[0])),
                    one_hot_selector,
                    axes=[0, 0]),
                dtype=tf.int32)
            return gather(boxlist, sampled_indices, use_static_shapes=True)
        else:
            subboxlist = box_list.BoxList(tf.boolean_mask(boxlist.get(), indicator))
            if fields is None:
                fields = boxlist.get_extra_fields()
            for field in fields:
                if not boxlist.has_field(field):
                    raise ValueError('boxlist must contain all specified fields')
                subfieldlist = tf.boolean_mask(boxlist.get_field(field), indicator)
                subboxlist.add_field(field, subfieldlist)
            return subboxlist

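This implementation is interesting. First consider use_static_shapes=False. That case is simple: it uses TensorFlow's built-in tf.boolean_mask on the boxes and applies the same mask to any requested fields.

The case to focus on is use_static_shapes=True. Here the number of selected boxes is supplied statically as indicator_sum, so the output shape is known in advance. The implementation is: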

selected_positions = tf.to_float(indicator)
indexed_positions = tf.cast(
    tf.multiply(tf.cumsum(selected_positions), selected_positions), dtype=tf.int32)
one_hot_selector = tf.one_hot(indexed_positions - 1, indicator_sum, dtype=tf.float32)
sampled_indices = tf.cast(tf.tensordot(
        tf.to_float(tf.range(tf.shape(indicator)[0])),
        one_hot_selector,
        axes=[0, 0]),
    dtype=tf.int32)
return gather(boxlist, sampled_indices, use_static_shapes=True)

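Convert indicator from bool to float.

Compute each selected element's position after filtering. This uses the TensorFlow API tf.cumsum, a cumulative sum: tf.cumsum([a, b, c]) gives [a, a + b, a + b + c]. For example, for the indicator [0, 0, 1, 1, 0, 1] the cumulative sum is [0, 0, 1, 2, 2, 3], which is the 1-based output position of each selected element. Multiplying by selected_positions zeroes the unselected entries, giving [0, 0, 1, 2, 0, 3].

Next, this position tensor is converted to one-hot form. The indexed_positions - 1 turns every 0 entry into -1, which tf.one_hot maps to a row filled entirely with off_value (all zeros).

Finally, a tensordot over axis 0 is taken between the vector of original indices, tf.range(N), and this one-hot selector. Each column of the selector has a single 1, in the row of the input element assigned to that output slot, so the product picks out that element's original index. The result is sampled_indices, whose length is the static indicator_sum.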
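The four steps can be traced by hand on a small indicator. The sketch below reproduces them with plain Python lists instead of tensors; all variable names are chosen for this example, and the loops stand in for tf.cumsum, tf.one_hot and tf.tensordot.

```python
indicator = [False, False, True, True, False, True]
indicator_sum = 3  # number of True entries, known ahead of time

# Step 1: bool -> float.
selected = [1.0 if flag else 0.0 for flag in indicator]

# Step 2: cumulative sum, then zero out the unselected entries.
running, indexed = 0.0, []
for s in selected:
    running += s
    indexed.append(int(running * s))
# indexed == [0, 0, 1, 2, 0, 3]

# Step 3: one-hot of (indexed - 1); index -1 gives an all-zero row.
one_hot = [[1.0 if (p - 1) == col else 0.0 for col in range(indicator_sum)]
           for p in indexed]

# Step 4: contract range(N) with the selector over axis 0.
n = len(indicator)
sampled = [int(sum(i * one_hot[i][col] for i in range(n)))
           for col in range(indicator_sum)]
# sampled == [2, 3, 5], the positions of the True entries
```

Every intermediate shape depends only on N and indicator_sum, which is what makes the output shape static.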

12. gather: collect the boxes and fields at the given indices

def gather(boxlist, indices, fields=None, scope=None, use_static_shapes=False):
    """Gather boxes from BoxList according to indices and return new BoxList.

    By default, `gather` returns boxes corresponding to the input index list, as
    well as all additional fields stored in the boxlist (indexing into the
    first dimension).  However one can optionally only gather from a
    subset of fields.

    Args:
      boxlist: BoxList holding N boxes
      indices: a rank-1 tensor of type int32 / int64
      fields: (optional) list of fields to also gather from.  If None (default),
        all fields are gathered from.  Pass an empty fields list to only gather
        the box coordinates.
      scope: name scope.
      use_static_shapes: Whether to use an implementation with static shape
        gurantees.

    Returns:
      subboxlist: a BoxList corresponding to the subset of the input BoxList
      specified by indices
    Raises:
      ValueError: if specified field is not contained in boxlist or if the
        indices are not of type int32
    """
    with tf.name_scope(scope, 'Gather'):
        if len(indices.shape.as_list()) != 1:
            raise ValueError('indices should have rank 1')
        if indices.dtype != tf.int32 and indices.dtype != tf.int64:
            raise ValueError('indices should be an int32 / int64 tensor')
        gather_op = tf.gather
        if use_static_shapes:
            gather_op = ops.matmul_gather_on_zeroth_axis
        subboxlist = box_list.BoxList(gather_op(boxlist.get(), indices))
        if fields is None:
            fields = boxlist.get_extra_fields()
        fields += ['boxes']
        for field in fields:
            if not boxlist.has_field(field):
                raise ValueError('boxlist must contain all specified fields')
            subfieldlist = gather_op(boxlist.get_field(field), indices)
            subboxlist.add_field(field, subfieldlist)
        return subboxlist

13. concatenate: join the data of several boxlists into a new BoxList

def concatenate(boxlists, fields=None, scope=None):
    """Concatenate list of BoxLists.

    This op concatenates a list of input BoxLists into a larger BoxList.  It also
    handles concatenation of BoxList fields as long as the field tensor shapes
    are equal except for the first dimension.

    Args:
      boxlists: list of BoxList objects
      fields: optional list of fields to also concatenate.  By default, all
        fields from the first BoxList in the list are included in the
        concatenation.
      scope: name scope.

    Returns:
      a BoxList with number of boxes equal to
        sum([boxlist.num_boxes() for boxlist in BoxList])
    Raises:
      ValueError: if boxlists is invalid (i.e., is not a list, is empty, or
        contains non BoxList objects), or if requested fields are not contained in
        all boxlists
    """
    with tf.name_scope(scope, 'Concatenate'):
        if not isinstance(boxlists, list):
            raise ValueError('boxlists should be a list')
        if not boxlists:
            raise ValueError('boxlists should have nonzero length')
        for boxlist in boxlists:
            if not isinstance(boxlist, box_list.BoxList):
                raise ValueError('all elements of boxlists should be BoxList objects')
        concatenated = box_list.BoxList(
            tf.concat([boxlist.get() for boxlist in boxlists], 0))
        if fields is None:
            fields = boxlists[0].get_extra_fields()
        for field in fields:
            first_field_shape = boxlists[0].get_field(field).get_shape().as_list()
            first_field_shape[0] = -1
            if None in first_field_shape:
                raise ValueError('field %s must have fully defined shape except for the'
                                 ' 0th dimension.' % field)
            for boxlist in boxlists:
                if not boxlist.has_field(field):
                    raise ValueError('boxlist must contain all requested fields')
                field_shape = boxlist.get_field(field).get_shape().as_list()
                field_shape[0] = -1
                if field_shape != first_field_shape:
                    raise ValueError('field %s must have same shape for all boxlists '
                                     'except for the 0th dimension.' % field)
            concatenated_field = tf.concat(
                [boxlist.get_field(field) for boxlist in boxlists], 0)
            concatenated.add_field(field, concatenated_field)
        return concatenated

Here every requested field must be present in every boxlist.

14. sort_by_field: sort by a specified field

def sort_by_field(boxlist, field, order=SortOrder.descend, scope=None):
    """Sort boxes and associated fields according to a scalar field.

    A common use case is reordering the boxes according to descending scores.

    Args:
      boxlist: BoxList holding N boxes.
      field: A BoxList field for sorting and reordering the BoxList.
      order: (Optional) descend or ascend. Default is descend.
      scope: name scope.

    Returns:
      sorted_boxlist: A sorted BoxList with the field in the specified order.

    Raises:
      ValueError: if specified field does not exist
      ValueError: if the order is not either descend or ascend
    """
    with tf.name_scope(scope, 'SortByField'):
        if order != SortOrder.descend and order != SortOrder.ascend:
            raise ValueError('Invalid sort order')

        field_to_sort = boxlist.get_field(field)
        if len(field_to_sort.shape.as_list()) != 1:
            raise ValueError('Field should have rank 1')

        num_boxes = boxlist.num_boxes()
        num_entries = tf.size(field_to_sort)
        length_assert = tf.Assert(
            tf.equal(num_boxes, num_entries),
            ['Incorrect field size: actual vs expected.', num_entries, num_boxes])

        with tf.control_dependencies([length_assert]):
            _, sorted_indices = tf.nn.top_k(field_to_sort, num_boxes, sorted=True)

        if order == SortOrder.ascend:
            sorted_indices = tf.reverse_v2(sorted_indices, [0])

        return gather(boxlist, sorted_indices)

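I used not to know how to sort the values of a tensor in TensorFlow; it turns out tf.nn.top_k can do it. top_k returns the k largest values together with their indices, so asking for k = num_boxes gives a full descending sort. For that to be valid, the implementation of sort_by_field includes an assert: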

num_boxes = boxlist.num_boxes()
num_entries = tf.size(field_to_sort)
length_assert = tf.Assert(
    tf.equal(num_boxes, num_entries),
    ['Incorrect field size: actual vs expected.', num_entries, num_boxes])

with tf.control_dependencies([length_assert]):
    _, sorted_indices = tf.nn.top_k(field_to_sort, num_boxes, sorted=True)

That is, num_boxes of the boxlist must equal the number of values in the field.

15. visualize_boxes_in_image: visualize boxes on an image
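The sort-by-indices idea can be sketched in plain Python: sorted() over the index range stands in for tf.nn.top_k with k equal to the number of entries, and slicing stands in for tf.reverse_v2. The scores and variable names are made up.

```python
scores = [0.3, 0.9, 0.1, 0.6]
boxes = [[0.0, 0.0, 1.0, 1.0], [1.0, 1.0, 2.0, 2.0],
         [2.0, 2.0, 3.0, 3.0], [3.0, 3.0, 4.0, 4.0]]

# The field must have one value per box, as the assert requires.
assert len(scores) == len(boxes)

# Indices in descending score order (role of top_k with k = num_boxes).
descending = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
# Ascending order is the same index list reversed.
ascending = descending[::-1]

# gather: reorder boxes (and any other field) with the same indices.
sorted_boxes = [boxes[i] for i in descending]
# descending == [1, 3, 0, 2], ascending == [2, 0, 3, 1]
```

Sorting indices rather than values is what allows every field of the boxlist to be reordered consistently.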

def visualize_boxes_in_image(image, boxlist, normalized=False, scope=None):
    """Overlay bounding box list on image.

    Currently this visualization plots a 1 pixel thick red bounding box on top
    of the image.  Note that tf.image.draw_bounding_boxes essentially is
    1 indexed.

    Args:
      image: an image tensor with shape [height, width, 3]
      boxlist: a BoxList
      normalized: (boolean) specify whether corners are to be interpreted
        as absolute coordinates in image space or normalized with respect to the
        image size.
      scope: name scope.

    Returns:
      image_and_boxes: an image tensor with shape [height, width, 3]
    """
    with tf.name_scope(scope, 'VisualizeBoxesInImage'):
        if not normalized:
            height, width, _ = tf.unstack(tf.shape(image))
            boxlist = scale(boxlist,
                            1.0 / tf.cast(height, tf.float32),
                            1.0 / tf.cast(width, tf.float32))
        corners = tf.expand_dims(boxlist.get(), 0)
        image = tf.expand_dims(image, 0)
        return tf.squeeze(tf.image.draw_bounding_boxes(image, corners), [0])

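This function is very useful: with tf.summary you can visualize the boxes for any intermediate image. The normalized flag states whether the boxes are already normalized, because tf.image.draw_bounding_boxes only accepts relative coordinates; absolute boxes are first scaled by 1 / height and 1 / width.

16. filter_field_value_equals, filter_scores_greater_than: keep boxes whose field equals a value, or whose score exceeds a threshold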

def filter_field_value_equals(boxlist, field, value, scope=None):
    """Filter to keep only boxes with field entries equal to the given value.

    Args:
      boxlist: BoxList holding N boxes.
      field: field name for filtering.
      value: scalar value.
      scope: name scope.

    Returns:
      a BoxList holding M boxes where M <= N

    Raises:
      ValueError: if boxlist not a BoxList object or if it does not have
        the specified field.
    """
    with tf.name_scope(scope, 'FilterFieldValueEquals'):
        if not isinstance(boxlist, box_list.BoxList):
            raise ValueError('boxlist must be a BoxList')
        if not boxlist.has_field(field):
            raise ValueError('boxlist must contain the specified field')
        filter_field = boxlist.get_field(field)
        gather_index = tf.reshape(tf.where(tf.equal(filter_field, value)), [-1])
        return gather(boxlist, gather_index)

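These two functions are simple: they keep the boxes whose field value equals, or whose score is greater than, a given value.

17. pad_or_clip_box_list: pad or clip the boxlist to a given number of boxes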
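Both filters follow the same pattern: build an index list from a condition on one field, then gather every field with it. A plain-Python sketch on a dict of lists (the field values and the helper gather_rows are hypothetical):

```python
data = {
    'boxes': [[0.0, 0.0, 1.0, 1.0], [1.0, 1.0, 2.0, 2.0], [2.0, 2.0, 3.0, 3.0]],
    'classes': [1, 2, 1],
    'scores': [0.9, 0.7, 0.2],
}


def gather_rows(data, indices):
    """Pick the same rows from every field."""
    return {name: [values[i] for i in indices] for name, values in data.items()}


# Like filter_field_value_equals(boxlist, 'classes', 1).
equal_idx = [i for i, c in enumerate(data['classes']) if c == 1]
# Like filter_scores_greater_than(boxlist, 0.5).
greater_idx = [i for i, s in enumerate(data['scores']) if s > 0.5]

class_one = gather_rows(data, equal_idx)
confident = gather_rows(data, greater_idx)
# equal_idx == [0, 2], greater_idx == [0, 1]
```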

def pad_or_clip_box_list(boxlist, num_boxes, scope=None):
    """Pads or clips all fields of a BoxList.

    Args:
      boxlist: A BoxList with arbitrary of number of boxes.
      num_boxes: First num_boxes in boxlist are kept.
        The fields are zero-padded if num_boxes is bigger than the
        actual number of boxes.
      scope: name scope.

    Returns:
      BoxList with all fields padded or clipped.
    """
    with tf.name_scope(scope, 'PadOrClipBoxList'):
        subboxlist = box_list.BoxList(shape_utils.pad_or_clip_tensor(
            boxlist.get(), num_boxes))
        for field in boxlist.get_extra_fields():
            subfield = shape_utils.pad_or_clip_tensor(
                boxlist.get_field(field), num_boxes)
            subboxlist.add_field(field, subfield)
        return subboxlist
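
According to the docstring, the first num_boxes entries are kept and shorter inputs are zero-padded. A plain-Python sketch of that behaviour for lists follows; pad_or_clip is a made-up helper and says nothing about how shape_utils.pad_or_clip_tensor is implemented internally.

```python
import copy


def pad_or_clip(values, num_boxes, pad_value):
    """Keep the first num_boxes entries, padding with pad_value if too short."""
    clipped = values[:num_boxes]
    missing = max(0, num_boxes - len(clipped))
    return clipped + [copy.copy(pad_value) for _ in range(missing)]


boxes = [[0.0, 0.0, 1.0, 1.0], [1.0, 1.0, 2.0, 2.0]]
scores = [0.9, 0.4]

padded_boxes = pad_or_clip(boxes, 3, [0.0, 0.0, 0.0, 0.0])
padded_scores = pad_or_clip(scores, 3, 0.0)
clipped_boxes = pad_or_clip(boxes, 1, [0.0, 0.0, 0.0, 0.0])
# padded_scores == [0.9, 0.4, 0.0]; clipped_boxes == [[0.0, 0.0, 1.0, 1.0]]
```

Applying the same target length to boxes and to every extra field keeps them aligned row by row.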

20世纪最具威力的两项发明——核弹和计算机出自同一时代、同一群年青人。可是，与大名鼎鼎的曼哈顿计划（第二次世界大战中美国原子弹研究计划）相比，计算机的起源显得默默无闻。出身计算机世家的历史学家George Dyson在其新书《图灵大教堂》（Turing’s Cathedral）中讲述了阿兰·图灵、约翰·冯·诺依曼等一帮子天才小子创造计算机及预见计算机未来
小学6年级英语单词背诵第一课 dcj3sjt126com english word
always 总是 rice 水稻，米饭 before 在...之前 live 生活，居住 usual 通常的 early 早的 begin 开始 month 月份 year 年 last 最后的 east 东方的 high 高的 far 远的 window 窗户 world 世界 than 比...更
在线IT教育和在线IT高端教育 dcj3sjt126com 教育
codecademy http://www.codecademy.com codeschool https://www.codeschool.com teamtreehouse http://teamtreehouse.com lynda http://www.lynda.com/ Coursera https://www.coursera.
Struts2 xml校验框架所定义的校验文件蕃薯耀 Struts2 xml校验 Struts2 xml校验框架 Struts2校验
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 蕃薯耀 2015年7月11日 15:54:59 星期六 http://fa
mac下安装rar和unrar命令 hanqunfeng mac
1.下载：http://www.rarlab.com/download.htm 选择 RAR 5.21 for Mac OS X 2.解压下载后的文件 tar -zxvf rarosx-5.2.1.tar 3.cd rar sudo install -c -o $USER unrar /bin #输入当前用户登录密码 sudo install -c -o $USER rar
三种将list转换为map的方法 jackyrong list
在本文中，介绍三种将list转换为map的方法： 1）传统方法假设有某个类如下 class Movie { private Integer rank; private String description; public Movie(Integer rank, String des
年轻程序员需要学习的5大经验 lampcy 工作 PHP 程序员
在过去的7年半时间里，我带过的软件实习生超过一打，也看到过数以百计的学生和毕业生的档案。我发现很多事情他们都需要学习。或许你会说，我说的不就是某种特定的技术、算法、数学，或者其他特定形式的知识吗？没错，这的确是需要学习的，但却并不是最重要的事情。他们需要学习的最重要的东西是“自我规范”。这些规范就是：尽可能地写出最简洁的代码；如果代码后期会因为改动而变得凌乱不堪就得重构；尽量删除没用的代码，并添加
评“女孩遭野蛮引产致终身不育 60万赔偿款1分未得”医腐深入骨髓 nannan408
先来看南方网的一则报道：再正常不过的结婚、生子，对于29岁的郑畅来说，却是一个永远也无法实现的梦想。从2010年到2015年，从24岁到29岁，一张张新旧不一的诊断书记录了她病情的同时，也清晰地记下了她人生的悲哀。　　粗暴手术让人发寒　　2010年7月，在酒店做服务员的郑畅发现自己怀孕了，可男朋友却联系不上。在没有和家人商量的情况下，她决定堕胎。　　12月5日，
使用jQuery为input输入框绑定回车键事件 VS 为a标签绑定click事件 Everyday都不同 jsp input 回车键绑定 click enter
假设如题所示的事件为同一个，必须先把该js函数抽离出来，该函数定义了监听的处理： function search() { //监听函数略...... } 为input框绑定回车事件，当用户在文本框中输入搜索关键字时，按回车键，即可触发search(): //回车绑定 $(".search").keydown(fun
EXT学习记录 tntxia ext
1. 准备（1）官网：http://www.sencha.com/ 里面有源代码和API文档下载。 EXT的域名已经从www.extjs.com改成了www.sencha.com ，但extjs这个域名会自动转到sencha上。（2）帮助文档：想要查看EXT的官方文档的话，可以去这里h
mybatis3的mapper文件报Referenced file contains errors xingguangsixian mybatis
最近使用mybatis.3.1.0时无意中碰到一个问题： The errors below were detected when validating the file "mybatis-3-mapper.dtd" via the file "account-mapper.xml". In most cases these errors can be d