timm.data-------tf_preprocessing

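  • Understanding the module's functions
    • distorted_bounding_box_crop
      • 1. Reading the source
      • 2. Understanding the parameters
    • _decode_and_random_crop
      • 1. Reading the source
    • _decode_and_center_crop
      • 1. Reading the source
    • preprocess_for_train
      • 1. Reading the source
    • preprocess_for_eval
      • 1. Reading the source
    • class TfPreprocessTransform:
      • 1. Reading the source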

Understanding the module's functions

distorted_bounding_box_crop

1. Reading the source

def distorted_bounding_box_crop(image_bytes,
                                bbox,
                                min_object_covered=0.1,
                                aspect_ratio_range=(0.75, 1.33),
                                area_range=(0.05, 1.0),
                                max_attempts=100,
                                scope=None):
    """Generates cropped_image using one of the bboxes randomly distorted.

    See `tf.image.sample_distorted_bounding_box` for more documentation.

    Args:
      image_bytes: `Tensor` of binary image data.
      bbox: `Tensor` of bounding boxes arranged `[1, num_boxes, coords]`
          where each coordinate is [0, 1) and the coordinates are arranged
          as `[ymin, xmin, ymax, xmax]`. If num_boxes is 0 then use the whole
          image.
      min_object_covered: An optional `float`. Defaults to `0.1`. The cropped
          area of the image must contain at least this fraction of any bounding
          box supplied.
      aspect_ratio_range: An optional list of `float`s. The cropped area of the
          image must have an aspect ratio = width / height within this range.
      area_range: An optional list of `float`s. The cropped area of the image
          must contain a fraction of the supplied image within in this range.
      max_attempts: An optional `int`. Number of attempts at generating a cropped
          region of the image of the specified constraints. After `max_attempts`
          failures, return the entire image.
      scope: Optional `str` for name scope.
    Returns:
      cropped image `Tensor`
    """
    with tf.name_scope(scope, 'distorted_bounding_box_crop', [image_bytes, bbox]):
        shape = tf.image.extract_jpeg_shape(image_bytes)
        sample_distorted_bounding_box = tf.image.sample_distorted_bounding_box(
            shape,
            bounding_boxes=bbox,
            min_object_covered=min_object_covered,
            aspect_ratio_range=aspect_ratio_range,
            area_range=area_range,
            max_attempts=max_attempts,
            use_image_if_no_bounding_boxes=True)
        bbox_begin, bbox_size, _ = sample_distorted_bounding_box

        # Crop the image to the specified bounding box.
        offset_y, offset_x, _ = tf.unstack(bbox_begin)
        target_height, target_width, _ = tf.unstack(bbox_size)
        crop_window = tf.stack([offset_y, offset_x, target_height, target_width])
        image = tf.image.decode_and_crop_jpeg(image_bytes, crop_window, channels=3)

        return image

2. Understanding the parameters

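image_bytes: a Tensor holding the binary (encoded JPEG) image data.
bbox: a Tensor of shape [1, num_boxes, coords]. Each coordinate is normalized to [0, 1) and the coordinates are ordered as [ymin, xmin, ymax, xmax]. If num_boxes is 0, the whole image is used.
min_object_covered: float, default 0.1. The cropped region must contain at least this fraction of any supplied bounding box.
aspect_ratio_range: the allowed range for the aspect ratio (width / height) of the cropped region.
area_range: the allowed range for the cropped area, expressed as a fraction of the original image area.
max_attempts: the number of attempts at generating a crop that satisfies the constraints. After that many failures, the whole image is returned.

1. Generate the bounding-box parameters: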

sample_distorted_bounding_box = tf.image.sample_distorted_bounding_box(
    shape,
    bounding_boxes=bbox,
    min_object_covered=min_object_covered,
    aspect_ratio_range=aspect_ratio_range,
    area_range=area_range,
    max_attempts=max_attempts,
    use_image_if_no_bounding_boxes=True)

2. Use tf.unstack to split the returned values, then assemble the crop_window region:

offset_y, offset_x, _ = tf.unstack(bbox_begin)
target_height, target_width, _ = tf.unstack(bbox_size)
crop_window = tf.stack([offset_y, offset_x, target_height, target_width])

3. Decode the image and crop it to that window:

image = tf.image.decode_and_crop_jpeg(image_bytes, crop_window, channels=3)
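
To make the constraints concrete, here is a minimal pure-Python sketch of sampling a crop window under an aspect-ratio range and an area range, with the "return the whole image after max_attempts failures" fallback. It is a simplified illustration, not TensorFlow's actual sampling algorithm: the function name `sample_crop_window` and the `rng` parameter are hypothetical, and `min_object_covered` is ignored.

```python
import math
import random


def sample_crop_window(height, width,
                       aspect_ratio_range=(0.75, 1.33),
                       area_range=(0.05, 1.0),
                       max_attempts=100,
                       rng=None):
    """Return (offset_y, offset_x, target_height, target_width).

    Hypothetical helper for illustration only; it does not reproduce
    tf.image.sample_distorted_bounding_box.
    """
    rng = rng or random.Random()
    image_area = height * width
    for _ in range(max_attempts):
        # Draw a target area (fraction of the image) and an aspect ratio (width / height).
        area = rng.uniform(*area_range) * image_area
        ratio = rng.uniform(*aspect_ratio_range)
        target_w = int(round(math.sqrt(area * ratio)))
        target_h = int(round(math.sqrt(area / ratio)))
        # Accept the draw only if the window fits inside the image.
        if 0 < target_w <= width and 0 < target_h <= height:
            offset_y = rng.randint(0, height - target_h)
            offset_x = rng.randint(0, width - target_w)
            return offset_y, offset_x, target_h, target_w
    # Every attempt failed: fall back to the whole image.
    return 0, 0, height, width


window = sample_crop_window(375, 500, rng=random.Random(0))
```

The returned tuple has the same layout as crop_window above: the top-left offset first, then the crop height and width.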

_decode_and_random_crop

1. Reading the source

def _decode_and_random_crop(image_bytes, image_size, resize_method):
    """Make a random crop of image_size."""
    bbox = tf.constant([0.0, 0.0, 1.0, 1.0], dtype=tf.float32, shape=[1, 1, 4])
    image = distorted_bounding_box_crop(
        image_bytes,
        bbox,
        min_object_covered=0.1,
        aspect_ratio_range=(3. / 4, 4. / 3.),
        area_range=(0.08, 1.0),
        max_attempts=10,
        scope=None)
    original_shape = tf.image.extract_jpeg_shape(image_bytes)
    bad = _at_least_x_are_equal(original_shape, tf.shape(image), 3)

    image = tf.cond(
        bad,
        lambda: _decode_and_center_crop(image_bytes, image_size),
        lambda: tf.image.resize([image], [image_size, image_size], resize_method)[0])

    return image

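The function first calls distorted_bounding_box_crop to obtain the cropped image, then calls tf.image.resize() to scale it to image_size x image_size, which gives the final random crop. If the `bad` flag is set (judging by the helper's name, this means all three dimensions of the crop equal those of the original image, so no real crop happened), it falls back to _decode_and_center_crop instead. Note that, as quoted here, the fallback call passes only two arguments, while the _decode_and_center_crop definition below also takes resize_method.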
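
The helper _at_least_x_are_equal is not shown in this post. As a rough sketch, assuming it simply counts the positions where two shapes agree, the check could look like this in plain Python (the name `at_least_x_are_equal` and the example shapes are illustrative only):

```python
def at_least_x_are_equal(a, b, x):
    """Return True if at least `x` positions of `a` and `b` hold equal values."""
    matches = sum(1 for u, v in zip(a, b) if u == v)
    return matches >= x


original_shape = (375, 500, 3)   # height, width, channels of the source JPEG
full_image_crop = (375, 500, 3)  # the sampler gave back the whole image
real_crop = (200, 260, 3)        # a genuine random crop

# With x=3, only a crop identical in shape to the original is flagged.
fallback_needed = at_least_x_are_equal(original_shape, full_image_crop, 3)
keep_random_crop = not at_least_x_are_equal(original_shape, real_crop, 3)
```

With x = 3 and three-element shapes, the flag is raised only when height, width, and channels all match, which is the case where the random crop returned the entire image.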

_decode_and_center_crop

Center crop.

1. Reading the source

def _decode_and_center_crop(image_bytes, image_size, resize_method):
    """Crops to center of image with padding then scales image_size."""
    shape = tf.image.extract_jpeg_shape(image_bytes)
    image_height = shape[0]
    image_width = shape[1]

    padded_center_crop_size = tf.cast(
        ((image_size / (image_size + CROP_PADDING)) *
         tf.cast(tf.minimum(image_height, image_width), tf.float32)),
        tf.int32)

    offset_height = ((image_height - padded_center_crop_size) + 1) // 2
    offset_width = ((image_width - padded_center_crop_size) + 1) // 2
    crop_window = tf.stack([offset_height, offset_width,
                            padded_center_crop_size, padded_center_crop_size])
    image = tf.image.decode_and_crop_jpeg(image_bytes, crop_window, channels=3)
    image = tf.image.resize([image], [image_size, image_size], resize_method)[0]

    return image
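
The crop-window arithmetic above can be checked with ordinary integers. The sketch below mirrors the formulas in plain Python; the constant CROP_PADDING is not shown in this post, so the value 32 is an assumption, and `center_crop_window` is a hypothetical name.

```python
CROP_PADDING = 32  # assumed value; the constant is not defined in the quoted code


def center_crop_window(image_height, image_width, image_size,
                       crop_padding=CROP_PADDING):
    """Return (offset_height, offset_width, crop_size, crop_size)."""
    # Scale the shorter side by image_size / (image_size + padding),
    # truncating toward zero like a float-to-int32 cast.
    ratio = image_size / (image_size + crop_padding)
    crop_size = int(ratio * min(image_height, image_width))
    # Center the square window; the +1 rounds odd remainders up.
    offset_height = ((image_height - crop_size) + 1) // 2
    offset_width = ((image_width - crop_size) + 1) // 2
    return offset_height, offset_width, crop_size, crop_size


# A 375 x 500 (height x width) image, target size 224:
# ratio = 224 / 256 = 0.875, crop_size = int(0.875 * 375) = 328
window = center_crop_window(375, 500, 224)
```

For this input the window is (24, 86, 328, 328): a 328-pixel square centered in the image, which is then resized to 224 x 224.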

preprocess_for_train

1. Reading the source

def preprocess_for_train(image_bytes, use_bfloat16, image_size=IMAGE_SIZE, interpolation='bicubic'):
    """Preprocesses the given image for evaluation.

    Args:
      image_bytes: `Tensor` representing an image binary of arbitrary size.
      use_bfloat16: `bool` for whether to use bfloat16.
      image_size: image size.
      interpolation: image interpolation method

    Returns:
      A preprocessed image `Tensor`.
    """
    resize_method = tf.image.ResizeMethod.BICUBIC if interpolation == 'bicubic' else tf.image.ResizeMethod.BILINEAR
    image = _decode_and_random_crop(image_bytes, image_size, resize_method)
    image = _flip(image)
    image = tf.reshape(image, [image_size, image_size, 3])
    image = tf.image.convert_image_dtype(
        image, dtype=tf.bfloat16 if use_bfloat16 else tf.float32)
    return image

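This converts the image into the form used for training: random crop, flip (via _flip), reshape to [image_size, image_size, 3], and conversion to bfloat16 or float32.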

preprocess_for_eval

1. Reading the source

def preprocess_for_eval(image_bytes, use_bfloat16, image_size=IMAGE_SIZE, interpolation='bicubic'):
    """Preprocesses the given image for evaluation.

    Args:
      image_bytes: `Tensor` representing an image binary of arbitrary size.
      use_bfloat16: `bool` for whether to use bfloat16.
      image_size: image size.
      interpolation: image interpolation method

    Returns:
      A preprocessed image `Tensor`.
    """
    resize_method = tf.image.ResizeMethod.BICUBIC if interpolation == 'bicubic' else tf.image.ResizeMethod.BILINEAR
    image = _decode_and_center_crop(image_bytes, image_size, resize_method)
    image = tf.reshape(image, [image_size, image_size, 3])
    image = tf.image.convert_image_dtype(
        image, dtype=tf.bfloat16 if use_bfloat16 else tf.float32)
    return image

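This converts the image into the form used for evaluation: center crop, reshape to [image_size, image_size, 3], and conversion to bfloat16 or float32.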

class TfPreprocessTransform:

1. Reading the source

class TfPreprocessTransform:

    def __init__(self, is_training=False, size=224, interpolation='bicubic'):
        self.is_training = is_training
        self.size = size[0] if isinstance(size, tuple) else size
        self.interpolation = interpolation
        self._image_bytes = None
        self.process_image = self._build_tf_graph()
        self.sess = None

    def _build_tf_graph(self):
        with tf.device('/cpu:0'):
            self._image_bytes = tf.placeholder(
                shape=[],
                dtype=tf.string,
            )
            img = preprocess_image(
                self._image_bytes, self.is_training, False, self.size, self.interpolation)
        return img

    def __call__(self, image_bytes):
        if self.sess is None:
            self.sess = tf.Session()
        img = self.sess.run(self.process_image, feed_dict={self._image_bytes: image_bytes})
        img = img.round().clip(0, 255).astype(np.uint8)
        if img.ndim < 3:
            img = np.expand_dims(img, axis=-1)
        img = np.rollaxis(img, 2)  # HWC to CHW
        return img

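This class wraps the functions above so that TensorFlow does the crop-and-resize preprocessing. __call__ runs the graph on the raw image bytes and returns a uint8 array in CHW layout.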
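
The NumPy post-processing at the end of __call__ can be tried on its own, without a TensorFlow session. The sketch below applies the same steps to a random array standing in for the session output; the function name `to_uint8_chw` and the stand-in input are illustrative assumptions.

```python
import numpy as np


def to_uint8_chw(img):
    """Round, clip to [0, 255], cast to uint8, and move channels first."""
    img = np.asarray(img, dtype=np.float32)
    img = img.round().clip(0, 255).astype(np.uint8)
    if img.ndim < 3:
        # Grayscale output: add a trailing channel axis first.
        img = np.expand_dims(img, axis=-1)
    return np.rollaxis(img, 2)  # HWC to CHW


# Stand-in for the float HWC array returned by sess.run(...).
hwc = np.random.uniform(-5.0, 260.0, size=(224, 224, 3))
chw = to_uint8_chw(hwc)
```

The clip step matters because bicubic resizing can overshoot slightly outside [0, 255], and casting such values to uint8 without clipping would wrap around.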
