[Dissecting the Wheel] Preprocessing in PaddleDetection: NormalizeImage

Relative path: ppdet/data/transform/operators.py

The previous post, https://blog.csdn.net/HaoZiHuang/article/details/128398000, briefly covered its base class, BaseOperator.

The base class __init__ initializes self._id. For example, after instantiating the class below, printing this attribute gives:

>>> self._id
'NormalizeImage_d78ed6'
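
The suffix in the id above looks like a short random hex string. A minimal sketch of how such an id could be produced, assuming the base class appends the last six characters of a UUID to the class name. SketchOperator and SketchNormalize are hypothetical stand-ins written for illustration, not the PaddleDetection source:

```python
import uuid


class SketchOperator:
    """Hypothetical stand-in for a base operator; not the PaddleDetection source."""

    def __init__(self, name=None):
        # Default to the concrete class name, e.g. "SketchNormalize".
        if name is None:
            name = self.__class__.__name__
        # Assumption: a short random suffix gives each instance its own id.
        self._id = name + "_" + str(uuid.uuid4())[-6:]

    def __str__(self):
        # Lets "{}".format(self) in error messages show the id.
        return str(self._id)


class SketchNormalize(SketchOperator):
    pass


op = SketchNormalize()
print(op._id)  # class name, an underscore, then six hex characters
```

With an id like this, the "{}: input type is invalid.".format(self) messages in the class below would name the specific operator instance that failed.
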
class NormalizeImage(BaseOperator):
    def __init__(self,
                 mean=[0.485, 0.456, 0.406],
                 std=[0.229, 0.224, 0.225],
                 is_scale=True,
                 norm_type='mean_std'):
        """
        Args:
            mean (list): the pixel mean
            std (list): the pixel variance
            is_scale (bool): scale the pixel to [0,1]
            norm_type (str): type in ['mean_std', 'none']
        """
        super(NormalizeImage, self).__init__()
        self.mean = mean
        self.std = std
        self.is_scale = is_scale
        self.norm_type = norm_type
        if not (isinstance(self.mean, list) and isinstance(self.std, list) and
                isinstance(self.is_scale, bool) and
                self.norm_type in ['mean_std', 'none']):
            raise TypeError("{}: input type is invalid.".format(self))
        from functools import reduce
        if reduce(lambda x, y: x * y, self.std) == 0:
            raise ValueError('{}: std is invalid!'.format(self))

    def apply(self, sample, context=None):
        """Normalize the image.
        Operators:
            1.(optional) Scale the pixel to [0,1]
            2.(optional) Each pixel minus mean and is divided by std
        """
        im = sample['image']
        im = im.astype(np.float32, copy=False)
        if self.is_scale:
            scale = 1.0 / 255.0
            im *= scale

        if self.norm_type == 'mean_std':
            mean = np.array(self.mean)[np.newaxis, np.newaxis, :]
            std = np.array(self.std)[np.newaxis, np.newaxis, :]
            im -= mean
            im /= std
        sample['image'] = im
        return sample

self.mean and self.std are the per-channel parameters used to normalize the image; by default they are [0.485, 0.456, 0.406] and [0.229, 0.224, 0.225], respectively.

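If self.is_scale is True, the image is first scaled to [0, 1] by multiplying by 1/255.
If self.norm_type is 'none', no mean/std normalization is applied; if it is 'mean_std', each channel has self.mean subtracted and is then divided by self.std.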
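
The two steps can be reproduced outside the operator with plain NumPy. The following is a minimal sketch, assuming an HWC image with the channel axis last; normalize_image is a hypothetical standalone function, and unlike the class above it always works on a copy rather than modifying the input in place:

```python
import numpy as np


def normalize_image(im, mean, std, is_scale=True, norm_type="mean_std"):
    """Standalone sketch of the scale + mean/std steps for an HWC image."""
    # Work on a float32 copy so the caller's array is left untouched.
    im = np.asarray(im).astype(np.float32)
    if is_scale:
        im *= 1.0 / 255.0  # map [0, 255] to [0, 1]
    if norm_type == "mean_std":
        # Shape (1, 1, C) broadcasts over height and width.
        mean = np.array(mean, dtype=np.float32)[np.newaxis, np.newaxis, :]
        std = np.array(std, dtype=np.float32)[np.newaxis, np.newaxis, :]
        im -= mean
        im /= std
    return im


# A 2x2 image with three channels, every value set to 255.
white = np.full((2, 2, 3), 255, dtype=np.uint8)
out = normalize_image(white, [0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
print(out.shape, out.dtype)
print(out[0, 0])  # per channel: (1 - mean) / std
```

Because mean and std are reshaped to (1, 1, C), every pixel is normalized with the value for its own channel, so the order of the three values has to match the channel order of the image.
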

The NormalizeImage class only processes the image; the other fields of sample are left as they are.

>>> pprint(sample)
{'curr_iter': 0,

 'flipped': True,
 
 'gt_bbox': array([[ 639.524    ,  241.79735  ,  683.641    ,  366.2275   ],
       [ 827.6553   ,  287.004    , 1065.       ,  456.85568  ],
       [   0.       ,  361.1787   ,  111.67373  ,  502.13394  ],
       [ 308.9322   ,  400.6204   ,  533.1966   ,  559.8373   ]],
      dtype=float32),
      
 'gt_class': array([[58],
		......
       [60]], dtype=int32),
       
 'h': 426.0,
 
 'im_id': array([139]),
 
 'im_shape': array([ 736., 1065.], dtype=float32),
 
 'image': array([[[-0.7650483 , -0.757703  , -1.0724183 ],
 		......
        [ 0.8618033 , -0.23249283, -0.7238344 ]]], dtype=float32),
        
 'is_crowd': array([[0],
		......
       [0]], dtype=int32),
       
 'scale_factor': array([1.7903621, 1.7861136], dtype=float32),
 
 'w': 640.0}

Note that, compared with the output of Decode, there is an extra 'flipped': True entry, because this sample had already passed through RandomFlip.

You may run into a problem here: check whether your images are annotated as $x_1 y_1 x_2 y_2$, as $x_c y_c x_2 y_2$, or as $x_c y_c w h$.

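import cv2

# Draw the second ground-truth box, assuming x1, y1, x2, y2 order (plain ints for cv2)
x1, y1, x2, y2 = sample['gt_bbox'][1].astype(int).tolist()
xx = cv2.rectangle(im, (x1, y1), (x2, y2), 255, thickness=2, lineType=8)
cv2.imwrite("xxx.png", xx)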
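
One thing to watch when drawing on the output of NormalizeImage: the pixels are float32 values centered near zero (see the 'image' entry in the dump above), so a file written directly from it will look almost black. A minimal sketch for undoing the normalization before drawing, assuming the default mean and std and is_scale=True; denormalize_image is a hypothetical helper, not part of PaddleDetection:

```python
import numpy as np


def denormalize_image(im, mean, std, is_scale=True):
    """Undo mean/std normalization and return a uint8 HWC image (hypothetical helper)."""
    mean = np.array(mean, dtype=np.float32)[np.newaxis, np.newaxis, :]
    std = np.array(std, dtype=np.float32)[np.newaxis, np.newaxis, :]
    out = im.astype(np.float32) * std + mean
    if is_scale:
        out *= 255.0
    # Round and clip before casting so values stay inside the uint8 range.
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

# Build a normalized image from known pixel values, then recover them.
original = np.array([[[0, 128, 255], [10, 20, 30]]], dtype=np.uint8)
normalized = (original.astype(np.float32) / 255.0
              - np.array(mean, dtype=np.float32)) / np.array(std, dtype=np.float32)
restored = denormalize_image(normalized, mean, std)
print(np.array_equal(restored, original))
```

The returned array is a fresh uint8 image, so it can be passed to cv2.rectangle and cv2.imwrite in place of the normalized one.
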

Functions for converting between the formats above can be found here:
https://blog.csdn.net/HaoZiHuang/article/details/128213305
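
For quick reference, converting between the corner format and the center format takes only a few lines of NumPy. This is a minimal sketch, not the code from the linked post; the function names xyxy_to_cxcywh and cxcywh_to_xyxy are chosen here for illustration, and boxes are assumed to be an (N, 4) array:

```python
import numpy as np


def xyxy_to_cxcywh(boxes):
    """Convert (N, 4) boxes from x1, y1, x2, y2 to xc, yc, w, h."""
    boxes = np.asarray(boxes, dtype=np.float32)
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    return np.stack([(x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1], axis=1)


def cxcywh_to_xyxy(boxes):
    """Convert (N, 4) boxes from xc, yc, w, h back to x1, y1, x2, y2."""
    boxes = np.asarray(boxes, dtype=np.float32)
    xc, yc, w, h = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    return np.stack([xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2], axis=1)


boxes = np.array([[0.0, 10.0, 100.0, 50.0]], dtype=np.float32)
converted = xyxy_to_cxcywh(boxes)
print(converted)  # center (50, 30), width 100, height 40
print(cxcywh_to_xyxy(converted))  # back to the original corners
```
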
