PyTorch: Common Transforms

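transforms is like a toolbox, and the classes inside it (ToTensor, Resize, and so on) are the tools. An image goes in, a tool processes it, and we get the result we want.

First, let's look at the ToTensor class in the transforms source code (excerpt):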

class ToTensor:
    """Convert a ``PIL Image`` or ``numpy.ndarray`` to tensor. This transform does not support torchscript.
    ...
    """

    def __call__(self, pic):
        """
        Args:
            pic (PIL Image or numpy.ndarray): Image to be converted to tensor.

        Returns:
            Tensor: Converted image.
        """
        return F.to_tensor(pic)

    def __repr__(self):
        return self.__class__.__name__ + '()'

This shows that ToTensor converts a PIL Image or a numpy.ndarray passed to it into a tensor. Let's try it in practice:

from PIL import Image
from torchvision import transforms

img_path = "data/train/ants_image/67270775_e9fdf77e9d.jpg"  # relative path to the image
img = Image.open(img_path)  # open the image at that path

Here img is a PIL image (PIL.JpegImagePlugin.JpegImageFile). Next we convert it:

trans_tensor = transforms.ToTensor()  # create a ToTensor object
img_tensor = trans_tensor(img)

print(img_tensor)

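The printed result shows that the image has been converted to a tensor. Let's look at the tensor's attributes:

[Image 1: attributes of img_tensor]

Many of these attributes are ones that neural networks rely on, which is why we use transforms to convert the data into tensors: it makes the data ready for the networks we build later.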


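The above converts a PIL Image into a tensor. We can also read the image with OpenCV, which returns a numpy.ndarray:

[Image 2: reading the image with OpenCV as a numpy.ndarray]

The image is read as an ndarray, and this type can also be converted to a tensor.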


Next, we visualize the image we read:

from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

img_path = "data/train/ants_image/67270775_e9fdf77e9d.jpg"  # relative path to the image
img = Image.open(img_path)  # open the image at that path
writer = SummaryWriter("logs")

trans_tensor = transforms.ToTensor()  # create a ToTensor object
img_tensor = trans_tensor(img)
writer.add_image("image", img_tensor)  # tag shown in TensorBoard, then the (C, H, W) tensor

writer.close()

After the script finishes, open TensorBoard and you will see:

[Image 3: the image displayed in TensorBoard]

The torch.Tensor data is displayed as an image.


Now let's go through a few commonly used transforms:

Normalize (normalization):

class Normalize(torch.nn.Module):
    """Normalize a tensor image with mean and standard deviation.
    This transform does not support PIL Image.
    Given mean: ``(mean[1],...,mean[n])`` and std: ``(std[1],..,std[n])`` for ``n``
    channels, this transform will normalize each channel of the input
    ``torch.*Tensor`` i.e.,
    ``output[channel] = (input[channel] - mean[channel]) / std[channel]``
    ...
    """

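Note that Normalize only works on tensors. Because our image has three RGB channels, we need to supply three values for the mean and three for the standard deviation.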

from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms

trans_totensor = transforms.ToTensor()
writer = SummaryWriter("logs")
img = Image.open("data/train/ants_image/20935278_9190345f6b.jpg")

# Using ToTensor
img_tensor = trans_totensor(img)  # now a tensor
writer.add_image("img_tensor",img_tensor,0)

# Using Normalize
trans_norm = transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])  # one mean and one std per RGB channel
img_norm = trans_norm(img_tensor)
writer.add_image("img_norm", img_norm, 0)

writer.close()

Open TensorBoard and you will see:

[Image 4: img_tensor and img_norm displayed in TensorBoard]

Using Resize:

class Resize(torch.nn.Module):
    """Resize the input image to the given size.
    If the image is torch Tensor, it is expected
    to have [..., H, W] shape, where ... means an arbitrary number of leading dimensions

    Args:
        size (sequence or int): Desired output size. If size is a sequence like
            (h, w), output size will be matched to this. If size is an int,
            smaller edge of the image will be matched to this number.
            i.e, if height > width, then image will be rescaled to
            (size * height / width, size).
    """

    def __init__(self, size, interpolation=InterpolationMode.BILINEAR, max_size=None, antialias=None):
        ...

    def forward(self, img):
        ...

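If size is a single int, the shorter edge is matched to that value and the other edge is scaled proportionally, so the aspect ratio is preserved.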

print(img)
trans_size = transforms.Resize((520, 520))
img_resize = trans_size(img)
print(img_resize)

Comparing the two printed results, we can see that the size has changed.

Using Compose:

class Compose:
    """Composes several transforms together. This transform does not support torchscript.
    Please, see the note below.

    Args:
        transforms (list of ``Transform`` objects): list of transforms to compose.

    Example:
        >>> transforms.Compose([
        >>>     transforms.CenterCrop(10),
        >>>     transforms.PILToTensor(),
        >>>     transforms.ConvertImageDtype(torch.float),
        >>> ])
    """

Compose takes a list of transforms. Here we first resize the image, then convert it to a tensor, and finally display it in TensorBoard:

# compose
trans_size_2 = transforms.Resize(520)
tran_com = transforms.Compose([trans_size_2, trans_totensor])
img_com = tran_com(img)
writer.add_image("compose", img_com)

The transforms inside Compose are applied in order, so the output of each one must be a valid input for the next.
