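Loading a dataset involves image preprocessing. PyTorch provides the common image-processing operations, which show up in well-known convolutional networks such as AlexNet, ResNet, and VGG. They are collected here for quick reference.
These functions and callable objects are attached through the transform and target_transform parameters of the dataset classes that PyTorch (torchvision) provides.
Data transforms provided by PyTorch
Transform classes and functions provided by PyTorch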
-
The constructor of every dataset class takes a transform and a target_transform parameter:
- transform: transforms the data (the image).
- target_transform: transforms the label.
-
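To make the two hooks concrete, here is a minimal sketch of how a dataset class can apply them when a sample is fetched. `ToyDataset` and `one_hot` are hypothetical names written for this illustration, not torchvision APIs; the assumption is only that torchvision datasets such as MNIST follow the same pattern inside `__getitem__`.

```python
class ToyDataset:
    """Hypothetical dataset that mimics the transform / target_transform hooks."""

    def __init__(self, samples, transform=None, target_transform=None):
        self.samples = samples                      # list of (data, label) pairs
        self.transform = transform                  # applied to the data
        self.target_transform = target_transform    # applied to the label

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        data, target = self.samples[index]
        if self.transform is not None:
            data = self.transform(data)
        if self.target_transform is not None:
            target = self.target_transform(target)
        return data, target


def one_hot(label, num_classes=3):
    """Turn an integer label into a one-hot list (illustrative helper)."""
    return [1 if i == label else 0 for i in range(num_classes)]


dataset = ToyDataset(
    samples=[([0, 128, 255], 2), ([64, 64, 64], 0)],
    transform=lambda pixels: [p / 255 for p in pixels],  # scale pixels to [0, 1]
    target_transform=one_hot,
)
data, target = dataset[0]
print(data, target)
```

Both hooks are just callables that take one argument and return the converted value, which is why either a class instance or a plain function can be passed.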
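Here is an example:
- Without a conversion step, DataLoader cannot batch the samples: its default collation handles tensors (along with NumPy arrays and Python numbers) but not PIL images.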
from torchvision.datasets import MNIST
from torch.utils.data import DataLoader
from torchvision.transforms import ToTensor
train_mnist = MNIST(root="./datasets", train=True, download=False, transform=ToTensor())
loader = DataLoader(train_mnist, batch_size=10000, shuffle=True, drop_last=False)
for d, t in loader:  # data and labels
    print(d.shape, t.shape)
torch.Size([10000, 1, 28, 28]) torch.Size([10000])
torch.Size([10000, 1, 28, 28]) torch.Size([10000])
torch.Size([10000, 1, 28, 28]) torch.Size([10000])
torch.Size([10000, 1, 28, 28]) torch.Size([10000])
torch.Size([10000, 1, 28, 28]) torch.Size([10000])
torch.Size([10000, 1, 28, 28]) torch.Size([10000])
- Which transforms does PyTorch provide? They are listed below.
-
Transforms on PIL Image
- torchvision.transforms.CenterCrop(size)
- torchvision.transforms.ColorJitter(brightness=0, contrast=0, saturation=0, hue=0)
- torchvision.transforms.FiveCrop(size)
- torchvision.transforms.Grayscale(num_output_channels=1)
- torchvision.transforms.Pad(padding, fill=0, padding_mode='constant')
- torchvision.transforms.RandomAffine(degrees, translate=None, scale=None, shear=None, resample=False, fillcolor=0)
- torchvision.transforms.RandomCrop(size, padding=None, pad_if_needed=False, fill=0, padding_mode='constant')
- torchvision.transforms.RandomGrayscale(p=0.1)
- torchvision.transforms.RandomHorizontalFlip(p=0.5)
- torchvision.transforms.RandomPerspective(distortion_scale=0.5, p=0.5, interpolation=3)
- torchvision.transforms.RandomResizedCrop(size, scale=(0.08, 1.0), ratio=(0.75, 1.3333333333333333), interpolation=2)
- torchvision.transforms.RandomRotation(degrees, resample=False, expand=False, center=None, fill=0)
- torchvision.transforms.RandomSizedCrop(*args, **kwargs) (deprecated in favor of RandomResizedCrop)
- torchvision.transforms.RandomVerticalFlip(p=0.5)
- torchvision.transforms.Resize(size, interpolation=2)
- torchvision.transforms.Scale(*args, **kwargs) (deprecated in favor of Resize)
- torchvision.transforms.TenCrop(size, vertical_flip=False)
-
Transforms on torch.*Tensor (transforms that operate on tensors)
- torchvision.transforms.LinearTransformation(transformation_matrix, mean_vector)
- torchvision.transforms.Normalize(mean, std, inplace=False)
- torchvision.transforms.RandomErasing(p=0.5, scale=(0.02, 0.33), ratio=(0.3, 3.3), value=0, inplace=False)
-
Conversion Transforms (between PIL Image / numpy.ndarray and Tensor)
- torchvision.transforms.ToPILImage(mode=None)
- torchvision.transforms.ToTensor()
-
Generic Transforms
- torchvision.transforms.Lambda(lambd)
-
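Functional Transforms (a familiar Python pattern: many of the classes above are also exposed as plain functions for convenience; note how each function corresponds to a class above)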
- torchvision.transforms.functional.adjust_brightness(img, brightness_factor)
- torchvision.transforms.functional.adjust_contrast(img, contrast_factor)
- torchvision.transforms.functional.adjust_gamma(img, gamma, gain=1)
- torchvision.transforms.functional.adjust_hue(img, hue_factor)
- torchvision.transforms.functional.adjust_saturation(img, saturation_factor)
- torchvision.transforms.functional.affine(img, angle, translate, scale, shear, resample=0, fillcolor=None)
- torchvision.transforms.functional.center_crop(img, output_size)
- torchvision.transforms.functional.crop(img, top, left, height, width)
- torchvision.transforms.functional.erase(img, i, j, h, w, v, inplace=False)
- torchvision.transforms.functional.five_crop(img, size)
- torchvision.transforms.functional.hflip(img)
- torchvision.transforms.functional.normalize(tensor, mean, std, inplace=False)
- torchvision.transforms.functional.pad(img, padding, fill=0, padding_mode='constant')
- torchvision.transforms.functional.perspective(img, startpoints, endpoints, interpolation=3)
- torchvision.transforms.functional.resize(img, size, interpolation=2)
- torchvision.transforms.functional.resized_crop(img, top, left, height, width, size, interpolation=2)
- torchvision.transforms.functional.rotate(img, angle, resample=False, expand=False, center=None, fill=0)
- torchvision.transforms.functional.ten_crop(img, size, vertical_flip=False)
- torchvision.transforms.functional.to_grayscale(img, num_output_channels=1)
- torchvision.transforms.functional.to_pil_image(pic, mode=None)
- torchvision.transforms.functional.to_tensor(pic)
- torchvision.transforms.functional.vflip(img)
-
Composing and combining transforms
- torchvision.transforms.Compose(transforms)
- torchvision.transforms.RandomApply(transforms, p=0.5)
- torchvision.transforms.RandomChoice(transforms)
- torchvision.transforms.RandomOrder(transforms)
-
Using transform classes and functions
from torchvision.datasets import MNIST
from torch.utils.data import DataLoader
from torchvision.transforms import ToTensor
from torchvision.transforms.functional import to_tensor
print("-----Using a transform object-----")
train_mnist = MNIST(root="./datasets", train=True, download=False, transform=ToTensor())  # a ToTensor instance is a callable object
loader = DataLoader(train_mnist, batch_size=10000, shuffle=True, drop_last=False)
for d, t in loader:  # data and labels
    print(d.shape, t.shape)
print("-----Using a transform function-----")
train_mnist = MNIST(root="./datasets", train=True, download=False, transform=to_tensor)
loader = DataLoader(train_mnist, batch_size=10000, shuffle=True, drop_last=False)
for d, t in loader:  # data and labels
    print(d.shape, t.shape)
-----Using a transform object-----
torch.Size([10000, 1, 28, 28]) torch.Size([10000])
torch.Size([10000, 1, 28, 28]) torch.Size([10000])
torch.Size([10000, 1, 28, 28]) torch.Size([10000])
torch.Size([10000, 1, 28, 28]) torch.Size([10000])
torch.Size([10000, 1, 28, 28]) torch.Size([10000])
torch.Size([10000, 1, 28, 28]) torch.Size([10000])
-----Using a transform function-----
torch.Size([10000, 1, 28, 28]) torch.Size([10000])
torch.Size([10000, 1, 28, 28]) torch.Size([10000])
torch.Size([10000, 1, 28, 28]) torch.Size([10000])
torch.Size([10000, 1, 28, 28]) torch.Size([10000])
torch.Size([10000, 1, 28, 28]) torch.Size([10000])
torch.Size([10000, 1, 28, 28]) torch.Size([10000])
Using transforms on their own
- When no transform is given, the images are loaded unprocessed as PIL.Image.Image objects.
from torchvision.datasets import MNIST
from torch.utils.data import DataLoader
from torchvision.transforms import ToTensor
train_mnist = MNIST(root="./datasets", train=True, download=False)
print(type(train_mnist[0][0]))
<class 'PIL.Image.Image'>
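- Transforming an image with a class
- torchvision.transforms.Resize(size, interpolation=2): changes the image size.
- The interpolation parameter selects the interpolation algorithm used when scaling the image up or down (2 is PIL.Image.BILINEAR).
- These objects are callable, so they can be applied to an image directly.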
from torchvision.datasets import MNIST
from torch.utils.data import DataLoader
from torchvision.transforms import Resize
train_mnist = MNIST(root="./datasets", train=True, download=False)
resize = Resize((32, 32))  # upscale the 28x28 image to 32x32
img_out = resize(train_mnist[0][0])
print(type(img_out), img_out.size)
<class 'PIL.Image.Image'> (32, 32)
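- Transforming an image with a function
- Because these functions take extra arguments besides the image, they are awkward to pass directly as the transform when loading a dataset.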
from torchvision.datasets import MNIST
from torch.utils.data import DataLoader
import torchvision.transforms.functional
train_mnist = MNIST(root="./datasets", train=True, download=False)
img_out = torchvision.transforms.functional.resize(train_mnist[0][0], size=(45, 45), interpolation=2)
print(type(img_out), img_out.size)
<class 'PIL.Image.Image'> (45, 45)
Using composed transforms
from torchvision.datasets import MNIST
from torch.utils.data import DataLoader
from torchvision.transforms import Resize
from torchvision.transforms import ToTensor
from torchvision.transforms import Compose
re_size = Resize((50, 50))  # upscale the image to 50x50
to_tensor = ToTensor()
comp = Compose([re_size, to_tensor])
train_mnist = MNIST(root="./datasets", train=True, download=False, transform=comp)
loader = DataLoader(train_mnist, batch_size=10000, shuffle=True, drop_last=False)
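for d, t in loader:  # data and labels
    print(d.shape, type(d))
torch.Size([10000, 1, 50, 50]) <class 'torch.Tensor'>
torch.Size([10000, 1, 50, 50]) <class 'torch.Tensor'>
torch.Size([10000, 1, 50, 50]) <class 'torch.Tensor'>
torch.Size([10000, 1, 50, 50]) <class 'torch.Tensor'>
torch.Size([10000, 1, 50, 50]) <class 'torch.Tensor'>
torch.Size([10000, 1, 50, 50]) <class 'torch.Tensor'>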
Detailed usage of the image transforms
- Omitted (to be added when time permits)