The torchvision.transforms module is PyTorch's module for image preprocessing.
The first step in processing an image is reading it in. In general, an image is read in either as a numpy.ndarray or as a PIL Image. Below are a few common ways to read an image.
PIL reads an image through its Image module.
from PIL import Image
import matplotlib.pyplot as plt
dir_path = r"C:\Users\用户名\Pictures\test.jpg"
img_plt = Image.open(dir_path)
>>>print(img_plt)
output: <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=4000x2250 at 0x20319DB3240>
>>>plt.imshow(img_plt)
matplotlib reads an image directly as a numpy.ndarray via plt.imread.
import matplotlib.pyplot as plt
dir_path = r"C:\Users\用户名\Pictures\test.jpg"
img_np = plt.imread(dir_path)
>>>print(type(img_np))
output: <class 'numpy.ndarray'>
>>>plt.imshow(img_np)
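The classic torchvision transforms below operate on PIL.Image objects rather than numpy arrays. If you need to move between the two formats, a minimal sketch (assuming an 8-bit RGB image) looks like this:
import numpy as np
from PIL import Image

arr = np.asarray(img_plt)        # PIL.Image -> numpy.ndarray of shape (H, W, C), dtype uint8
img_back = Image.fromarray(arr)  # numpy.ndarray -> PIL.Image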
1. class torchvision.transforms.CenterCrop(size)
Crops the given PIL.Image at the center. size can be a tuple (height, width) or an Integer, in which case a square crop of that size is made; it should not exceed the image size.
from torchvision import transforms
img_trans = transforms.CenterCrop((2000, 3000))(img_plt)
plt.imshow(img_trans)
2. class torchvision.transforms.RandomCrop(size, padding=0)
Crops the given PIL.Image at a randomly chosen location. size can be a tuple or an Integer and must not exceed the image size. (A sketch of the padding argument follows the example below.)
img_trans = transforms.RandomCrop((300, 400))(img_plt)
plt.imshow(img_trans)
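The padding argument pads the image on all sides before the random crop is taken, which is useful when crops near the original border are wanted. A short sketch (the pad width of 20 pixels is an arbitrary choice):
img_trans = transforms.RandomCrop((300, 400), padding=20)(img_plt)  # pad 20 px on every side, then crop 300x400 at a random location
plt.imshow(img_trans)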
3. class torchvision.transforms.RandomHorizontalFlip(p=0.5)
Horizontally flips the given PIL.Image at random with probability 0.5: half the time the image is flipped, half the time it is left unchanged.
img_trans = transforms.RandomHorizontalFlip(0.5)(img_plt)
plt.imshow(img_trans)
4. class torchvision.transforms.RandomSizedCrop(size, interpolation=2)
Takes a crop of the given PIL.Image at a random size and location, then resizes the crop to the given size. (In current torchvision releases this transform has been renamed RandomResizedCrop.)
img_trans = transforms.RandomSizedCrop((200, 300))(img_plt)
plt.imshow(img_trans)
5. class torchvision.transforms.Pad(padding, fill=0)
Pads every side of the given PIL.Image with the given fill value. padding: how many pixels to pad; fill: the value to pad with (a single number, or a 3-tuple for an RGB image).
img_trans = transforms.Pad(padding=50, fill=(150, 150, 0))(img_plt)
plt.imshow(img_trans)
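In practice, several of these transforms are usually chained together with transforms.Compose, which applies them in order. A minimal sketch (the particular transforms, sizes, and the name img_pipe are arbitrary choices for illustration):
preprocess = transforms.Compose([
    transforms.CenterCrop((2000, 3000)),            # crop the center region
    transforms.RandomHorizontalFlip(0.5),           # flip with probability 0.5
    transforms.Pad(padding=50, fill=(150, 150, 0)), # pad 50 px with an RGB fill value
])
img_pipe = preprocess(img_plt)
plt.imshow(img_pipe)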
class torchvision.transforms.ToTensor
Converts a PIL.Image or a numpy.ndarray of shape (H, W, C) with values in [0, 255] to a torch.FloatTensor of shape (C, H, W) with values in [0.0, 1.0].
>>>transforms.ToTensor()(img_trans)
output:tensor([[[0.5882, 0.5882, 0.5882, ..., 0.5882, 0.5882, 0.5882],
[0.5882, 0.5882, 0.5882, ..., 0.5882, 0.5882, 0.5882],
[0.5882, 0.5882, 0.5882, ..., 0.5882, 0.5882, 0.5882],
...,
[0.5882, 0.5882, 0.5882, ..., 0.5882, 0.5882, 0.5882],
[0.5882, 0.5882, 0.5882, ..., 0.5882, 0.5882, 0.5882],
[0.5882, 0.5882, 0.5882, ..., 0.5882, 0.5882, 0.5882]],
[[0.5882, 0.5882, 0.5882, ..., 0.5882, 0.5882, 0.5882],
[0.5882, 0.5882, 0.5882, ..., 0.5882, 0.5882, 0.5882],
[0.5882, 0.5882, 0.5882, ..., 0.5882, 0.5882, 0.5882],
...,
[0.5882, 0.5882, 0.5882, ..., 0.5882, 0.5882, 0.5882],
[0.5882, 0.5882, 0.5882, ..., 0.5882, 0.5882, 0.5882],
[0.5882, 0.5882, 0.5882, ..., 0.5882, 0.5882, 0.5882]],
[[0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000],
[0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000],
[0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000],
...,
[0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000],
[0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000],
[0.0000, 0.0000, 0.0000, ..., 0.0000, 0.0000, 0.0000]]])
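The result is channels-first: shape (C, H, W) with values scaled to [0.0, 1.0]. The 0.5882 above is 150/255, the pad fill value used in the Pad example, and the third channel is 0.0 because the blue fill value was 0. A quick check (a sketch using the padded img_trans from above):
t = transforms.ToTensor()(img_trans)
print(t.shape)           # torch.Size([3, H, W]) -- channels first
print(t.min(), t.max())  # values lie in [0.0, 1.0]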
class torchvision.transforms.Normalize(mean, std)
Normalizes a tensor image channel by channel with the given mean and standard deviation: output[c] = (input[c] - mean[c]) / std[c]. It operates on a Tensor rather than a PIL.Image, so it is applied after ToTensor.
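A minimal usage sketch; the mean and std values here are arbitrary placeholders, not statistics computed from the image above:
normalize = transforms.Normalize(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
img_tensor = transforms.ToTensor()(img_trans)  # (C, H, W), values in [0.0, 1.0]
img_norm = normalize(img_tensor)               # per channel: (x - 0.5) / 0.5, values now in [-1.0, 1.0]
print(img_norm.min(), img_norm.max())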