transforms.py works like a toolbox. It contains many tools, such as ToTensor (which converts data to a tensor) and Resize. The input to these tools is an image.
from PIL import Image
from torchvision import transforms
# Absolute path: /home/xjy/PycharmProjects/pythonProject/dataset/train/ants/0013035.jpg
# Relative path: dataset/train/ants/0013035.jpg
img_path = "dataset/train/ants/0013035.jpg"
img = Image.open(img_path)
print(img) # <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=768x512 at 0x7FE4A77C61D0>
tensor_trans = transforms.ToTensor()
tensor_img = tensor_trans(img)
print(tensor_img) # tensor([[...]])
First we create a concrete tool, such as transforms.ToTensor(). Then we use that tool to turn an input into an output, result = tool(input), which is the line tensor_img = tensor_trans(img) above.
PS. Reading the same image with OpenCV:
import cv2
cv_img = cv2.imread(img_path)  # returns a NumPy ndarray
from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms
# Absolute path: /home/xjy/PycharmProjects/pythonProject/dataset/train/ants/0013035.jpg
# Relative path: dataset/train/ants/0013035.jpg
img_path = "dataset/train/ants/0013035.jpg"
img = Image.open(img_path)
print(img) # <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=768x512 at 0x7FE4A77C61D0>
tensor_trans = transforms.ToTensor()
tensor_img = tensor_trans(img)
print(tensor_img) # tensor([[...]])
writer = SummaryWriter("logs") # save_dir : logs
writer.add_image("Tensor_img", tensor_img)
writer.close()
After running the script, enter tensorboard --logdir=logs --port=6007 in a terminal to view the image.
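For every transform, pay attention to its input, its output, and its purpose. Different functions produce different data types, for example: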
Function | Output type |
---|---|
Image.open() | PIL Image |
ToTensor() | tensor |
cv2.imread() | ndarray |
First, a quick review of how a class with __call__ is used:
class Person:
    def __call__(self, name):
        print("__call__" + " Hello " + name)

    def hello(self, name):
        print("hello " + name)

person = Person()
person("Zhangsan")  # __call__ Hello Zhangsan
person.hello("lisi")  # hello lisi
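The `__call__` method is what lets a transform object be used like a function. As a minimal pure-Python sketch of the same pattern (the `Scale` class and its `factor` parameter are hypothetical and not part of torchvision), a tool is configured once when it is created and can then be applied to any number of inputs:

```python
class Scale:
    """Hypothetical tool: multiplies every value in a list by a fixed factor."""

    def __init__(self, factor):
        # Configuration happens once, when the tool is created.
        self.factor = factor

    def __call__(self, values):
        # Calling the instance applies the tool: result = tool(input).
        return [v * self.factor for v in values]


double = Scale(2)            # create the tool
result = double([1, 2, 3])   # use the tool
print(result)  # [2, 4, 6]
```

Separating configuration (`__init__`) from application (`__call__`) is what makes it possible to build a transform once and reuse it on every image.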
Next, look at how ToTensor() is used:
from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms
img = Image.open("images/pink.jpg")
print(img)
# <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x375 at 0x7FC0B136F240>
writer = SummaryWriter("logs")
trans_totensor = transforms.ToTensor()
img_tensor = trans_totensor(img)
writer.add_image("ToTensor", img_tensor)
writer.close()
Because add_image() requires the image to be a torch.Tensor, a numpy.array, or a string/blobname, img must first be converted to a tensor.
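For an ordinary 8-bit image, ToTensor does two things to the data: it rescales values from 0-255 to the range [0, 1], and it reorders the layout from height x width x channel to channel x height x width. The sketch below only illustrates that change, with nested lists standing in for tensors; `to_chw_float` is a hypothetical helper, not a torchvision function.

```python
def to_chw_float(hwc_pixels):
    """Convert an H x W x C nested list of 0-255 ints to C x H x W floats in [0, 1]."""
    height = len(hwc_pixels)
    width = len(hwc_pixels[0])
    channels = len(hwc_pixels[0][0])
    return [
        [[hwc_pixels[y][x][c] / 255.0 for x in range(width)] for y in range(height)]
        for c in range(channels)
    ]


# A 1 x 2 RGB "image": one red pixel and one white pixel.
image = [[[255, 0, 0], [255, 255, 255]]]
chw = to_chw_float(image)
print(chw)  # [[[1.0, 1.0]], [[0.0, 1.0]], [[0.0, 1.0]]]
```

The channel-first layout is also the default layout that add_image() expects, which is why the tensor can be logged directly.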
Purpose (ToPILImage): converts a tensor or an ndarray to a PIL Image.
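Purpose (Normalize): normalizes a tensor image with the formula result[channel] = (input[channel] - mean[channel]) / std[channel]. If the input range is [0, 1] and both mean and std are set to 0.5, the endpoints map to (0 - 0.5) / 0.5 = -1 and (1 - 0.5) / 0.5 = 1, so the result range is [-1, 1].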
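The formula can be checked by hand on a few values from one channel. This is a plain-Python sketch of the arithmetic only; `normalize` is a hypothetical helper that works on a list, not the torchvision transform itself.

```python
def normalize(values, mean, std):
    """Apply (value - mean) / std to each value of one channel."""
    return [(v - mean) / std for v in values]


channel = [0.0, 0.25, 0.5, 1.0]  # sample values in [0, 1]
print(normalize(channel, mean=0.5, std=0.5))  # [-1.0, -0.5, 0.0, 1.0]
```

The smallest input maps to -1 and the largest to 1, which matches the [-1, 1] range stated above.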
from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms
writer = SummaryWriter("logs")
img = Image.open("images/pink.jpg")
print(img)
# <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x375 at 0x7FC0B136F240>
# ToTensor
trans_totensor = transforms.ToTensor()
img_tensor = trans_totensor(img)
writer.add_image("ToTensor", img_tensor)
# Normalize
# print(img_tensor[0][0][0])
trans_norm = transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]) # mean, std
img_norm = trans_norm(img_tensor)
# print(img_norm[0][0][0])
writer.add_image("Normalize", img_norm)
writer.close()
The displayed result is shown in the figure:
Purpose (Resize): resizes the input PIL image to the given size. The output is still a PIL Image.
from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms
writer = SummaryWriter("logs")
img = Image.open("images/pink.jpg")
print(img)
# <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x375 at 0x7FC0B136F240>
# Resize
print(img.size) # (500, 375)
trans_resize = transforms.Resize((512, 512))
img_resize = trans_resize(img)
print(img_resize) # <PIL.Image.Image image mode=RGB size=512x512 at 0x7FAB69DFBE10>
# Display: convert to a tensor so that add_image() accepts it
trans_totensor = transforms.ToTensor()
img_resize = trans_totensor(img_resize)
print(img_resize)
writer.add_image("Resize", img_resize)
writer.close()
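Resize((512, 512)) forces the output to exactly 512 x 512, which distorts a 500 x 375 image. When Resize is given a single int instead, the shorter edge is scaled to that value and the aspect ratio is kept. The sketch below works through that arithmetic in plain Python. `resized_size` is a hypothetical helper, and truncating the longer edge to an integer is an assumption about the rounding, so the real transform may differ by a pixel.

```python
def resized_size(width, height, size):
    """Return (width, height) after scaling the shorter edge to `size`.

    Assumption: the longer edge is truncated to an integer.
    """
    if width <= height:
        return size, int(size * height / width)
    return int(size * width / height), size


print(resized_size(500, 375, 512))  # (682, 512)
```

For the 500 x 375 image used here, the height (375) is the shorter edge, so it becomes 512 and the width grows by the same factor.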
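Purpose (Compose): chains several transforms into one. For example, transforms.Compose([trans_resize_2, trans_totensor]) takes a PIL image as input and outputs a tensor. The output type of each transform must match the input type of the next one.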
from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms
writer = SummaryWriter("logs")
img = Image.open("images/pink.jpg")
print(img)
# <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x375 at 0x7FC0B136F240>
# ToTensor
trans_totensor = transforms.ToTensor()
# Compose - resize - 2
trans_resize_2 = transforms.Resize(512)  # single int: scale the shorter edge to 512, keep the aspect ratio
trans_compose = transforms.Compose([trans_resize_2, trans_totensor])
img_resize_2 = trans_compose(img)
writer.add_image("Resize", img_resize_2, 1)
writer.close()
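Conceptually, Compose just calls each transform in order and feeds the output of one into the next. The following is a minimal pure-Python sketch of that idea. `MiniCompose`, `add_one`, and `times_two` are hypothetical names used for illustration, and the real transforms.Compose may do more than this.

```python
class MiniCompose:
    """Hypothetical stand-in for transforms.Compose: applies callables in order."""

    def __init__(self, steps):
        self.steps = steps

    def __call__(self, value):
        # The output of each step becomes the input of the next one.
        for step in self.steps:
            value = step(value)
        return value


def add_one(x):
    return x + 1


def times_two(x):
    return x * 2


pipeline = MiniCompose([add_one, times_two])
print(pipeline(3))  # (3 + 1) * 2 = 8
```

Order matters: swapping the two steps gives 3 * 2 + 1 = 7. In the same way, putting ToTensor before a transform that expects a PIL image changes what that transform receives.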
Purpose (RandomCrop): crops the image at a random position.
from PIL import Image
from torch.utils.tensorboard import SummaryWriter
from torchvision import transforms
writer = SummaryWriter("logs")
img = Image.open("images/pink.jpg")
print(img)
# <PIL.JpegImagePlugin.JpegImageFile image mode=RGB size=500x375 at 0x7FC0B136F240>
# ToTensor
trans_totensor = transforms.ToTensor()
# RandomCrop()
trans_random = transforms.RandomCrop(256)
# trans_random = transforms.RandomCrop((256, 300))
trans_compose_2 = transforms.Compose([trans_random, trans_totensor])
for i in range(10):
    img_crop = trans_compose_2(img)
    writer.add_image("RandomCrop", img_crop, i)
writer.close()
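The idea behind a random crop is to pick a top-left corner at random such that the crop window still fits inside the image. The sketch below shows only that selection step in plain Python. `random_crop_box` is a hypothetical helper and is not how torchvision necessarily implements RandomCrop.

```python
import random


def random_crop_box(width, height, crop_w, crop_h, rng=random):
    """Pick a random (left, top, right, bottom) box that fits inside the image."""
    if crop_w > width or crop_h > height:
        raise ValueError("crop size is larger than the image")
    left = rng.randint(0, width - crop_w)   # randint is inclusive on both ends
    top = rng.randint(0, height - crop_h)
    return left, top, left + crop_w, top + crop_h


rng = random.Random(0)  # seeded only so the sketch is repeatable
for _ in range(3):
    print(random_crop_box(500, 375, 256, 256, rng))
```

Each call returns a different 256 x 256 window, which is why the ten images logged in the loop above differ from one another. A box in this (left, upper, right, lower) form could be passed to PIL's Image.crop().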
When it is unclear what a function returns, find out with print(), print(type()), or the debugger.