Recently, while preparing a dataset and loading it through a DataLoader for training, I ran into this error:
RuntimeError: stack expects each tensor to be equal size,
but got [3, 200, 200] at entry 0 and [1, 200, 200] at entry 1
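Reading the message, the gist is that the first dimension of the tensors, i.e. the channel count, does not match, so I suspected that some images in the dataset were the problem. Here is my plain dataset-processing code: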
import torch
import torchvision.transforms as transforms
from torch.utils.data import Dataset, DataLoader
import os
from PIL import Image

transform = transforms.Compose([
    transforms.RandomCrop((200, 200)),  # crop to a uniform size, otherwise batching fails
    transforms.ToTensor(),
])


class PreprocessDataset(Dataset):
    """Dataset that loads and preprocesses images."""

    def __init__(self, HRPath):
        """Collect the paths of all images under HRPath."""
        img_names = os.listdir(HRPath)
        self.HR_imgs = [os.path.join(HRPath, img_name) for img_name in img_names]

    def __len__(self):
        """Return the number of images."""
        return len(self.HR_imgs)

    def __getitem__(self, index):
        """Load the image at `index` and apply the transform."""
        HR_img = self.HR_imgs[index]
        HR_img = Image.open(HR_img)
        HR_img = transform(HR_img)
        return HR_img


if __name__ == '__main__':
    HRPath = r"D:\datasets\ImageNet\train"
    datasets = PreprocessDataset(HRPath)
    trainData = DataLoader(datasets, batch_size=1, shuffle=False)
    for i, HR_img in enumerate(trainData):
        print(i, HR_img.shape)
I fed the images into the DataLoader one at a time and checked their shapes in order, and then hit the following error:
'''Output'''
146 torch.Size([1, 3, 200, 200])
147 torch.Size([1, 3, 200, 200])
ValueError: empty range for randrange() (0,-55, -55)
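The last index printed before the failure was 147, so the offending image is the one at index 148 (counting from 0), i.e. the 149th image. Opening it showed that its shorter side is less than 200, so transforms.RandomCrop((200, 200)) cannot crop it. I therefore added transforms.Resize(400), which rescales every image so that its shorter side is 400: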
transform = transforms.Compose([
    transforms.Resize(400),  # rescale so the shorter side is 400
    transforms.RandomCrop((200, 200)),  # crop to a uniform size, otherwise batching fails
    transforms.ToTensor(),
])
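Why the crop failed and why the resize helps can be sketched in plain Python. This is only an illustration under assumptions: it assumes RandomCrop picks its offset with something like random.randint(0, h - th), which is consistent with the randrange message in the traceback above but may differ between torchvision versions. The 300x144 image size and the helper names resized_size and crop_fits are hypothetical.

```python
import random


def resized_size(width, height, shorter_side):
    """Size after scaling so the shorter side equals `shorter_side`, keeping the aspect ratio."""
    if width <= height:
        return shorter_side, int(shorter_side * height / width)
    return int(shorter_side * width / height), shorter_side


def crop_fits(width, height, crop_w, crop_h):
    """True if a crop of (crop_w, crop_h) fits inside an image of (width, height)."""
    return width >= crop_w and height >= crop_h


# Hypothetical image whose shorter side is below the 200-pixel crop size.
w, h = 300, 144
print(crop_fits(w, h, 200, 200))  # False

# A negative upper bound for the random offset is what makes the crop blow up.
try:
    random.randint(0, h - 200)
except ValueError as err:
    print("ValueError:", err)

# After scaling the shorter side to 400, a 200x200 crop always fits.
new_w, new_h = resized_size(w, h, 400)
print((new_w, new_h), crop_fits(new_w, new_h, 200, 200))  # (833, 400) True
```

Depending on the torchvision version, RandomCrop's pad_if_needed=True option may be another way to handle small images; it pads instead of rescaling.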
Running it again no longer raises an error, but that was with batch_size=1 in the DataLoader. So I changed batch_size to several images per batch and ran it again:
if __name__ == '__main__':
    HRPath = r"D:\datasets\ImageNet\train"
    # os.listdir(HRPath)
    datasets = PreprocessDataset(HRPath)
    a = datasets[89]
    print(a.shape)
    trainData = DataLoader(datasets, batch_size=16, shuffle=False)
    for i, HR_img in enumerate(trainData):
        print(i, HR_img.shape)
An error occurs:
'''Output'''
0 torch.Size([16, 3, 200, 200])
1 torch.Size([16, 3, 200, 200])
2 torch.Size([16, 3, 200, 200])
3 torch.Size([16, 3, 200, 200])
4 torch.Size([16, 3, 200, 200])
RuntimeError: stack expects each tensor to be equal size,
but got [3, 200, 200] at entry 0 and [1, 200, 200] at entry 9
Batches 0 to 4 loaded fine, so the problem image must be in batch 5, i.e. at an index between 80 and 95. To narrow the range down, I set batch_size=2:
if __name__ == '__main__':
    HRPath = r"D:\datasets\ImageNet\train"
    # os.listdir(HRPath)
    datasets = PreprocessDataset(HRPath)
    a = datasets[89]
    print(a.shape)
    trainData = DataLoader(datasets, batch_size=2, shuffle=False)
    for i, HR_img in enumerate(trainData):
        print(i, HR_img.shape)
This narrows the error down to index 88 or 89 (the 89th or 90th image):
'''Output'''
0 torch.Size([2, 3, 200, 200])
...
...
43 torch.Size([2, 3, 200, 200])
RuntimeError: stack expects each tensor to be equal size, but got [3, 200, 200] at entry 0 and [1, 200, 200] at entry 1
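The index arithmetic behind both runs can be written down as a small sketch. It assumes shuffle=False and that every earlier batch was full, as in the runs above; the helper names batch_indices and sample_index are hypothetical and are not part of PyTorch.

```python
def batch_indices(batch_idx, batch_size):
    """Dataset indices covered by batch number `batch_idx` (0-based, no shuffling)."""
    start = batch_idx * batch_size
    return range(start, start + batch_size)


def sample_index(batch_idx, batch_size, entry):
    """Dataset index of the sample reported as `entry` in the stack error."""
    return batch_idx * batch_size + entry


# Run with batch_size=16: batches 0..4 succeeded, batch 5 failed at entry 9.
print(batch_indices(5, 16))          # range(80, 96)
print(sample_index(5, 16, 9))        # 89

# Run with batch_size=2: batches 0..43 succeeded, batch 44 failed at entry 1.
print(list(batch_indices(44, 2)))    # [88, 89]
print(sample_index(44, 2, 1))        # 89
```

Both runs point at the same sample, index 89, which is a useful cross-check before opening the file.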
Since the error points at entry 1 of that batch, I printed the shape of the image at index 89:
if __name__ == '__main__':
    HRPath = r"D:\datasets\ImageNet\train"
    # os.listdir(HRPath)
    datasets = PreprocessDataset(HRPath)
    a = datasets[89]
    print(a.shape)
Output:
torch.Size([1, 200, 200])
So the channel counts really are inconsistent. Unbelievable!
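To see why a single one-channel image breaks a whole batch: the DataLoader's default collate function combines the samples of a batch with torch.stack, which requires identical shapes. The sketch below reproduces this with zero tensors instead of real images; the helper name try_stack is hypothetical.

```python
import torch


def try_stack(shapes):
    """Stack zero tensors of the given shapes; return the batch shape, or the error text."""
    tensors = [torch.zeros(*shape) for shape in shapes]
    try:
        return tuple(torch.stack(tensors).shape)
    except RuntimeError as err:
        return str(err)


# Two RGB-like samples stack into one batch.
print(try_stack([(3, 200, 200), (3, 200, 200)]))  # (2, 3, 200, 200)

# One single-channel sample is enough to make the stack fail.
print(try_stack([(3, 200, 200), (1, 200, 200)]))
```

With batch_size=1 there is only one tensor to stack, which is why the earlier run never hit this error.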
The fix: during preprocessing, convert every image to three-channel "RGB" mode:
HR_img = Image.open(HR_img).convert('RGB')  # open every image as three-channel RGB
Problem solved!
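Instead of bisecting with the batch size, the offending files could also be found directly by checking each image's PIL mode. This is a minimal sketch, not something from the original debugging session; the helper names load_rgb and find_non_rgb are hypothetical.

```python
from PIL import Image


def load_rgb(path):
    """Open an image and force it into three-channel RGB mode."""
    return Image.open(path).convert("RGB")


def find_non_rgb(paths):
    """Return (index, path, mode) for every image whose PIL mode is not 'RGB'."""
    bad = []
    for i, path in enumerate(paths):
        # Image.open reads only the header here, so the scan stays cheap.
        with Image.open(path) as img:
            if img.mode != "RGB":
                bad.append((i, path, img.mode))
    return bad
```

With the dataset class above, find_non_rgb(datasets.HR_imgs) would list grayscale ("L") or other non-RGB files together with their indices, assuming every file in the folder is an image that PIL can open.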