Semantic Segmentation: the Cross-Entropy Loss Function

from dataloaders import utils
criterion = utils.cross_entropy2d
 
import numpy as np
import torch
import torch.nn as nn

def cross_entropy2d(logit, target, ignore_index=255, weight=None, size_average=True, batch_average=True):
    # logit: raw network scores of shape (n, c, h, w)
    # target: class-index map of shape (n, 1, h, w) or (n, h, w)
    n, c, h, w = logit.size()
    # drop the singleton channel dim so target is (n, h, w), as CrossEntropyLoss expects;
    # squeeze(1) is a no-op if dim 1 is not of size 1
    target = target.squeeze(1)
    if weight is None:
        criterion = nn.CrossEntropyLoss(ignore_index=ignore_index, reduction='sum')
    else:
        # per-class weights, moved to the same device as the logits
        weight = torch.from_numpy(np.array(weight)).float().to(logit.device)
        criterion = nn.CrossEntropyLoss(weight=weight, ignore_index=ignore_index, reduction='sum')
    loss = criterion(logit, target.long())

    if size_average:
        loss /= (h * w)

    if batch_average:
        loss /= n

    return loss
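As a sanity check on the sum-then-divide averaging above, here is a minimal CPU-only sketch (shapes and class count are made up for illustration) comparing it with `reduction='mean'`. When no pixel carries the ignore label, the two are identical:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
n, c, h, w = 2, 5, 4, 4
logit = torch.randn(n, c, h, w)              # fake network output (n, c, h, w)
target = torch.randint(0, c, (n, h, w))      # fake label map (n, h, w), no ignore pixels

# sum over all pixels, then divide by h*w and n, as cross_entropy2d does
summed = nn.CrossEntropyLoss(reduction='sum')(logit, target)
manual = summed / (h * w) / n

# with no ignored pixels this matches reduction='mean' exactly
mean = nn.CrossEntropyLoss(reduction='mean')(logit, target)
print(torch.allclose(manual, mean))  # True
```

Note that with `ignore_index` pixels present the two differ: `reduction='mean'` divides by the number of *non-ignored* pixels, while the scheme above always divides by `h * w`.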

I recently noticed an issue: in the semantic-segmentation target step, some codebases call `target.squeeze(...)` and some don't. I tested it: if the label is a single-channel grayscale image, `Image.open()` reads it as 2-D, and after going through the DataLoader it is 3-D, i.e. (batch, h, w) — so why squeeze at all?
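A small shape experiment reproduces both cases (the sizes are made up). If the per-sample label stays 2-D, batching yields 3-D and no squeeze is needed; if a transform adds a channel dim, batching yields 4-D and `squeeze(1)` restores the shape `CrossEntropyLoss` expects:

```python
import torch

label = torch.zeros(4, 4)                     # a grayscale mask as read: (h, w)
batch = torch.stack([label, label])           # DataLoader collation: (batch, h, w)
print(batch.shape)                            # torch.Size([2, 4, 4]) -- no squeeze needed

label_ch = label.unsqueeze(0)                 # a ToTensor that adds a channel dim: (1, h, w)
batch_ch = torch.stack([label_ch, label_ch])  # (batch, 1, h, w) -- now squeeze(1) is required
print(batch_ch.squeeze(1).shape)              # torch.Size([2, 4, 4])
```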

It turns out that in some codebases the transform pipeline — specifically the ToTensor step — performs an expand-dims on the label:

import numpy as np
import torch

class ToTensor(object):
    """Convert ndarrays in sample to Tensors."""

    def __call__(self, sample):
        # swap color axis because
        # numpy image: H x W x C
        # torch image: C x H x W
        img = np.array(sample['image']).astype(np.float32).transpose((2, 0, 1))
        # the label is 2-D (H x W); expand_dims adds a trailing channel axis so the
        # transpose works, leaving the mask as (1, H, W) -- the extra dim squeezed later
        mask = np.expand_dims(np.array(sample['label']).astype(np.float32), -1).transpose((2, 0, 1))
        mask[mask == 255] = 0  # fold the ignore label into class 0

        img = torch.from_numpy(img).float()
        mask = torch.from_numpy(mask).float()

        return {'image': img,
                'label': mask}

The key line is the expand-dims:

mask = np.expand_dims(np.array(sample['label']).astype(np.float32), -1).transpose((2, 0, 1))

Two more reminders to myself: 1. the target passed into the cross-entropy loss must be of long (int64) type; 2. the target must be 3-D, i.e. (batch, h, w).
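Both reminders can be checked directly (a minimal sketch with made-up sizes): a 3-D long target works, while a 4-D target with a leftover channel dim is rejected by `CrossEntropyLoss`:

```python
import torch
import torch.nn as nn

logit = torch.randn(2, 5, 4, 4)           # (batch, classes, h, w)
target = torch.randint(0, 5, (2, 4, 4))   # (batch, h, w); randint yields int64 (long)

loss = nn.CrossEntropyLoss()(logit, target)   # works: 3-D long target
print(loss.dim())                             # 0 -- a scalar

# a 4-D (batch, 1, h, w) target fails: squeeze(1) first
try:
    nn.CrossEntropyLoss()(logit, target.unsqueeze(1))
except RuntimeError:
    print('4-D target rejected')
```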
