
The VGG model provided by torchvision may not reach the accuracy expected from the paper


This might be because of our VGG model (I have heard some reports that finetuning it gives lower accuracy).
Try https://github.com/jcjohnson/pytorch-vgg

That repository contains the Caffe models converted directly into PyTorch format.

These models expect different preprocessing than the other models in the PyTorch model zoo. Images should be in BGR format in the range [0, 255], and the following BGR values should then be subtracted from each pixel: [103.939, 116.779, 123.68]
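
The preprocessing described above can be sketched as follows. This is a minimal illustration, not code from the pytorch-vgg repository: the function name `caffe_vgg_preprocess` is hypothetical, and it assumes the input is an H×W×3 `uint8` NumPy array in RGB order (as returned by most image loaders).

```python
import numpy as np

# BGR means quoted from the README excerpt above.
BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_vgg_preprocess(rgb_image):
    """Convert an HxWx3 RGB uint8 image to a 3xHxW BGR float32 array.

    Hypothetical helper: assumes the input is in RGB order with
    values in [0, 255]; values are NOT rescaled to [0, 1].
    """
    img = rgb_image.astype(np.float32)
    img = img[:, :, ::-1]            # RGB -> BGR
    img = img - BGR_MEAN             # subtract the per-channel BGR mean
    img = img.transpose(2, 0, 1)     # HWC -> CHW, the layout PyTorch expects
    return np.ascontiguousarray(img)
```

The result can be wrapped with `torch.from_numpy(...)` and given a batch dimension with `.unsqueeze(0)` before being passed to the model. Note that this differs from the usual torchvision pipeline, which scales to [0, 1] and normalizes in RGB order.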


Currently I’m using what https://github.com/wkentaro/pytorch-fcn uses: a pretrained VGG16 model converted from Caffe into PyTorch format.

    vgg16 = VGG16(pretrained=False)
    if pretrained:
        # load the Caffe-converted weights and copy them into the model
        state_dict = torch.load('./vgg16_from_caffe.pth')
        vgg16.load_state_dict(state_dict)

But the performance is much worse than pytorch-fcn’s implementation.

