A First Look at Grad-CAM

Based on the paper: Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization

Paper: https://arxiv.org/abs/1610.02391

PyTorch code: https://github.com/jacobgil/pytorch-grad-cam

Contents

How Pytorch-Grad-CAM Works

Drawing Heatmaps with Grad-CAM


How Pytorch-Grad-CAM Works

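[Image 1]

Taking image classification as an example:

Activations: recorded during the forward pass

Gradients: obtained during the backward pass

A: the feature maps extracted from the input image. The deeper the layer, the more abstract the features and the richer the semantic information, so for image classification we take the last feature layer, i.e. Features[-1].

After the fully connected layers (two in the figure), the network outputs y^c, the score for class c (a raw class score before softmax, not a loss). Backpropagating y^c to A gives the gradient of y^c with respect to every element of A; these are the colored layers in the figure. Averaging each gradient map over its spatial positions gives one weight per channel, which measures how important that channel of A is for class c. The channels of A are then summed with these weights, the result is passed through ReLU, and the resulting map is rendered with a colormap to give the heatmap.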
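
The weighting step described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name grad_cam_map is made up for this post, the activations and gradients are assumed to be already available as [K, H, W] arrays, and the final division by the maximum is just one simple way to scale the map for display, not necessarily what the library does.

```python
import numpy as np


def grad_cam_map(activations: np.ndarray, gradients: np.ndarray) -> np.ndarray:
    """Combine feature maps A and their gradients into a heatmap.

    activations: feature maps A from the target layer, shape [K, H, W]
    gradients:   d(y^c)/dA for the chosen class c, same shape [K, H, W]
    (hypothetical helper, not part of the downloaded code)
    """
    # One weight per channel: the spatial mean of that channel's gradient
    weights = gradients.mean(axis=(1, 2))                      # [K]
    # Weighted sum of the feature maps over the channel axis
    cam = np.tensordot(weights, activations, axes=([0], [0]))  # [H, W]
    # ReLU keeps only the regions that increase the class score
    cam = np.maximum(cam, 0.0)
    # Scale to [0, 1] for display; skip the division for an all-zero map
    peak = cam.max()
    return cam / peak if peak > 0 else cam


# Toy example with K=2 channels on a 2x2 grid (made-up numbers)
A = np.array([[[1.0, 2.0], [3.0, 4.0]],
              [[4.0, 3.0], [2.0, 1.0]]])
dA = np.array([[[1.0, 1.0], [1.0, 1.0]],        # channel weight +1.0
               [[-0.5, -0.5], [-0.5, -0.5]]])   # channel weight -0.5
heatmap = grad_cam_map(A, dA)
print(heatmap)
```

In the toy example the second channel has a negative weight, so it lowers the map wherever it is strong; the top-left position ends up negative before ReLU and is clipped to zero.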

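For more theory, see this blog post, which explains it in detail: https://blog.csdn.net/qq_37541097/article/details/123089851

Drawing Heatmaps with Grad-CAM

Download the code from the code link at the top of this post, then look at the main file: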

import os
import numpy as np
import torch
from PIL import Image
import matplotlib.pyplot as plt
from torchvision import models
from torchvision import transforms
from utils import GradCAM, show_cam_on_image, center_crop_img


def main():
    model = models.mobilenet_v3_large(pretrained=True)
    target_layers = [model.features[-1]]   # list of target layers

    # model = models.vgg16(pretrained=True)
    # target_layers = [model.features]

    # model = models.resnet34(pretrained=True)
    # target_layers = [model.layer4]

    # model = models.regnet_y_800mf(pretrained=True)
    # target_layers = [model.trunk_output]

    # model = models.efficientnet_b0(pretrained=True)
    # target_layers = [model.features]

    data_transform = transforms.Compose([transforms.ToTensor(),
                                         transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])])
    # load image
    img_path = "both.png"
    assert os.path.exists(img_path), "file: '{}' does not exist.".format(img_path)
    img = Image.open(img_path).convert('RGB')
    img = np.array(img, dtype=np.uint8)
    # img = center_crop_img(img, 224)

    # [C, H, W]
    img_tensor = data_transform(img)
    # expand batch dimension
    # [C, H, W] -> [N, C, H, W]
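    input_tensor = torch.unsqueeze(img_tensor, dim=0)   # add the batch dimension
    # Initialize the CAM object with the model, the target layers, and whether to use CUDA
    # (fall back to the CPU when no GPU is available)
    cam = GradCAM(model=model, target_layers=target_layers, use_cuda=torch.cuda.is_available())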
    # target_category = 281  # tabby, tabby cat
    # target_category = 673  # mouse, computer mouse
    # target_category = 657  # missile
    target_category = 254  # pug, pug-dog

    grayscale_cam = cam(input_tensor=input_tensor, target_category=target_category)

    grayscale_cam = grayscale_cam[0, :]
    visualization = show_cam_on_image(img.astype(dtype=np.float32) / 255.,
                                      grayscale_cam,
                                      use_rgb=True)
    plt.imshow(visualization)
    plt.show()


if __name__ == '__main__':
    main()
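
The script divides the image by 255 before passing it to show_cam_on_image, so the image and the heatmap are both in the [0, 1] range when they are blended. The source of that helper is not shown in this post, so the following is only a rough sketch of what such an overlay step might look like. The function name overlay_heatmap, the alpha parameter, and the simplified red/blue colour map are all assumptions made for illustration.

```python
import numpy as np


def overlay_heatmap(img: np.ndarray, mask: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend a [0, 1] heatmap onto a [0, 1] RGB image (hypothetical helper).

    img:  float array, shape [H, W, 3], values in [0, 1]
    mask: float array, shape [H, W], values in [0, 1]
    """
    if img.max() > 1.0:
        raise ValueError("img should be scaled to [0, 1], e.g. uint8 image / 255")
    # Simplified colour map (an assumption): hot regions red, cold regions blue
    heat = np.stack([mask, np.zeros_like(mask), 1.0 - mask], axis=-1)
    # Weighted average of the coloured heatmap and the original image
    blended = alpha * heat + (1.0 - alpha) * img
    # Back to 8-bit RGB so that plt.imshow can display it directly
    return (255 * blended).astype(np.uint8)


# Toy example: a uniform grey 2x2 image and a made-up heatmap
gray = np.full((2, 2, 3), 0.5, dtype=np.float32)
toy_mask = np.array([[0.0, 1.0], [0.5, 0.25]], dtype=np.float32)
vis = overlay_heatmap(gray, toy_mask)
print(vis.shape, vis.dtype)
```

Passing an unscaled uint8 image would make the blend meaningless, which is why the sketch rejects values above 1.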

The result is shown below:

[Image 2]

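Note that the imagenet1k_classes.txt file provided by the author lists the 1000 ImageNet class names. To visualize a different class, set target_category in the code to that class's line number minus 1 (line numbers start at 1, class indices at 0).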
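
The "line number minus 1" rule can also be applied in code rather than by counting lines by hand. The following is a minimal sketch, assuming the file has one class per line and that the label text appears somewhere in that line; class_index is a hypothetical helper, not part of the downloaded code.

```python
def class_index(label: str, class_file: str = "imagenet1k_classes.txt") -> int:
    """Return the 0-based class index of the first line containing `label`.

    Assumes one class per line; the match is case-insensitive.
    """
    with open(class_file, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            if label.lower() in line.lower():
                # Line numbers start at 1, class indices start at 0
                return line_number - 1
    raise ValueError(f"label {label!r} not found in {class_file}")
```

With this in place, the assignment in main could be written as target_category = class_index("pug"). Because the match is a substring search, a short label can hit an earlier, unrelated line, so it is worth checking the returned index against the file.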

Setting target_category to 281 (the line number of tabby cat minus 1) gives:

[Image 3]

Now let's try the computer mouse (target_category = 673):

[Image 4]

Heatmap for the missile class (target_category = 657):

[Image 5]
