This part of the assignment asks us to compute saliency maps for images.
The topic doesn't seem to be covered in the 2017 edition of the CS231n lectures, so it takes some self-study.
There is an article here that explains saliency maps in some detail;
the screenshot below is excerpted from that article.
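In short, the idea (this is the standard formulation from Simonyan et al., "Deep Inside Convolutional Networks", and it is exactly what the code below implements): for an image $X$ whose correct class has score $s_y$, the saliency at pixel $(i, j)$ is the maximum over the three color channels of the absolute gradient of that score with respect to the pixel:

$$\text{saliency}_{ij} = \max_{c \in \{R, G, B\}} \left| \frac{\partial s_y}{\partial X_{c,i,j}} \right|$$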
def compute_saliency_maps(X, y, model):
    """
    Compute a class saliency map using the model for images X and labels y.

    Input:
    - X: Input images; Tensor of shape (N, 3, H, W)
    - y: Labels for X; LongTensor of shape (N,)
    - model: A pretrained CNN that will be used to compute the saliency map.

    Returns:
    - saliency: A Tensor of shape (N, H, W) giving the saliency maps for the input
      images.
    """
    # Make sure the model is in "test" mode
    model.eval()
    # Make input tensor require gradient
    X.requires_grad_()
    saliency = None
    ##############################################################################
    # TODO: Implement this function. Perform a forward and backward pass through #
    # the model to compute the gradient of the correct class score with respect  #
    # to each input image. You first want to compute the loss over the correct   #
    # scores (we'll combine losses across a batch by summing), and then compute  #
    # the gradients with a backward pass.                                        #
    ##############################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    # Forward pass: class scores for every image in the batch
    scores = model(X)
    # Pick out the score of the correct class for each image
    correct_scores = scores[range(len(y)), y]
    # Combine the correct-class scores across the batch by summing (as stated above)
    loss = correct_scores.sum()
    # Backward pass: gradients of the summed scores w.r.t. the input pixels
    loss.backward()
    # Saliency = channel-wise maximum of the absolute value of the gradient
    saliency, _ = torch.max(torch.abs(X.grad.data), dim=1)
    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    ##############################################################################
    #                             END OF YOUR CODE                               #
    ##############################################################################
    return saliency
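A minimal usage sketch, assuming a pretrained SqueezeNet from torchvision (the random X and y here are placeholders for the notebook's preprocessed images and labels; depending on your torchvision version you may need the newer weights= API instead of pretrained=True):

import torch
import torchvision.models as models

model = models.squeezenet1_1(pretrained=True)
for param in model.parameters():
    param.requires_grad = False  # freeze weights; we only need gradients w.r.t. the input

X = torch.randn(5, 3, 224, 224)    # stand-in for preprocessed input images
y = torch.randint(0, 1000, (5,))   # stand-in labels
saliency = compute_saliency_maps(X, y, model)
print(saliency.shape)              # torch.Size([5, 224, 224])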
Next comes the fooling image: we are given an image X, and we want to add a small amount of noise to it so that the result still looks like the original image, but the model ends up classifying it as target_y.
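Concretely (this is just the update rule restated from the hint inside the function below), we do gradient ascent on the target-class score with an L2-normalized step,

$$X \leftarrow X + \eta \cdot \frac{g}{\|g\|_2}, \qquad g = \nabla_X\, s_{\mathrm{target\_y}}(X),$$

and stop as soon as the model's argmax prediction becomes target_y.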
def make_fooling_image(X, target_y, model):
    """
    Generate a fooling image that is close to X, but that the model classifies
    as target_y.

    Inputs:
    - X: Input image; Tensor of shape (1, 3, 224, 224)
    - target_y: An integer in the range [0, 1000)
    - model: A pretrained CNN

    Returns:
    - X_fooling: An image that is close to X, but that is classified as target_y
      by the model.
    """
    # Initialize our fooling image to the input image, and make it require gradient
    X_fooling = X.clone()
    X_fooling = X_fooling.requires_grad_()

    learning_rate = 1
    ##############################################################################
    # TODO: Generate a fooling image X_fooling that the model will classify as   #
    # the class target_y. You should perform gradient ascent on the score of the #
    # target class, stopping when the model is fooled.                           #
    # When computing an update step, first normalize the gradient:               #
    #   dX = learning_rate * g / ||g||_2                                         #
    #                                                                            #
    # You should write a training loop.                                          #
    #                                                                            #
    # HINT: For most examples, you should be able to generate a fooling image    #
    # in fewer than 100 iterations of gradient ascent.                           #
    # You can print your progress over iterations to check your algorithm.       #
    ##############################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    for i in range(100):
        # Forward pass: compute class scores for the current image
        scores = model(X_fooling)
        # The model's current prediction is the class with the highest score
        _, pred_y = scores.max(1)
        # If the prediction matches the target class, the model is fooled; stop
        if pred_y == target_y:
            break
        # Quantity to ascend: the score of the target class
        loss = scores[0, target_y]
        # Backward pass: gradient of the target score w.r.t. the image
        loss.backward()
        # Gradient ascent step with an L2-normalized gradient
        X_fooling.data += learning_rate * X_fooling.grad.data / torch.norm(X_fooling.grad.data)
        # Zero the gradient; otherwise gradients would accumulate across iterations
        X_fooling.grad.data.zero_()
    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    ##############################################################################
    #                             END OF YOUR CODE                               #
    ##############################################################################
    return X_fooling
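A hedged sketch of calling it (again assuming a torchvision SqueezeNet; the random input is a placeholder for a real preprocessed ImageNet image, for which fooling typically succeeds well within the 100-iteration budget):

import torch
import torchvision.models as models

model = models.squeezenet1_1(pretrained=True)
model.eval()
for param in model.parameters():
    param.requires_grad = False  # only the image is updated, never the weights

X = torch.randn(1, 3, 224, 224)   # stand-in for a preprocessed image
target_y = 6                      # arbitrary target class index
X_fooling = make_fooling_image(X, target_y, model)
print(model(X_fooling).argmax(dim=1).item() == target_y)  # True once the loop breaks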
This function is called by the preceding function create_class_visualization; it's worth reading that caller to deepen your understanding.
What this function does:
use the model to compute the gradient of the score for class target_y with respect to the image pixels, and take one gradient step on the image using the learning rate.
There isn't much more to say; the prompt and the code comments cover it.
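For reference, the update the code performs can be written as

$$I \leftarrow I + \eta \cdot \frac{g}{\|g\|_2}, \qquad g = \nabla_I \left( s_y(I) - \lambda \|I\|_2^2 \right),$$

where $\eta$ is learning_rate, $\lambda$ is l2_reg, and $s_y$ is the score of class target_y; as in the fooling-image step, the gradient is normalized by its L2 norm.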
def class_visualization_update_step(img, model, target_y, l2_reg, learning_rate):
    ########################################################################
    # TODO: Use the model to compute the gradient of the score for the     #
    # class target_y with respect to the pixels of the image, and make a   #
    # gradient step on the image using the learning rate. Don't forget the #
    # L2 regularization term!                                              #
    # Be very careful about the signs of elements in your code.            #
    ########################################################################
    # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    # Forward pass: class scores for the current image
    scores = model(img)
    # Objective to ascend: target-class score minus the L2 regularization term
    loss = scores[0, target_y] - l2_reg * torch.sum(img ** 2)
    # Backward pass: gradient of the objective w.r.t. the image pixels
    loss.backward()
    # Gradient ascent step with an L2-normalized gradient
    img.data += learning_rate * img.grad.data / torch.norm(img.grad.data)
    # Zero the gradient; otherwise gradients would accumulate across steps
    img.grad.data.zero_()
    # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
    ########################################################################
    #                          END OF YOUR CODE                            #
    ########################################################################
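For context, here is a simplified stand-in for the surrounding loop in create_class_visualization (the notebook's real version additionally applies random jitter, pixel clamping, and periodic blurring; the hyperparameters below are illustrative placeholders):

import torch
import torchvision.models as models

model = models.squeezenet1_1(pretrained=True)
model.eval()
for param in model.parameters():
    param.requires_grad = False

# Start from small random noise and repeatedly ascend the target-class score
img = (0.01 * torch.randn(1, 3, 224, 224)).requires_grad_()
for t in range(200):
    class_visualization_update_step(img, model, target_y=76,
                                    l2_reg=1e-3, learning_rate=25)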
My overall impression of this assignment: we treat the input image as the thing being learned, and use a model whose parameters are already trained to generate the image. It flips the usual roles of the input and the trainable parameters, and it feels a bit like image generation. Just my personal take.