cs231n assignment (1.3): Softmax classifier


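The goals of the Softmax classifier exercise are:

  • Implement a fully vectorized loss function for the Softmax classifier
  • Implement the fully vectorized analytic gradient of that loss
  • Check the implementation against a numerical gradient
  • Use a validation set to tune the learning rate and regularization strength
  • Optimize the loss function with SGD
  • Visualize the final learned weight matrix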

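1. Data preparation and preprocessing

The data preparation and preprocessing here are the same as in the previous post on the SVM classifier, so I skip the steps and only show the result: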

Train data shape:  (49000, 3073)
Train labels shape:  (49000,)
Validation data shape:  (1000, 3073)
Validation labels shape:  (1000,)
Test data shape:  (1000, 3073)
Test labels shape:  (1000,)
dev data shape:  (500, 3073)
dev labels shape:  (500,)

2. Softmax classifier

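The Softmax function is

$$P(y_i \mid x_i, W) = \frac{e^{f_{y_i}}}{\sum_j e^{f_j}}$$

so the loss for a single example is:
$$L_i = -\log\left(\frac{e^{f_{y_i}}}{\sum_j e^{f_j}}\right)$$

The overall loss function has the same form as last time:
$$L = \frac{1}{N}\sum_{i=1}^{N} L_i + R(W)$$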
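The gradient code later in this post relies on the derivative of the per-example loss with respect to the scores. A short derivation sketch follows. The symbol p_j (the softmax probability of class j) and the indicator notation are introduced here for convenience and are not in the original post; f = x_i W is the score vector for example i.

```latex
p_j = \frac{e^{f_j}}{\sum_k e^{f_k}}, \qquad
L_i = -\log p_{y_i} = -f_{y_i} + \log \sum_k e^{f_k}

\frac{\partial L_i}{\partial f_j}
  = -\mathbb{1}[j = y_i] + \frac{e^{f_j}}{\sum_k e^{f_k}}
  = p_j - \mathbb{1}[j = y_i]

\frac{\partial L_i}{\partial W_{:,j}}
  = \left(p_j - \mathbb{1}[j = y_i]\right) x_i^{\top}
```

Averaging over the N examples and adding the derivative of R(W) gives the full gradient. With R(W) = (reg / 2) * sum(W * W), as used in the code, that extra term is reg * W.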

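As in the previous post, the assignment asks for two implementations of the softmax classifier: one with loops and one fully vectorized, with no loops.
A few points to note:
1. Remember to subtract each row's maximum from the scores (f = XW). The shift leaves the softmax probabilities unchanged and keeps exp from overflowing.
2. Vectorized code essentially collapses the multiply-and-accumulate steps. So once the naive code is written, any place where a loop multiplies and sums can be rewritten as a vectorized expression.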
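To make point 1 concrete, here is a minimal sketch of why the shift matters. The function names `softmax_unstable` and `softmax_stable` and the toy scores are made up for illustration; they are not part of the assignment code.

```python
import numpy as np

def softmax_unstable(f):
    # Direct formula: np.exp overflows to inf once scores get large.
    e = np.exp(f)
    return e / np.sum(e, axis=1, keepdims=True)

def softmax_stable(f):
    # Shift each row so its maximum is 0 before exponentiating.
    shifted = f - np.max(f, axis=1, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=1, keepdims=True)

small = np.array([[1.0, 2.0, 3.0]])
large = small + 1000.0  # same gaps between scores, much larger magnitude

# On small scores both versions agree.
print(np.allclose(softmax_unstable(small), softmax_stable(small)))

# The stable version gives the same probabilities for the shifted scores.
print(np.allclose(softmax_stable(large), softmax_stable(small)))

# The direct formula overflows on the large scores (inf / inf gives nan).
with np.errstate(over="ignore", invalid="ignore"):
    print(np.isnan(softmax_unstable(large)).any())
```

Adding the same constant to every score in a row multiplies numerator and denominator by the same factor, which is why the probabilities do not change.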
My own code for both versions follows.
Naive version:

import numpy as np

def softmax_loss_naive(W, X, y, reg):
  """
  Softmax loss function, naive implementation (with loops)

  Inputs have dimension D, there are C classes, and we operate on minibatches
  of N examples.

  Inputs:
  - W: A numpy array of shape (D, C) containing weights.
  - X: A numpy array of shape (N, D) containing a minibatch of data.
  - y: A numpy array of shape (N,) containing training labels; y[i] = c means
    that X[i] has label c, where 0 <= c < C.
  - reg: (float) regularization strength

  Returns a tuple of:
  - loss as single float
  - gradient with respect to weights W; an array of same shape as W
  """
  # Initialize the loss and gradient to zero.
  loss = 0.0
  dW = np.zeros_like(W)

  num_train = X.shape[0]
  num_classes = W.shape[1]
  f = X.dot(W)
  # Subtract each row's maximum for numerical stability.
  f_max = np.max(f, axis=1, keepdims=True)
  f_scores = f - f_max
  prob = np.exp(f_scores) / np.sum(np.exp(f_scores), axis=1, keepdims=True)

  # Loss: average negative log-probability of the correct class, plus L2 penalty.
  for i in range(num_train):
    loss += -np.log(prob[i, y[i]])
  loss /= num_train
  loss += 0.5 * reg * np.sum(W * W)

  # dW: each example adds (prob - 1) * x to the correct-class column
  # and prob * x to every other column.
  for i in range(num_train):
    for j in range(num_classes):
      if j == y[i]:
        dW[:, j] += (prob[i, j] - 1) * X[i, :]
      else:
        dW[:, j] += prob[i, j] * X[i, :]
  dW /= num_train
  dW += reg * W

  return loss, dW

Vectorized version:

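def softmax_loss_vectorized(W, X, y, reg):
  """
  Softmax loss function, vectorized version.

  Inputs and outputs are the same as softmax_loss_naive.
  """
  loss = 0.0
  dW = np.zeros_like(W)

  num_train = X.shape[0]
  f = X.dot(W)
  # Subtract each row's maximum for numerical stability.
  f_max = np.max(f, axis=1, keepdims=True)
  f_scores = f - f_max
  prob = np.exp(f_scores) / np.sum(np.exp(f_scores), axis=1, keepdims=True)

  # Loss: pick out the correct-class probability of every row at once.
  loss = np.sum(-np.log(prob[np.arange(num_train), y]))
  loss /= num_train
  loss += 0.5 * reg * np.sum(W * W)

  # dW: subtract 1 at the correct class, then one matrix product replaces both loops.
  prob[np.arange(num_train), y] -= 1
  dW = X.T.dot(prob)
  dW = dW / num_train + reg * W

  return loss, dW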
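
The exercise also asks for a numerical check of the analytic gradient. Below is a minimal sketch of such a check using centered differences at a few random entries of W. It runs on a tiny synthetic problem rather than CIFAR-10, and the names `softmax_loss` and `numerical_grad_at` are hypothetical helpers written for this example, not the course's own utilities.

```python
import numpy as np

def softmax_loss(W, X, y, reg):
    # Compact restatement of the vectorized loss, repeated so this runs on its own.
    N = X.shape[0]
    f = X.dot(W)
    f -= np.max(f, axis=1, keepdims=True)
    prob = np.exp(f) / np.sum(np.exp(f), axis=1, keepdims=True)
    loss = -np.log(prob[np.arange(N), y]).mean() + 0.5 * reg * np.sum(W * W)
    dscores = prob.copy()
    dscores[np.arange(N), y] -= 1
    dW = X.T.dot(dscores) / N + reg * W
    return loss, dW

def numerical_grad_at(loss_fn, W, idx, h=1e-5):
    # Centered difference (L(w + h) - L(w - h)) / (2h) for one entry of W.
    old = W[idx]
    W[idx] = old + h
    loss_plus = loss_fn(W)
    W[idx] = old - h
    loss_minus = loss_fn(W)
    W[idx] = old  # restore the original value
    return (loss_plus - loss_minus) / (2 * h)

rng = np.random.default_rng(0)
N, D, C = 20, 8, 5  # tiny synthetic sizes, chosen only for the demo
X = rng.standard_normal((N, D))
y = rng.integers(0, C, size=N)
W = 0.01 * rng.standard_normal((D, C))
reg = 0.1

loss, dW = softmax_loss(W, X, y, reg)
max_rel_err = 0.0
for _ in range(10):
    idx = (rng.integers(D), rng.integers(C))
    num = numerical_grad_at(lambda w: softmax_loss(w, X, y, reg)[0], W, idx)
    rel_err = abs(num - dW[idx]) / (abs(num) + abs(dW[idx]) + 1e-12)
    max_rel_err = max(max_rel_err, rel_err)
    print("numerical: %f analytic: %f relative error: %e" % (num, dW[idx], rel_err))
```

A second quick sanity check: with small random weights all classes are roughly equally likely, so the initial loss should be close to log(C).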

3. Tuning hyperparameters and visualizing the final result
I reached 35.2% accuracy on the validation set and 32.9% on the test set.
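The tuning loop itself is not shown in this post. As a rough sketch of the idea, the code below trains with minibatch SGD for each (learning rate, regularization) pair and keeps the weights with the best validation accuracy. Everything here is an assumption made for illustration: the data are synthetic Gaussian blobs rather than CIFAR-10, the grid values are not the ones I used, and `train_sgd`, `accuracy`, and `make_split` are hypothetical helpers.

```python
import numpy as np

def softmax_loss(W, X, y, reg):
    # Vectorized softmax loss and gradient, repeated so this runs on its own.
    N = X.shape[0]
    f = X.dot(W)
    f -= np.max(f, axis=1, keepdims=True)
    prob = np.exp(f) / np.sum(np.exp(f), axis=1, keepdims=True)
    loss = -np.log(prob[np.arange(N), y]).mean() + 0.5 * reg * np.sum(W * W)
    prob[np.arange(N), y] -= 1
    return loss, X.T.dot(prob) / N + reg * W

def train_sgd(X, y, num_classes, lr, reg, num_iters=300, batch_size=64, seed=0):
    # Plain minibatch SGD: sample a batch, take one gradient step.
    rng = np.random.default_rng(seed)
    W = 0.001 * rng.standard_normal((X.shape[1], num_classes))
    for _ in range(num_iters):
        batch = rng.choice(X.shape[0], batch_size, replace=True)
        _, dW = softmax_loss(W, X[batch], y[batch], reg)
        W -= lr * dW
    return W

def accuracy(W, X, y):
    return np.mean(np.argmax(X.dot(W), axis=1) == y)

# Synthetic stand-in for the real train/validation splits: three Gaussian blobs.
rng = np.random.default_rng(1)
C, D = 3, 10
centers = 3.0 * rng.standard_normal((C, D))

def make_split(n):
    labels = rng.integers(0, C, size=n)
    feats = centers[labels] + rng.standard_normal((n, D))
    # Append a column of ones so the bias is folded into W.
    return np.hstack([feats, np.ones((n, 1))]), labels

X_train, y_train = make_split(600)
X_val, y_val = make_split(200)

results = {}
best_val, best_W = -1.0, None
for lr in [1e-3, 1e-2, 1e-1]:      # illustrative grid only
    for reg in [1e-3, 1e-1]:       # illustrative grid only
        W = train_sgd(X_train, y_train, C, lr, reg)
        val_acc = accuracy(W, X_val, y_val)
        results[(lr, reg)] = val_acc
        if val_acc > best_val:
            best_val, best_W = val_acc, W

for (lr, reg), acc in sorted(results.items()):
    print("lr %.0e reg %.0e val accuracy: %.3f" % (lr, reg, acc))
print("best validation accuracy: %.3f" % best_val)
```

The test set is deliberately left out of the loop: it should be evaluated once, with the weights selected on the validation set.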
Next is the image of the weight matrix:
[Figure 1: visualization of the learned weight matrix]

It looks very similar to the SVM result.
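
For reference, a picture like this can be produced by turning each column of W back into a 32x32x3 image. The sketch below assumes the bias was appended as the last of the 3073 input dimensions and that pixels were flattened in (height, width, channel) order. `weights_to_images` is a hypothetical helper, and random weights stand in for the trained matrix.

```python
import numpy as np

def weights_to_images(W, height=32, width=32, channels=3):
    # Drop the last row (assumed to be the bias), then view each column
    # as one image per class.
    w = W[:-1, :]
    num_classes = w.shape[1]
    w = w.reshape(height, width, channels, num_classes)
    w_min, w_max = np.min(w), np.max(w)
    # Rescale all weights jointly to the 0..255 range for display.
    scaled = 255.0 * ((w - w_min) / (w_max - w_min))
    # Move the class axis first: result has shape (classes, height, width, channels).
    return np.transpose(scaled, (3, 0, 1, 2)).astype(np.uint8)

# Random weights stand in for the trained matrix here.
rng = np.random.default_rng(0)
W = 0.001 * rng.standard_normal((3073, 10))
images = weights_to_images(W)
print(images.shape, images.dtype, images.min(), images.max())
```

Each `images[i]` can then be passed to an image display call such as matplotlib's `plt.imshow` to get one template per class.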
