Softmax regression

Loss function and gradient

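Suppose there are $m$ samples in total, divided into $k$ classes, where the $i$-th sample is $(a^{(i)}, b^{(i)})$ with feature vector $a^{(i)}$ and class label $b^{(i)} \in \{1, \dots, k\}$. The loss function is then

\begin{align} F(x) = - \frac{1}{m} \sum_{i=1}^{m}\sum_{l=1}^{k} I(b^{(i)}= l) \ln \Big(\frac{\exp(x_l a^{(i)})}{\sum_{j=1}^{k} \exp(x_j a^{(i)})}\Big) \end{align}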

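where $I(\cdot)$ is the indicator function, equal to 1 when $b^{(i)} = l$ and 0 otherwise. The gradient with respect to the $l$-th component $x_l$ of $x$ is:

\begin{align} \frac{\partial F(x)}{\partial x_l} = - \frac{1}{m} \sum_{i=1}^m \Big[\Big(I(b^{(i)} = l) - \frac{\exp(x_l a^{(i)})}{\sum_{j=1}^{k} \exp(x_j a^{(i)})}\Big) (a ^{(i)})^T\Big] \end{align}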
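The loss and its gradient can be sketched in NumPy as follows. This is only a minimal illustration under some assumptions that are not in the original: the rows $x_l$ are stacked into a `(k, n)` matrix `X`, the samples $a^{(i)}$ are stacked as rows of an `(m, n)` matrix `A`, labels are 0-based (`0..k-1` instead of `1..k`), and the function name `softmax_loss_and_grad` is hypothetical. The maximum score is subtracted before exponentiating, which leaves the softmax ratio unchanged but avoids overflow.

```python
import numpy as np

def softmax_loss_and_grad(X, A, b):
    """Return F(x) and the stacked gradients dF/dx_l.

    X: (k, n) array, row l is x_l (assumed layout).
    A: (m, n) array, row i is a^(i) (assumed layout).
    b: (m,) integer labels in 0..k-1 (0-based, unlike the 1..k in the text).
    """
    m = A.shape[0]
    scores = A @ X.T                                 # scores[i, l] = x_l a^(i)
    scores = scores - scores.max(axis=1, keepdims=True)  # stability shift
    exp_scores = np.exp(scores)
    probs = exp_scores / exp_scores.sum(axis=1, keepdims=True)

    # Only the term with l = b^(i) survives the indicator in the loss.
    loss = -np.log(probs[np.arange(m), b]).mean()

    # indicator[i, l] = I(b^(i) = l)
    indicator = np.zeros_like(probs)
    indicator[np.arange(m), b] = 1.0

    # Row l of grad is -1/m * sum_i (I(b^(i) = l) - p_il) (a^(i))^T
    grad = -(indicator - probs).T @ A / m
    return loss, grad
```

As a sanity check, with `X` set to all zeros every class receives probability $1/k$, so the loss equals $\ln k$; the rows of the gradient also always sum to zero, because the indicators and the probabilities each sum to 1 over $l$.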

Derivation of the gradient formula

\begin{align} F(x) &= - \frac{1}{m} \sum_{i=1}^{m}\sum_{l=1}^{k}\Big(I(b^{(i)}= l) \ln \Big(\frac{\exp(x_l a^{(i)})}{\sum_{j=1}^{k} \exp(x_j a^{(i)})}\Big)\Big) \\ &= - \frac{1}{m} \sum_{i=1}^{m}\sum_{l=1}^{k}\Big(I(b^{(i)}= l) \Big(x_la^{(i)} - \ln \sum_{j=1}^k\exp(x_ja^{(i)})\Big)\Big) \\ &= -\frac{1}{m} \sum_{i=1}^{m} \Big(I(b^{(i)}= 1) \Big(x_1a^{(i)} - \ln \sum_{j=1}^k\exp(x_ja^{(i)})\Big) + \cdots + I(b^{(i)}= k) \Big(x_ka^{(i)} - \ln \sum_{j=1}^k\exp(x_ja^{(i)})\Big)\Big) \end{align}

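  1. If we now want $\frac{\partial F(x)}{\partial x_1}$, we pull out the terms that depend on $x_1$ (terms such as $x_k a^{(i)}$ with $k \neq 1$ have zero derivative and are dropped):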
    \begin{align} \frac{\partial F(x)}{\partial x_1} &= \frac{\partial}{\partial x_1}\Big(-\frac{1}{m} \sum_{i=1}^{m} \Big(I(b^{(i)}= 1) \Big(x_1a^{(i)} - \ln \sum_{j=1}^k\exp(x_ja^{(i)})\Big) + \cdots + I(b^{(i)}= k) \Big( - \ln \sum_{j=1}^k\exp(x_ja^{(i)})\Big)\Big)\Big) \\ &= \frac{\partial}{\partial x_1}\Big(-\frac{1}{m}\sum_{i=1}^{m} \Big(I(b^{(i)} = 1) x_1a^{(i)} - (I(b^{(i)} = 1) + \cdots + I(b^{(i)} = k))\ln \sum_{j=1}^k\exp(x_ja^{(i)})\Big) \Big) \\ &= \frac{\partial}{\partial x_1}\Big(-\frac{1}{m}\sum_{i=1}^{m}\Big(I(b^{(i)} = 1) x_1a^{(i)} - \ln \sum_{j=1}^k\exp(x_ja^{(i)})\Big) \Big) \\ &= - \frac{1}{m} \sum_{i=1}^m \Big[\Big(I(b^{(i)} = 1) - \frac{\exp(x_1 a^{(i)})}{\sum_{j=1}^{k} \exp(x_j a^{(i)})}\Big) (a ^{(i)})^T\Big] \end{align}
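    Here the second-to-last equality holds because each sample belongs to exactly one class, so $\sum_{l=1}^{k} I(b^{(i)} = l) = 1$. The last equality uses $\frac{\partial}{\partial x_1} \ln \sum_{j=1}^{k} \exp(x_j a^{(i)}) = \frac{\exp(x_1 a^{(i)})}{\sum_{j=1}^{k} \exp(x_j a^{(i)})} (a^{(i)})^T$. The same argument with $x_l$ in place of $x_1$ gives the gradient for any $l$.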
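One way to gain confidence in the derived formula is to compare it against a central finite-difference approximation of the loss. The sketch below does this on random data; the function names (`loss`, `analytic_grad`, `numeric_grad`), the matrix layouts (`X` of shape `(k, n)` with rows $x_l$, `A` of shape `(m, n)` with rows $a^{(i)}$), the 0-based labels, and the step size `eps` are all illustrative assumptions rather than part of the original derivation.

```python
import numpy as np

def loss(X, A, b):
    """F(x) for rows x_l of X, samples a^(i) as rows of A, 0-based labels b."""
    scores = A @ X.T                                  # scores[i, l] = x_l a^(i)
    shift = scores.max(axis=1, keepdims=True)         # log-sum-exp stability shift
    log_norm = shift[:, 0] + np.log(np.exp(scores - shift).sum(axis=1))
    return -(scores[np.arange(len(b)), b] - log_norm).mean()

def analytic_grad(X, A, b):
    """Gradient from the derivation: row l is dF/dx_l."""
    scores = A @ X.T
    scores = scores - scores.max(axis=1, keepdims=True)
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)
    indicator = np.eye(X.shape[0])[b]                 # indicator[i, l] = I(b^(i) = l)
    return -(indicator - probs).T @ A / len(b)

def numeric_grad(X, A, b, eps=1e-6):
    """Central finite differences, one entry of X at a time."""
    grad = np.zeros_like(X)
    for idx in np.ndindex(*X.shape):
        step = np.zeros_like(X)
        step[idx] = eps
        grad[idx] = (loss(X + step, A, b) - loss(X - step, A, b)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
k, n, m = 3, 4, 20                                    # arbitrary small sizes
X = rng.normal(size=(k, n))
A = rng.normal(size=(m, n))
b = rng.integers(0, k, size=m)

max_diff = np.abs(analytic_grad(X, A, b) - numeric_grad(X, A, b)).max()
print(max_diff)  # should be tiny if the formula matches the loss
```

If the analytic and numeric gradients disagree by more than a small tolerance, the usual suspects are an index slip (for example using $x_l$ where $x_1$ is meant) or a sign error in the leading $-\frac{1}{m}$.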
