It's been a while since my last post... After frantically catching up on the week 9 material I had fallen behind on, I'm back to writing weekly summaries. The lectures and assignment this week weren't hard; it's just that I still find the concept of collaborative filtering rather confusing. Without further ado, there are two points worth noting in this week's assignment:
Collaborative filtering is actually very similar to linear regression, except that it learns the features $x^{(1)}, \ldots, x^{(n_m)}$ and the parameters $\theta^{(1)}, \ldots, \theta^{(n_u)}$ simultaneously; also, for convenience, the coefficient is simplified from the original $\frac{1}{2m}$ to $\frac{1}{2}$.
The cost function takes the form:
$$J(\theta^{(1)}, \ldots, \theta^{(n_u)}, x^{(1)}, \ldots, x^{(n_m)}) = \frac{1}{2}\sum_{i,j:\,r(i,j)=1}{\left(X_{n_m\times n}(\Theta_{n_u\times n})^T - Y_{n_m\times n_u}\right)^2} + \frac{\lambda}{2}\sum_i{(x^{(i)})^2} + \frac{\lambda}{2}\sum_j{(\theta^{(j)})^2}$$
Taking the derivative with respect to each parameter gives:
$$\frac{\partial J}{\partial \theta^{(j)}_k} = \sum_{i:\,r(i,j)=1}{\left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)x^{(i)}_k} + \lambda\theta_k^{(j)}$$
$$\frac{\partial J}{\partial x^{(i)}_k} = \sum_{j:\,r(i,j)=1}{\left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right)\theta^{(j)}_k} + \lambda x_k^{(i)}$$
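Stacking these entrywise gradients into matrices (my own restatement of the two formulas above, with $\circ$ denoting the element-wise product and $R$ the indicator matrix of rated entries) gives the vectorized form that the code below implements:

$$\nabla_X J = \big((X\Theta^T - Y)\circ R\big)\Theta + \lambda X, \qquad \nabla_\Theta J = \big((X\Theta^T - Y)\circ R\big)^T X + \lambda\Theta$$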
The Matlab implementation is as follows:
```matlab
prediction = X * Theta';                   % n_m x n_u matrix of predicted ratings
err = (prediction - Y) .* R;               % keep only the rated entries, r(i,j) = 1
J = 1/2 * sum(sum(err.^2)) ...
    + lambda/2 * sum(X(:).^2) ...
    + lambda/2 * sum(Theta(:).^2);
Theta_grad = err' * X + lambda * Theta;    % n_u x n
X_grad = err * Theta + lambda * X;         % n_m x n
```
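As a quick sanity check (my own sketch, not part of the assignment, which ships its own checkCostFunction), one entry of the analytic gradient can be compared against a finite-difference estimate on small random data:

```matlab
% Hypothetical sizes: 5 movies, 4 users, 3 features.
X = randn(5, 3); Theta = randn(4, 3);
Y = randn(5, 4); R = double(rand(5, 4) > 0.5);
lambda = 1.5;
cost = @(X) 1/2 * sum(sum(((X * Theta' - Y) .* R).^2)) ...
          + lambda/2 * sum(X(:).^2) + lambda/2 * sum(Theta(:).^2);
X_grad = ((X * Theta' - Y) .* R) * Theta + lambda * X;  % analytic gradient
e = 1e-4;
Xp = X; Xp(2, 1) = Xp(2, 1) + e;   % perturb a single entry of X
Xm = X; Xm(2, 1) = Xm(2, 1) - e;
numgrad = (cost(Xp) - cost(Xm)) / (2 * e);
fprintf('analytic: %f, numeric: %f\n', X_grad(2, 1), numgrad);
```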
When estimating the parameters of the Gaussian distribution in ex8.m, the variance computation comes up: the assignment requires the coefficient in front of the variance to be $\frac{1}{m}$, while var(x) defaults to $\frac{1}{m-1}$.
```matlab
var(x)    % or var(x, 0): the default, normalizes by 1/(m-1)
var(x, 1) % normalizes by 1/m
```
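A minimal sketch of how this is used when estimating per-feature Gaussian parameters (variable names are my own, not necessarily those in the exercise's estimateGaussian.m):

```matlab
X = [1 5; 3 7; 5 9];     % m = 3 examples, 2 features
mu = mean(X);            % per-feature mean
sigma2 = var(X, 1);      % 1/m normalization, as the assignment requires
% Equivalent explicit form (bsxfun keeps it compatible with older Matlab):
sigma2_manual = mean(bsxfun(@minus, X, mu).^2);
```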