Andrew Ng Machine Learning Notes (5): Regularization

Regularization

The problem of overfitting

  • Underfitting –> high bias
  • Overfitting –> high variance
  • Overfitting: if we have too many features, the learned hypothesis
    may fit the training set very well, but fail to generalize to new examples (e.g. predicting prices for examples it has never seen); in other words, the model generalizes poorly.
    Addressing overfitting
    Options:
    1) Reduce the number of features
    – Manually select which features to keep
    – Use a model selection algorithm
    2) Regularization
    – Keep all the features, but reduce the magnitude/values of the parameters.
    – Works well when we have a lot of features, each of which contributes a bit to predicting y.

Cost function (with regularization)

Penalizing two of the parameter values for being large forces those parameters close to zero, which yields a simpler, smoother hypothesis.
We therefore add a penalty term on the parameters $\theta_j$. In regularized linear regression, we choose $\theta$ to minimize the regularized cost function:
$$J(\theta)=\frac{1}{2m}\left[\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)^2+\lambda\sum_{j=1}^{n}\theta_j^2\right]$$
Goal: $\underset{\theta}{\min}\,J(\theta)$
$\lambda$: the regularization parameter.
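
As a minimal NumPy sketch (the course itself uses Octave, so this is only illustrative), the regularized cost can be computed as follows. It assumes `X` is an (m, n+1) design matrix whose first column is the bias feature $x_0=1$, and that $\theta_0$ is left out of the penalty by convention; the name `regularized_cost` is made up for this note.

```python
import numpy as np

def regularized_cost(theta, X, y, lam):
    """Regularized linear regression cost J(theta) as defined above."""
    m = len(y)
    residuals = X @ theta - y               # h_theta(x^(i)) - y^(i) for every example
    squared_error = residuals @ residuals   # sum of squared errors
    penalty = lam * np.sum(theta[1:] ** 2)  # theta_0 (bias) is not penalized
    return (squared_error + penalty) / (2 * m)
```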

  • What happens if λ is set to a very large value? All of the parameters $\theta_1,\dots,\theta_n$ are penalized heavily toward zero, the hypothesis reduces to roughly $h_\theta(x)\approx\theta_0$, and the model underfits the data.


Regularized linear regression

Gradient descent
The gradient descent algorithm becomes:
Repeat:
$$\theta_0 := \theta_0-\alpha\frac{1}{m}\sum_{i=1}^m\left(h_\theta(x^{(i)})-y^{(i)}\right)x_0^{(i)}$$
$$\theta_j := \theta_j-\alpha\frac{1}{m}\left[\sum_{i=1}^m\left(h_\theta(x^{(i)})-y^{(i)}\right)x_j^{(i)}+\lambda\theta_j\right]\qquad(j=1,2,\dots,n)$$
This is equivalent to:
$$\theta_0 := \theta_0-\alpha\frac{1}{m}\sum_{i=1}^m\left(h_\theta(x^{(i)})-y^{(i)}\right)x_0^{(i)}$$
$$\theta_j := \theta_j\left(1-\alpha\frac{\lambda}{m}\right)-\alpha\frac{1}{m}\sum_{i=1}^m\left(h_\theta(x^{(i)})-y^{(i)}\right)x_j^{(i)}$$
The factor $1-\alpha\frac{\lambda}{m}$ is slightly less than 1, so each iteration first shrinks $\theta_j$ a little and then applies the usual (unregularized) gradient step.
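
A sketch of one such update under the same assumptions (NumPy arrays, design matrix with a leading column of ones); `gradient_descent_step` is an illustrative name, not the course's Octave code. Calling it repeatedly implements the "repeat" block above.

```python
import numpy as np

def gradient_descent_step(theta, X, y, alpha, lam):
    """One regularized gradient-descent update, following the rules above."""
    m = len(y)
    grad = X.T @ (X @ theta - y) / m                # (1/m) * sum_i (h - y) * x_j for every j
    new_theta = theta - alpha * grad                # ordinary (unregularized) gradient step
    new_theta[1:] -= alpha * (lam / m) * theta[1:]  # subtract (alpha*lam/m)*theta_j, i.e. the
                                                    # shrinkage factor (1 - alpha*lam/m) for j >= 1
    return new_theta
```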

Normal equation
Normal equation:
Suppose $m\leq n$ (the number of examples is at most the number of features). Then the unregularized normal equation
$$\theta=(X^TX)^{-1}X^Ty$$
runs into trouble, because $X^TX$ may be singular (non-invertible).
With regularization, if $\lambda>0$:
$$\theta=\left(X^TX+\lambda\underbrace{\begin{bmatrix}0&&&&\\&1&&&\\&&1&&\\&&&\ddots&\\&&&&1\end{bmatrix}}_{(n+1)\times(n+1)}\right)^{-1}X^Ty$$
As long as λ > 0, the matrix inside the parentheses is guaranteed to be non-singular, i.e. invertible.
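
A sketch of this closed-form solution under the same design-matrix assumption; it uses `np.linalg.solve` rather than forming the explicit inverse, which is the usual numerically safer choice, and the function name is again only illustrative.

```python
import numpy as np

def regularized_normal_equation(X, y, lam):
    """Closed-form theta = (X^T X + lam * L)^(-1) X^T y with L = diag(0, 1, ..., 1)."""
    L = np.eye(X.shape[1])   # (n+1) x (n+1) identity matrix ...
    L[0, 0] = 0.0            # ... with the top-left entry zeroed, so theta_0 is not penalized
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)
```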

Regularized logistic regression

Regularized logistic regression cost function:
$$J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\left(h_\theta(x^{(i)})\right)+(1-y^{(i)})\log\left(1-h_\theta(x^{(i)})\right)\right]+\frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$
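
A corresponding sketch for this cost, again assuming a NumPy design matrix with a leading bias column and labels y in {0, 1}; the helper names are made up for this note.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regularized_logistic_cost(theta, X, y, lam):
    """Regularized logistic regression cost, matching the formula above."""
    m = len(y)
    h = sigmoid(X @ theta)                            # h_theta(x^(i)) for every example
    cross_entropy = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
    penalty = lam / (2 * m) * np.sum(theta[1:] ** 2)  # theta_0 excluded from the penalty
    return cross_entropy + penalty
```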

