Machine Learning Quiz: Week3_1 Logistic Regression


Question 1

Suppose that you have trained a logistic regression classifier, and it outputs on a new example x a prediction hθ(x) = 0.4. This means (check all that apply):

  • Our estimate for P(y=0|x;θ) is 0.4.
  • Our estimate for P(y=1|x;θ) is 0.6.
  • Our estimate for P(y=0|x;θ) is 0.6.
  • Our estimate for P(y=1|x;θ) is 0.4.
    *     Answer: 3 4 *
        Explanation: hθ(x) is the estimated probability that y=1, so P(y=1|x;θ) = 0.4. The two probabilities sum to 1, so P(y=0|x;θ) = 1 − 0.4 = 0.6.
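The arithmetic behind this answer can be written as a tiny sketch. The helper name `class_probabilities` is made up for illustration; it only assumes that the classifier output is P(y=1|x;θ).

```python
def class_probabilities(h):
    """Given h = h_theta(x) = P(y=1 | x; theta), return (P(y=0), P(y=1))."""
    if not 0.0 <= h <= 1.0:
        raise ValueError("h must be a probability in [0, 1]")
    # The two class probabilities must sum to 1.
    return 1.0 - h, h

p0, p1 = class_probabilities(0.4)
print(p0, p1)  # P(y=0) is 0.6 and P(y=1) is 0.4
```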

Question 2

Suppose you have the following training set, and fit a logistic regression classifier hθ(x)=g(θ0+θ1x1+θ2x2)
Which of the following are true? Check all that apply.

  • Adding polynomial features (e.g., instead using $h_\theta(x)=g(\theta_0+\theta_1x_1+\theta_2x_2+\theta_3x_1^2+\theta_4x_1x_2+\theta_5x_2^2)$ ) could increase how well we can fit the training data.
  • At the optimal value of θ (e.g., found by fminunc), we will have J(θ)≥0.
  • Adding polynomial features (e.g., instead using $h_\theta(x)=g(\theta_0+\theta_1x_1+\theta_2x_2+\theta_3x_1^2+\theta_4x_1x_2+\theta_5x_2^2)$ ) would increase J(θ) because we are now summing over more terms.
  • If we train gradient descent for enough iterations, for some examples $x^{(i)}$ in the training set it is possible to obtain $h_\theta(x^{(i)})>1$ .
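    *     Answer: 1 2 *
    * With only the linear features x1 and x2 the decision boundary is a straight line; adding polynomial features lets it bend into a curve, and higher-order features allow an increasingly wiggly curve. *
    * As more features are added, the boundary can fit the training data more closely, i.e., the cost function gets smaller. *
    * Option 1: Adding features lets the model fit the training data better. Correct. **
    * Option 2: Every term of J(θ) is non-negative, so J(θ)≥0 at the optimal θ; it can come very close to 0 but is generally greater than 0. Correct. **
    * Option 3: The opposite of option 1. Extra features give the optimizer more freedom, so the minimized J(θ) does not increase; J(θ) averages over the m training examples, not over the features. Incorrect. **
    * Option 4: The sigmoid output always satisfies 0<hθ(x(i))<1, so it can never exceed 1. Incorrect. **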
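Options 1, 2 and 4 can be checked with a small experiment. The sketch below is not the quiz's training set (its figure is not reproduced here); it uses a made-up 1-D dataset that no straight line can separate, fits it by plain batch gradient descent with and without a squared feature, and compares the training cost. The learning rate, iteration count and all function names are assumptions chosen for illustration.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def predict(theta, x):
    """h_theta(x) for one feature vector x (x[0] is the intercept term 1)."""
    return sigmoid(sum(t * xj for t, xj in zip(theta, x)))

def cost(theta, X, y):
    """Average logistic loss J(theta) over the training set."""
    total = 0.0
    for xi, yi in zip(X, y):
        h = predict(theta, xi)
        total += -math.log(h) if yi == 1 else -math.log(1.0 - h)
    return total / len(X)

def fit(X, y, alpha=0.1, iters=2000):
    """Batch gradient descent starting from theta = 0."""
    theta = [0.0] * len(X[0])
    m = len(X)
    for _ in range(iters):
        h = [predict(theta, xi) for xi in X]
        grad = [sum((h[i] - y[i]) * X[i][j] for i in range(m)) / m
                for j in range(len(theta))]
        theta = [t - alpha * g for t, g in zip(theta, grad)]
    return theta

# Hypothetical data: the positive class lies on both sides of the negative one.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [1, 1, 0, 1, 1]
X_lin = [[1.0, x] for x in xs]          # features: 1, x
X_poly = [[1.0, x, x * x] for x in xs]  # features: 1, x, x^2

theta_lin = fit(X_lin, y)
theta_poly = fit(X_poly, y)
cost_lin = cost(theta_lin, X_lin, y)
cost_poly = cost(theta_poly, X_poly, y)
preds_poly = [predict(theta_poly, xi) for xi in X_poly]
print(cost_lin, cost_poly)
```

On this toy data the model with the squared feature ends with a lower training cost than the linear one, both costs stay non-negative, and every prediction stays strictly between 0 and 1.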

Question 3

For logistic regression, the gradient is given by $\frac{\partial}{\partial\theta_j}J(\theta)=\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x_j^{(i)}$ . Which of these is a correct gradient descent update for logistic regression with a learning rate of α? Check all that apply.

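  • $\theta:=\theta-\alpha\frac{1}{m}\sum_{i=1}^{m}\left(\theta^Tx-y^{(i)}\right)x^{(i)}$
  • $\theta_j:=\theta_j-\alpha\frac{1}{m}\sum_{i=1}^{m}\left(\frac{1}{1+e^{-\theta^Tx^{(i)}}}-y^{(i)}\right)x_j^{(i)}$ (simultaneously update for all j ).
  • $\theta_j:=\theta_j-\alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x^{(i)}$ (simultaneously update for all j ).
  • $\theta_j:=\theta_j-\alpha\frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)})-y^{(i)}\right)x_j^{(i)}$ (simultaneously update for all j ).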

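*     Answer: 2 4 *
* Option 1: θTx is the linear regression hypothesis, not the logistic one. Incorrect. **
* Option 2: Correct. It is option 4 with hθ(x(i)) written out as the sigmoid. **
* Option 3: Differs from option 4 in using the whole vector x(i) where the single component x(i)j is needed; see the derivation if this is unclear. Incorrect. **
* Option 4: Correct. **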
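The correct update (options 2 and 4) can be sketched in a few lines of plain Python. The function name `gradient_step` and the four-example dataset are made up for illustration; the first column of `X` is the intercept feature, fixed at 1.

```python
import math

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def gradient_step(theta, X, y, alpha):
    """One gradient descent step that updates every theta_j simultaneously."""
    m = len(X)
    # All predictions are computed with the OLD theta before any update.
    h = [sigmoid(sum(t * xj for t, xj in zip(theta, xi))) for xi in X]
    # grad[j] = (1/m) * sum_i (h_i - y_i) * x_j^(i); note the component x_j.
    grad = [sum((h[i] - y[i]) * X[i][j] for i in range(m)) / m
            for j in range(len(theta))]
    return [t - alpha * g for t, g in zip(theta, grad)]

# Hypothetical training set: intercept column plus one feature.
X = [[1.0, 0.5], [1.0, 1.5], [1.0, 3.0], [1.0, 4.0]]
y = [0, 0, 1, 1]
theta = gradient_step([0.0, 0.0], X, y, alpha=0.1)
print(theta)  # theta_0 stays at 0, theta_1 moves to about 0.0625
```

Building the whole gradient list before touching `theta` is what makes the update simultaneous: no θj is changed until every partial derivative has been evaluated at the old parameters.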

Question 4

Which of the following statements are true? Check all that apply.

  • For logistic regression, sometimes gradient descent will converge to a local minimum (and fail to find the global minimum). This is the reason we prefer more advanced optimization algorithms such as fminunc (conjugate gradient/BFGS/L-BFGS/etc).
  • The sigmoid function $g(z)=\frac{1}{1+e^{-z}}$ is never greater than one (>1).
  • The cost function J(θ) for logistic regression trained with m≥1 examples is always greater than or equal to zero.
  • Linear regression always works well for classification if you classify by using a threshold on the prediction made by linear regression.
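    *     Answer: 2 3 *
    * Option 1: Incorrect. Gradient descent can find the global minimum because the logistic regression cost function is convex. The reasons to use more advanced algorithms are that there is "no need to pick α" and that they usually reach the global minimum faster. **
    * Option 2: The sigmoid function takes values in (0,1). Correct. **
    * Option 3: The cost function is always greater than or equal to 0. Correct. **
    * Option 4: Incorrect. Thresholding the output of linear regression often classifies poorly: a few extreme examples can shift the fitted line and move the threshold crossing, and the predictions are not confined to the range from 0 to 1. **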
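Option 4 can be tested with a small counterexample. The sketch below fits ordinary least squares to a made-up 1-D dataset, thresholds the prediction at 0.5, and then repeats the fit after adding one far-away positive example. The data and the helper names `fit_line` and `classify` are assumptions for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def classify(intercept, slope, x, threshold=0.5):
    """Predict 1 when the linear regression output reaches the threshold."""
    return 1 if intercept + slope * x >= threshold else 0

# Cleanly separated toy data: class 0 for x <= 3, class 1 for x >= 4.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [0, 0, 0, 1, 1, 1]
b0, b1 = fit_line(xs, ys)
clean_preds = [classify(b0, b1, x) for x in xs]

# Add one extreme but correctly labeled positive example and refit.
c0, c1 = fit_line(xs + [30.0], ys + [1])
outlier_preds = [classify(c0, c1, x) for x in xs]

print(clean_preds)    # [0, 0, 0, 1, 1, 1]
print(outlier_preds)  # [0, 0, 0, 0, 1, 1]
```

The extra point at x=30 flattens the fitted line, so the 0.5 crossing moves to the right and the example at x=4 is now misclassified, even though the new point agrees with the original pattern.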

Question 5

Suppose you train a logistic classifier hθ(x)=g(θ0+θ1x1+θ2x2) . Suppose θ0=6,θ1=−1,θ2=0. Which of the following figures represents the decision boundary found by your classifier?

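*     Answer: I reloaded the page several times but the options never appeared, only the question stem; I guessed one at random and it turned out to be right. *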
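Even without the figures, the boundary follows from the parameters in the question. The classifier predicts y=1 when θ0+θ1x1+θ2x2 ≥ 0, i.e. 6 − x1 ≥ 0, so the decision boundary is the vertical line x1 = 6, with y=1 predicted on its left (x1 ≤ 6) regardless of x2. The sketch below checks this numerically; the function names are made up for illustration.

```python
import math

# Parameters given in the question: theta_0 = 6, theta_1 = -1, theta_2 = 0.
theta = [6.0, -1.0, 0.0]

def h(x1, x2):
    """h_theta(x) = g(theta_0 + theta_1 * x1 + theta_2 * x2)."""
    z = theta[0] + theta[1] * x1 + theta[2] * x2
    return 1.0 / (1.0 + math.exp(-z))

def predict(x1, x2):
    """Predict 1 when h_theta(x) >= 0.5, which is the same as z >= 0."""
    return 1 if h(x1, x2) >= 0.5 else 0

print(predict(5.0, 100.0))   # left of x1 = 6: predicts 1
print(predict(7.0, -100.0))  # right of x1 = 6: predicts 0
print(h(6.0, 3.0))           # on the boundary: 0.5
```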
