ML Mistake Log




Week 2


1. Suppose m = 4 students have taken some class, and the class had a midterm exam and a final exam. You have collected a dataset of their scores on the two exams, which is as follows:

| midterm exam | (midterm exam)² | final exam |
|---|---|---|
| 89 | 7921 | 96 |
| 72 | 5184 | 74 |
| 94 | 8836 | 87 |
| 69 | 4761 | 78 |

You'd like to use polynomial regression to predict a student's final exam score from their midterm exam score. Concretely, suppose you want to fit a model of the form hθ(x) = θ₀ + θ₁x₁ + θ₂x₂, where x₁ is the midterm score and x₂ is (midterm score)². Further, you plan to use both feature scaling (dividing by the "max-min", or range, of a feature) and mean normalization.

What is the normalized feature x₂⁽²⁾? (Hint: midterm = 72, final = 74 is training example 2.) Please round off your answer to two decimal places and enter in the text box below.


[Explanation] Mean normalization

Replace xᵢ with xᵢ − μᵢ so that the features have approximately zero mean. Do not apply this to x₀ = 1.

Mean normalization combined with scaling by the range:


$$ x_i := \dfrac{x_i - \text{avg}}{\text{max} - \text{min}} $$


avg = (7921 + 5184 + 8836 + 4761) / 4 = 6675.5

answer = (5184 - 6675.5) / (8836 - 4761) = -1491.5 / 4075 ≈ -0.37
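The same arithmetic can be checked with a short Python sketch. The helper name `normalize` and the variable names are illustrative assumptions, not part of the quiz; the data is the table from the question.

```python
# Midterm scores from the question; x2 is the squared midterm score.
midterm = [89, 72, 94, 69]
midterm_sq = [score ** 2 for score in midterm]  # [7921, 5184, 8836, 4761]


def normalize(values):
    """Mean-normalize, then divide by the range (max - min)."""
    mean = sum(values) / len(values)
    value_range = max(values) - min(values)
    return [(v - mean) / value_range for v in values]


x2_normalized = normalize(midterm_sq)

# Training example 2 is index 1 (midterm = 72).
answer = round(x2_normalized[1], 2)
print(answer)  # -0.37
```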




2. Which of the following are reasons for using feature scaling?

 It speeds up gradient descent by making it require fewer iterations to get to a good solution.

[Explanation] Feature scaling speeds up gradient descent by avoiding many extra iterations that are required when one or more features take on much larger values than the rest.
The cost function J(θ) for linear regression has no local optima.
The magnitude of the feature values is insignificant in terms of computational cost.
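The iteration-count claim can be illustrated with a small sketch, under stated assumptions. The toy data, the function names (`scale`, `cost`, `gradient_descent`), the tolerance, and the learning rates are all made up for illustration and do not come from the quiz. Each run uses a fixed learning rate chosen inside the stable range for its own data, so this is a rough comparison rather than a benchmark.

```python
# Hypothetical toy data: two features with very different ranges.
x1 = [-1.0, 1.0, -1.0, 1.0]
x2 = [-1000.0, -1000.0, 1000.0, 1000.0]
y = [3 + 2 * a + 0.005 * b for a, b in zip(x1, x2)]

MAX_ITERS = 10000


def scale(column):
    """Mean normalization followed by division by the range."""
    mean = sum(column) / len(column)
    value_range = max(column) - min(column)
    return [(v - mean) / value_range for v in column]


def predict(theta, row):
    return sum(t * x for t, x in zip(theta, row))


def cost(rows, ys, theta):
    """Squared-error cost J(theta) for linear regression."""
    m = len(ys)
    return sum((predict(theta, r) - t) ** 2 for r, t in zip(rows, ys)) / (2 * m)


def gradient_descent(rows, ys, lr, tol=1e-6):
    """Batch gradient descent; returns iterations used (capped at MAX_ITERS)."""
    m = len(ys)
    theta = [0.0] * len(rows[0])
    for iteration in range(1, MAX_ITERS + 1):
        errors = [predict(theta, r) - t for r, t in zip(rows, ys)]
        grad = [sum(e * r[j] for e, r in zip(errors, rows)) / m
                for j in range(len(theta))]
        theta = [t - lr * g for t, g in zip(theta, grad)]
        if cost(rows, ys, theta) < tol:
            return iteration
    return MAX_ITERS


# Each row starts with x0 = 1, which is never scaled.
raw_rows = [[1.0, a, b] for a, b in zip(x1, x2)]
scaled_rows = [[1.0, a, b] for a, b in zip(scale(x1), scale(x2))]

# The large x2 values force a tiny learning rate on the raw data.
unscaled_iters = gradient_descent(raw_rows, y, lr=1e-6)
scaled_iters = gradient_descent(scaled_rows, y, lr=1.0)

print("unscaled:", unscaled_iters, "scaled:", scaled_iters)
```

On the raw data the step size is limited by the large feature, so the other parameters move very slowly and the run reaches the iteration cap. After scaling, all features have comparable ranges and a much larger step size is stable.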

