I. Theoretical Basis
1. Evaluating a Hypothesis
To check whether an algorithm overfits, we split the data into a training set and a test set, typically using 70% of the data for training and the remaining 30% for testing. Both sets should contain examples of every type, so the data must be shuffled before being split into training and test sets.
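A minimal Octave sketch of such a shuffled 70/30 split (X and y here are assumed to hold the full dataset; randperm is the standard shuffling primitive):

    m = size(X, 1);
    idx = randperm(m);                    % random permutation of 1..m ("shuffle")
    m_train = round(0.7 * m);             % 70% of the examples go to training
    X_train = X(idx(1:m_train), :);       y_train = y(idx(1:m_train));
    X_test  = X(idx(m_train+1:end), :);   y_test  = y(idx(m_train+1:end));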
2. Model Selection and Cross-Validation Sets
Use a cross-validation set to select a model that generalizes better: use 60% of the data as the training set, 20% as the cross-validation set, and 20% as the test set.
The model selection procedure is (see the sketch after this list):
1. Train each candidate model (for example, polynomial models of degree 1 through 10) on the training set to obtain its parameters.
2. Compute each trained model's error on the cross-validation set and select the model with the lowest cross-validation error.
3. Estimate the selected model's generalization error on the test set.
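The sketch below illustrates this procedure for choosing a polynomial degree. It assumes the exercise's polyFeatures and trainLinearReg helpers (defined later in this write-up) and the X/y (training) and Xval/yval (cross-validation) sets; in practice the polynomial features should also be normalized first:

    best_d = 0;  best_err = Inf;
    for d = 1:10
        % Map both sets to degree-d polynomial features and add the bias column.
        Xp  = [ones(size(X, 1), 1),    polyFeatures(X, d)];
        Xvp = [ones(size(Xval, 1), 1), polyFeatures(Xval, d)];
        theta = trainLinearReg(Xp, y, 0);                        % fit on the training set only
        err = 1/(2*size(Xvp, 1)) * sum((Xvp*theta - yval).^2);   % CV error, no lambda term
        if err < best_err
            best_err = err;  best_d = d;      % keep the degree with the lowest CV error
        end
    end
    % best_d is then evaluated once on the test set to estimate generalization error.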
II. Code Implementation
1. linearRegCostFunction.m (Regularized linear regression cost function): computes the regularized cost
J(theta) = 1/(2m) * sum_{i=1}^{m} (h_theta(x^(i)) - y^(i))^2 + lambda/(2m) * sum_{j>=1} theta_j^2
and its gradient; the bias parameter theta(1) is not regularized.
function [J, grad] = linearRegCostFunction(X, y, theta, lambda)
    m = length(y);  % number of training examples
    % Squared-error cost plus the regularization term (theta(1) is excluded from it).
    J = 1/(2*m) * (X*theta - y)' * (X*theta - y) + lambda/(2*m) * (theta'*theta - theta(1)^2);
    % Gradient of the unregularized cost...
    grad = 1/m * (X' * (X*theta - y));
    % ...plus the regularization term for every parameter except the bias.
    grad(2:end) = grad(2:end) + (lambda/m) * theta(2:end);
    grad = grad(:);
end
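A quick smoke test, mirroring how the exercise's ex5.m script calls this function (it assumes X, y, and m from ex5data1.mat are in scope; the expected values in the comments are the ones quoted by the assignment):

    theta = [1; 1];
    [J, grad] = linearRegCostFunction([ones(m, 1) X], y, theta, 1);
    fprintf('Cost at theta = [1; 1]: %f\n', J);         % the assignment expects about 303.993
    fprintf('Gradient: [%f; %f]\n', grad(1), grad(2));  % expected about [-15.303; 598.251]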
2. learningCurve.m (Generates a learning curve): implements code to generate the learning curves that are useful for debugging learning algorithms.
The training error for a dataset is defined without the regularization term:
J_train(theta) = 1/(2m) * sum_{i=1}^{m} (h_theta(x^(i)) - y^(i))^2
When plotting the learning curve, this error is evaluated on only the first i training examples (so m = i above), while the cross-validation error is always evaluated over the entire cross-validation set.
function [error_train, error_val] = ...
    learningCurve(X, y, Xval, yval, lambda)
    m = size(X, 1);     % number of training examples
    n = size(Xval, 1);  % number of cross-validation examples
    error_train = zeros(m, 1);
    error_val   = zeros(m, 1);
    % As the training-set size i grows, the two error curves gradually converge.
    for i = 1:m
        % Fit theta on the first i training examples only.
        theta = trainLinearReg(X(1:i, :), y(1:i), lambda);
        % Training error on those i examples (without the regularization term).
        error_train(i) = 1/(2*i) * sum((X(1:i, :)*theta - y(1:i)).^2);
        % Cross-validation error, always on the full validation set.
        error_val(i) = 1/(2*n) * sum((Xval*theta - yval).^2);
    end
end
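As in ex5.m, the two error vectors can then be plotted against the training-set size (a bias column is prepended to both sets first; lambda = 0 here matches the linear case in the exercise):

    lambda = 0;
    [error_train, error_val] = learningCurve([ones(m, 1) X], y, ...
                                             [ones(size(Xval, 1), 1) Xval], yval, lambda);
    plot(1:m, error_train, 1:m, error_val);
    title('Learning curve for linear regression');
    legend('Train', 'Cross Validation');
    xlabel('Number of training examples');  ylabel('Error');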
3. polyFeatures.m (Maps data into polynomial feature space): addresses underfitting (high bias) by adding more features. For polynomial regression, the hypothesis takes the form:
h_theta(x) = theta_0 + theta_1*x + theta_2*x^2 + ... + theta_p*x^p
so each power of the original feature becomes a new feature.
function [X_poly] = polyFeatures(X, p)
    % Column i holds the original feature raised to the i-th power:
    % X_poly(i, :) = [X(i) X(i).^2 X(i).^3 ... X(i).^p];
    X_poly = zeros(numel(X), p);
    for i = 1:p
        X_poly(:, i) = X.^i;
    end
end
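ex5.m then maps and normalizes the features before training, roughly as below (featureNormalize is another helper provided by the exercise; normalization matters because the higher powers of x differ wildly in scale):

    p = 8;                                            % degree used in the exercise
    X_poly = polyFeatures(X, p);                      % map to polynomial features
    [X_poly, mu, sigma] = featureNormalize(X_poly);   % zero mean, unit variance
    X_poly = [ones(m, 1), X_poly];                    % add the bias column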
4. validationCurve.m (Generates a cross validation curve): computes the training and cross-validation errors over a range of lambda values so the best regularization parameter can be selected.
function [lambda_vec, error_train, error_val] = ...
    validationCurve(X, y, Xval, yval)
    % Selected values of lambda (you should not change this).
    lambda_vec = [0 0.001 0.003 0.01 0.03 0.1 0.3 1 3 10]';
    error_train = zeros(length(lambda_vec), 1);
    error_val   = zeros(length(lambda_vec), 1);
    m = size(X, 1);
    n = size(Xval, 1);
    for i = 1:length(lambda_vec)
        lambda = lambda_vec(i);
        % Train with regularization, but evaluate both errors without the lambda term.
        theta = trainLinearReg(X, y, lambda);
        error_train(i) = 1/(2*m) * sum((X*theta - y).^2);        % training error
        error_val(i)   = 1/(2*n) * sum((Xval*theta - yval).^2);  % cross-validation error
    end
end
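A sketch of using this function to pick lambda, assuming X_poly and X_poly_val are the normalized polynomial feature matrices from the previous step (on the exercise's data the best value comes out around lambda = 3):

    [lambda_vec, error_train, error_val] = validationCurve(X_poly, y, X_poly_val, yval);
    plot(lambda_vec, error_train, lambda_vec, error_val);
    legend('Train', 'Cross Validation');
    xlabel('lambda');  ylabel('Error');
    [~, i] = min(error_val);       % index of the lowest cross-validation error
    best_lambda = lambda_vec(i);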