Optional lab: Linear Regression using Scikit-LearnⅡ

scikit-learn is an open-source, commercially usable machine learning toolkit. It contains implementations of many of the algorithms used in this course.

Goals

In this lab you will use scikit-learn to implement linear regression using a closed-form solution based on the normal equation.

Tools

You will utilize functions from scikit-learn as well as matplotlib and NumPy.

import numpy as np
np.set_printoptions(precision=2)
from sklearn.linear_model import LinearRegression, SGDRegressor
from sklearn.preprocessing import StandardScaler
from lab_utils_multi import  load_house_data
import matplotlib.pyplot as plt
dlblue = '#0096ff'; dlorange = '#FF9300'; dldarkred='#C00000'; dlmagenta='#FF40FF'; dlpurple='#7030A0'; 
plt.style.use('./deeplearning.mplstyle')

Linear Regression, closed-form solution

Scikit-learn provides a linear regression model, LinearRegression, that implements a closed-form solution.

We reuse the data from the earlier labs:

Size (1000 sqft)    Price (1000s of dollars)
1                   300
2                   500

Load the data set

X_train = np.array([1.0, 2.0])   #features
y_train = np.array([300, 500])   #target value

Create and fit the model

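The code below performs the regression with scikit-learn.
The first step creates a regression object. The second step calls fit, one of the methods associated with that object, which performs the regression by fitting the parameters to the input data. The toolkit expects X to be a two-dimensional matrix.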

linear_model = LinearRegression()
#X must be a 2-D Matrix
linear_model.fit(X_train.reshape(-1, 1), y_train) 

Output:

LinearRegression()
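
The requirement that X be two-dimensional is easy to trip over with a single feature. The sketch below uses only NumPy and shows what reshape(-1, 1) does to the training array: it turns a flat vector of m values into an (m, 1) matrix, with one row per example and one column per feature. The names X_2d and X_alt are illustrative and do not appear in the lab.

```python
import numpy as np

# Same single-feature training data as in the lab.
X_train = np.array([1.0, 2.0])
print(X_train.shape)          # 1-D: two values, no notion of "columns"

# -1 lets NumPy infer the number of rows; 1 fixes a single feature column.
X_2d = X_train.reshape(-1, 1)
print(X_2d.shape)             # 2-D: (number of examples, number of features)

# An equivalent way to add the feature axis.
X_alt = X_train[:, np.newaxis]
print(np.array_equal(X_2d, X_alt))
```

With several features the data is already a matrix of shape (examples, features), so no reshape is needed.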

View parameters

The parameters $\mathbf{w}$ and $\mathbf{b}$ are called the coefficients and the intercept in scikit-learn.

b = linear_model.intercept_
w = linear_model.coef_
print(f"w = {w}, b = {b:0.2f}")
# size is in units of 1000 sqft, so a 1200 sqft house is x = 1.2
print(f"'manual' prediction: f_wb = wx+b : {1.2*w + b}")

Output (the prediction is in 1000s of dollars):

w = [200.], b = 100.00
'manual' prediction: f_wb = wx+b : [340.]
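
The values w = 200 and b = 100 can be cross-checked against the normal equation mentioned in the goals. As a minimal sketch, append a column of ones to X so that the intercept is learned as an extra parameter, then solve $(A^T A)\,\theta = A^T y$ for $\theta = [w, b]$. The names A and theta are illustrative. This shows the idea of a closed-form solution and is not necessarily how scikit-learn computes it internally.

```python
import numpy as np

# Training data from the lab: size in 1000 sqft, price in 1000s of dollars.
X = np.array([1.0, 2.0]).reshape(-1, 1)
y = np.array([300.0, 500.0])

# Augment X with a column of ones so the last parameter acts as the intercept b.
A = np.column_stack([X, np.ones(X.shape[0])])

# Normal equation: (A^T A) theta = A^T y. Solving the linear system
# avoids forming the matrix inverse explicitly.
theta = np.linalg.solve(A.T @ A, A.T @ y)
w, b = theta[:-1], theta[-1]

print(f"w = {w}, b = {b:0.2f}")
```

Solving the system with np.linalg.solve is usually preferred over computing the inverse, which tends to be slower and less accurate.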

Make predictions

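Call the predict function to generate predictions.

y_pred = linear_model.predict(X_train.reshape(-1, 1))

print("Prediction on training set:", y_pred)

# size is in units of 1000 sqft, so a 1200 sqft house is x = 1.2;
# the model predicts in 1000s of dollars, hence the factor of 1000
X_test = np.array([[1.2]])
print(f"Prediction for 1200 sqft house: ${linear_model.predict(X_test)[0]*1000:0.2f}")

Output:

Prediction on training set: [300. 500.]
Prediction for 1200 sqft house: $340000.00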

Second Example

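The second example comes from an earlier lab with multiple features. The final parameter values and predictions are very close to the results of that lab's un-normalized "long-run".
That un-normalized run took hours to produce results, whereas the run below is nearly instantaneous. The closed-form solution works well on smaller data sets but can be computationally demanding on larger ones.
The closed-form solution does not require normalization.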
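
To see why normalization is unnecessary here, the sketch below fits the same least-squares problem twice, once on raw features and once on z-score normalized features, and compares the predictions. The data is synthetic and made up for illustration; it is not the lab's housing data. The helper fit_least_squares is a hypothetical name, built on NumPy's np.linalg.lstsq rather than scikit-learn.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic (hypothetical) data: 50 houses, 4 features on very different scales.
X = np.column_stack([
    rng.uniform(500, 3000, 50),   # size in sqft
    rng.integers(1, 6, 50),       # bedrooms
    rng.integers(1, 3, 50),       # floors
    rng.uniform(0, 100, 50),      # age in years
])
y = X @ np.array([0.25, -30.0, -60.0, -1.5]) + 200.0 + rng.normal(0, 5, 50)

def fit_least_squares(X, y):
    """Return (w, b) minimizing the squared error, via np.linalg.lstsq."""
    A = np.column_stack([X, np.ones(len(X))])   # ones column -> intercept
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return theta[:-1], theta[-1]

# Fit on raw features.
w_raw, b_raw = fit_least_squares(X, y)

# Fit on z-score normalized features.
mu, sigma = X.mean(axis=0), X.std(axis=0)
X_norm = (X - mu) / sigma
w_norm, b_norm = fit_least_squares(X_norm, y)

# The parameters differ, but the predictions agree.
pred_raw = X @ w_raw + b_raw
pred_norm = X_norm @ w_norm + b_norm
print(np.allclose(pred_raw, pred_norm))
```

Normalization only re-expresses the same linear model in different units (here w_norm / sigma recovers w_raw). It matters for gradient descent because it changes how fast the iterations converge, not which solution they converge to.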

# load the dataset
X_train, y_train = load_house_data()
X_features = ['size(sqft)','bedrooms','floors','age']
linear_model = LinearRegression()
linear_model.fit(X_train, y_train)
b = linear_model.intercept_
w = linear_model.coef_
print(f"w = {w:}, b = {b:0.2f}")

Output:

w = [  0.27 -32.62 -67.25  -1.47], b = 220.42

Compare the model's predictions with the targets, then predict the price of a new house:

print(f"Prediction on training set:\n {linear_model.predict(X_train)[:4]}")
print(f"prediction using w,b:\n {(X_train @ w + b)[:4]}")
print(f"Target values \n {y_train[:4]}")

# one example with four features: size(sqft), bedrooms, floors, age
x_house = np.array([1200, 3, 1, 40]).reshape(-1, 4)
x_house_predict = linear_model.predict(x_house)[0]
print(f" predicted price of a house with 1200 sqft, 3 bedrooms, 1 floor, 40 years old = ${x_house_predict*1000:0.2f}")

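
The "prediction using w,b" line is a plain dot product, which can be checked by hand. The sketch below plugs in the rounded coefficients printed earlier (w = [0.27, -32.62, -67.25, -1.47], b = 220.42), so the result only approximates what the fitted model returns with full-precision parameters. The names w_rounded, b_rounded and price_thousands are illustrative.

```python
import numpy as np

# Rounded parameters as printed by the lab (two decimal places only).
w_rounded = np.array([0.27, -32.62, -67.25, -1.47])
b_rounded = 220.42

# House to price: 1200 sqft, 3 bedrooms, 1 floor, 40 years old.
x_house = np.array([1200, 3, 1, 40])

# f_wb(x) = w . x + b, in 1000s of dollars.
price_thousands = x_house @ w_rounded + b_rounded
print(f"approximate price: ${price_thousands * 1000:0.2f}")
```

Term by term this is 324.00 - 97.86 - 67.25 - 58.80 + 220.42 = 320.51, or about $320,510. Rounding w[0] to 0.27 alone can shift the result by several thousand dollars, because it is multiplied by 1200.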

Congratulations!

In this lab you:

  • used scikit-learn, an open-source machine learning toolkit
  • implemented linear regression using the closed-form solution from that toolkit
