Summary of the Beijing Housing Price Linear Regression Project

1. Plotting

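import numpy as np
from matplotlib import pyplot as plt
%matplotlib inline

# x (areas), y (prices) and the fitted parameters w0 (intercept), w1 (slope)
# are assumed to have been defined earlier in the notebook.

# Scatter plot of the raw data
plt.scatter(x, y)
plt.xlabel("Area")
plt.ylabel("Price")

x_temp = np.linspace(50, 120, 100)  # temporary points used to draw the fitted line

# Scatter plot with the fitted line drawn on top in red
plt.scatter(x, y)
plt.plot(x_temp, x_temp*w1 + w0, 'r')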
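
The plotting code above uses w0 and w1 without showing where they come from. As a minimal sketch, assuming they are the ordinary least squares intercept and slope, they could be computed as follows. The sample data and the helper name `least_squares_fit` are hypothetical and not part of the original project.

```python
import numpy as np

# Hypothetical sample data (not from the original project): area vs. price
x = np.array([56, 72, 69, 88, 102, 86, 76, 79, 94, 74])
y = np.array([92, 102, 86, 110, 130, 99, 96, 102, 105, 92])


def least_squares_fit(x, y):
    """Return (w0, w1) minimizing the squared error of y ~ w0 + w1 * x."""
    n = len(x)
    # Closed-form solution of the normal equations for a single feature
    w1 = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (
        n * np.sum(x * x) - np.sum(x) ** 2)
    w0 = (np.sum(y) - w1 * np.sum(x)) / n
    return w0, w1


w0, w1 = least_squares_fit(x, y)
print(w0, w1)
```

The resulting w0 and w1 can then be passed to the plt.plot call above to draw the fitted line.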

2. Training the model

from sklearn.linear_model import LinearRegression

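# Define the linear regression model
model = LinearRegression()
# Train; fit expects a 2D feature array, so reshape x to shape (n_samples, 1)
model.fit(x.reshape(len(x), 1), y)

# Get the fitted parameters: intercept (w0) and coefficients (w1)
model.intercept_, model.coef_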
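
To make the reshape step and the returned parameters concrete, here is a small self-contained sketch. The data is made up so that it lies exactly on the line y = 2x + 5, which makes the expected intercept and slope easy to check. It is not the project's data.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical data lying exactly on y = 2x + 5
x = np.array([50.0, 60.0, 70.0, 80.0, 90.0])
y = 2.0 * x + 5.0

# fit expects X with shape (n_samples, n_features); -1 lets NumPy infer n_samples
X = x.reshape(-1, 1)

model = LinearRegression()
model.fit(X, y)

w0 = model.intercept_   # scalar intercept
w1 = model.coef_[0]     # coef_ is an array with one entry per feature
print(X.shape, w0, w1)
```

Note that coef_ is an array even with a single feature, so the slope is read with coef_[0].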

3. Prediction

model.predict([[150]])
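
The double brackets in the call above are needed because predict, like fit, takes a 2D array with one row per sample. The sketch below illustrates this with a model trained on made-up data (y = 2x + 5, not the project's data), predicting for one value and for several values at once.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data lying exactly on y = 2x + 5
x = np.array([50.0, 60.0, 70.0, 80.0, 90.0])
y = 2.0 * x + 5.0
model = LinearRegression().fit(x.reshape(-1, 1), y)

# One sample with one feature: shape (1, 1)
single = model.predict([[150]])

# Several samples: reshape a 1D array into a column, shape (3, 1)
batch = model.predict(np.array([100, 150, 200]).reshape(-1, 1))
print(single, batch)
```

In both cases the result is a 1D array with one prediction per input row.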

4. Reading a CSV file and extracting its data

import pandas as pd

df = pd.read_csv(
    "https://labfile.oss.aliyuncs.com/courses/1081/course-5-boston.csv")

features = df[['crim', 'rm', 'lstat']]
features.describe()
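
For trying out the column selection without network access, the sketch below reads a tiny in-memory CSV instead of the course file. The four rows are invented for illustration and only reuse the column names from above. It also shows the difference between selecting with a list of names and with a single name.

```python
import io
import pandas as pd

# Hypothetical rows that only mimic the column names of the course dataset
csv_text = """crim,rm,lstat,medv
0.1,6.5,5.0,24.0
0.2,6.0,9.0,21.6
0.3,7.1,4.0,34.7
0.4,5.9,12.0,18.9
"""

df = pd.read_csv(io.StringIO(csv_text))

features = df[['crim', 'rm', 'lstat']]  # list of names -> DataFrame
target = df['medv']                     # single name -> Series

summary = features.describe()           # count, mean, std, min, quartiles, max
print(summary)
```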

5. Splitting the dataset

from sklearn.model_selection import train_test_split

target = df['medv']  # target values

X_train, X_test, y_train, y_test = train_test_split(
    features, target, test_size=0.3, random_state=1)
X_train.shape, X_test.shape, y_train.shape, y_test.shape

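Note: splitting by hand (for example, slicing off the first 70% of the rows) does not give the same result as train_test_split, which shuffles the rows by default before splitting. Using train_test_split is still the recommended approach.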
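
The difference between the two approaches can be seen on a toy array. This is a sketch on made-up data (the integers 0 to 9), assuming "manual splitting" means slicing the rows in their original order.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 samples whose feature value equals their index
X = np.arange(10).reshape(-1, 1)
y = np.arange(10)

# Manual split: first 70% for training, last 30% for testing, order preserved
split = len(X) * 7 // 10
X_train_manual, X_test_manual = X[:split], X[split:]

# train_test_split: rows are shuffled (shuffle=True by default) before splitting;
# random_state makes the shuffle reproducible
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=1)

print(X_test_manual.ravel())  # always the last three rows
print(X_test.ravel())         # three rows drawn from the shuffled data
```

If the rows of a dataset are ordered in some meaningful way, the manual slice puts only the tail of that ordering in the test set, which is one reason to prefer the shuffled split.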
