Linear Regression

Ordinary least squares linear regression: sklearn.linear_model.LinearRegression(fit_intercept=True, normalize=False, copy_X=True, n_jobs=1)

Main parameters:

fit_intercept: boolean, default True. If True, an intercept term is added to the model; if False, no intercept is fitted.

normalize: boolean, default False. This parameter is ignored when fit_intercept=False. If normalize=True, each input feature is rescaled as (X - X.mean()) / ||X|| before fitting. If normalize=False, you can standardize the data yourself with sklearn.preprocessing.StandardScaler before training, as in the sketch below.
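A minimal sketch of that manual standardization route, keeping normalize=False; the feature matrix and targets below are made up purely for illustration:

import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression

X = np.array([[1.0, 200.0], [2.0, 300.0], [3.0, 500.0]])  # made-up features
y = np.array([1.0, 2.0, 4.0])                             # made-up targets

scaler = StandardScaler()            # zero mean, unit variance per column
X_scaled = scaler.fit_transform(X)   # fit the scaler, then transform the data

model = LinearRegression(fit_intercept=True)
model.fit(X_scaled, y)
print(model.coef_, model.intercept_)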

Attributes:

coef_: regression coefficients (slope)

intercept_: intercept term

Main methods:

①fit(X, y, sample_weight=None)

②predict(X)

③score(X, y, sample_weight=None): returns the coefficient of determination R², equal to 1 - ((y_true - y_pred) ** 2).sum() / ((y_true - y_true.mean()) ** 2).sum(), as checked in the sketch below.
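To make that equivalence concrete, here is a minimal sketch on made-up data (the arrays X and y are invented for illustration); score(X, y) should agree with the formula above:

import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up 1-D data for illustration only.
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([1.0, 3.0, 2.0, 5.0])

LR = LinearRegression().fit(X, y)
y_pred = LR.predict(X)

# score(X, y) returns R^2, which matches the manual computation.
r2_manual = 1 - ((y - y_pred) ** 2).sum() / ((y - y.mean()) ** 2).sum()
print(LR.score(X, y), r2_manual)  # the two values agree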

Using the diabetes dataset bundled with sklearn, build the simplest single-feature regression model:

In [1]: import numpy as np
   ...: import matplotlib.pyplot as plt  # used for the scatter plot below
   ...: from sklearn import datasets, linear_model
   ...: from sklearn.metrics import mean_squared_error, r2_score
   ...: from sklearn.model_selection import train_test_split
   ...: # Load the diabetes dataset
   ...: diabetes = datasets.load_diabetes()
   ...: # Keep only the third feature as a 2-D column,
   ...: # equivalent to diabetes.data[:, 2].reshape(diabetes.data[:, 2].size, 1)
   ...: X = diabetes.data[:, np.newaxis, 2]
   ...: y = diabetes.target
   ...: X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
   ...: LR = linear_model.LinearRegression()
   ...: LR.fit(X_train, y_train)
   ...: print('intercept_:%.3f' % LR.intercept_)
   ...: print('coef_:%.3f' % LR.coef_[0])
   ...: print('Mean squared error: %.3f' % mean_squared_error(y_test, LR.predict(X_test)))  # ((y_test - LR.predict(X_test)) ** 2).mean()
   ...: print('Variance score: %.3f' % r2_score(y_test, LR.predict(X_test)))  # 1 - ((y_test - LR.predict(X_test)) ** 2).sum() / ((y_test - y_test.mean()) ** 2).sum()
   ...: print('score: %.3f' % LR.score(X_test, y_test))
   ...: plt.scatter(X_test, y_test, color='green')
   ...: plt.plot(X_test, LR.predict(X_test), color='red', linewidth=3)
   ...: plt.show()
   ...:

intercept_:152.003
coef_:998.578
Mean squared error: 4061.826
Variance score: 0.233
score: 0.233

The resulting figure shows the test samples as a green scatter plot with the fitted regression line in red.
