Multiple Linear Regression with Python

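Watch out for the Python version; at the time of writing I was still stuck on this.

October 29: OK, solved. After switching Spyder to version 3.2.3 with Python 3.6.2, the problem went away.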

# -*- coding: utf-8 -*-

"""

Spyder Editor

This is a temporary script file.

"""

import numpy as np

import matplotlib.pyplot as plt

import pandas as pd

dataset = pd.read_csv('50_Startups.csv')

X = dataset.iloc[:, :-1].values

Y = dataset.iloc[:,4].values
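The two `iloc` slices above select columns by position. As a minimal sketch of what they return, the toy frame below stands in for `50_Startups.csv`; its values are made up and its column names are assumptions, since the real file is not reproduced here.

```python
import pandas as pd

# Toy stand-in for the CSV: values and column names are illustrative assumptions
toy = pd.DataFrame({
    'R&D Spend': [100.0, 200.0, 300.0],
    'Administration': [50.0, 60.0, 70.0],
    'Marketing Spend': [10.0, 20.0, 30.0],
    'State': ['New York', 'California', 'Florida'],
    'Profit': [500.0, 700.0, 900.0],
})

X_toy = toy.iloc[:, :-1].values   # every column except the last: the features
Y_toy = toy.iloc[:, 4].values     # the fifth column: the target

print(X_toy.shape)  # (3, 4)
print(Y_toy)        # [500. 700. 900.]
```

With five columns, `iloc[:, 4]` and `iloc[:, -1]` pick the same column; the negative index keeps working if columns are added in front.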

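from sklearn.compose import ColumnTransformer

from sklearn.preprocessing import OneHotEncoder

# One-hot encode the State column (index 3); the dummy columns come first, the numeric columns follow

ct = ColumnTransformer([('state', OneHotEncoder(), [3])], remainder = 'passthrough', sparse_threshold = 0)

X = np.asarray(ct.fit_transform(X), dtype = float)

# Drop the first dummy column to avoid the dummy variable trap

X = X[:, 1:]

# Hold out 20% of the rows as the test set

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size = 0.2, random_state = 0)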
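The `X = X[:, 1:]` step above drops one dummy column. A rough sketch of why, using made-up state labels and `pd.get_dummies` in place of the encoder: with an intercept and all three dummies, the columns are linearly dependent, so the design matrix loses rank.

```python
import numpy as np
import pandas as pd

# Made-up state labels, for illustration only
states = pd.Series(['New York', 'California', 'Florida', 'New York'])

all_dummies = pd.get_dummies(states).to_numpy(dtype=float)               # 3 dummy columns
reduced = pd.get_dummies(states, drop_first=True).to_numpy(dtype=float)  # 2 dummy columns

ones = np.ones((len(states), 1))
with_all = np.hstack([ones, all_dummies])   # intercept + 3 dummies
with_reduced = np.hstack([ones, reduced])   # intercept + 2 dummies

# Number of columns next to the matrix rank
print(with_all.shape[1], np.linalg.matrix_rank(with_all))          # 4 3
print(with_reduced.shape[1], np.linalg.matrix_rank(with_reduced))  # 3 3
```

In the first matrix the three dummies sum to the intercept column, so one of them carries no extra information; dropping it restores full column rank.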

from sklearn.linear_model import LinearRegression

regressor = LinearRegression()

regressor.fit(X_train,y_train)

# Check the results: predict on the test set

y_pre = regressor.predict(X_test)
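The predictions only say something once they are compared with the held-out targets. A minimal sketch of that comparison, assuming synthetic noise-free data instead of the startup dataset (all `*_demo` names are hypothetical):

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic data with an exact linear relation, so the fit should be near perfect
rng = np.random.default_rng(0)
X_demo = rng.uniform(0, 10, size=(50, 2))
y_demo = 5.0 + 3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1]

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=0)
model = LinearRegression().fit(X_tr, y_tr)
y_hat = model.predict(X_te)

print(model.coef_, model.intercept_)   # close to [3, -2] and 5
print(r2_score(y_te, y_hat))           # close to 1.0 on noise-free data
```

On the real data the same `r2_score(y_test, y_pre)` call would give a quick read on how well the fitted regressor generalizes.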

# Backward elimination

import statsmodels.api as sm

# axis = 1 joins column-wise; the column of ones (the intercept term) ends up as column 0

X = np.append(arr = np.ones((50,1)).astype(int),values = X ,axis = 1)
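The column of ones is needed because statsmodels' `OLS` does not add an intercept by itself. The sketch below shows the same `np.append` call on a small made-up array, sizing the ones from the data rather than hard-coding 50 rows (`X_demo` and `X_const` are hypothetical names).

```python
import numpy as np

X_demo = np.array([[1.5, 2.0],
                   [3.0, 4.5],
                   [5.5, 6.0]])

# Size the column of ones from the data instead of hard-coding the row count
ones = np.ones((X_demo.shape[0], 1))
X_const = np.append(arr=ones, values=X_demo, axis=1)

print(X_const)
```

Because `arr` is the ones column and `values` is the data, the intercept lands in column 0, which is why every `X_opt` selection keeps index 0. `statsmodels.api.add_constant` is another way to get a similar result.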

X_opt = X[:,[0,1,2,3,4,5]]

# All in: start with every predictor

regressor_OLS = sm.OLS(endog = Y, exog = X_opt).fit()

regressor_OLS.summary()

Figure 1: All in (OLS summary with every predictor)

X_opt = X[:,[0,1,3,4,5]]

# Refit on the reduced set of columns

regressor_OLS = sm.OLS(endog = Y, exog = X_opt).fit()

regressor_OLS.summary()

Figure 2: state2 removed

X_opt = X[:,[0,3,4,5]]

# Refit on the reduced set of columns

regressor_OLS = sm.OLS(endog = Y, exog = X_opt).fit()

regressor_OLS.summary()

Figure 3: State3 removed


X_opt = X[:,[0,3,5]]

# Refit on the reduced set of columns

regressor_OLS = sm.OLS(endog = Y, exog = X_opt).fit()

regressor_OLS.summary()

Figure 4: Next, keep removing the predictor that deviates the most and matters the least (the one with the highest p-value)

X_opt = X[:,[0,3]]

# Refit on the reduced set of columns

regressor_OLS = sm.OLS(endog = Y, exog = X_opt).fit()

regressor_OLS.summary()

Figure 5: OLS summary for X_opt = X[:,[0,3]]
