House Price Regression Process: Stacked Regressions

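Many people have published their Kernels for the house sale price regression project. Among them, Serigne's "Stacked Regressions to predict House Prices" is one of the most widely read, and readers can browse it directly on the Kaggle website. This article summarizes that Kernel and lists its main workflow steps below, so that readers can follow the overall line of reasoning.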

Stacked Regressions to predict House Prices

Data Processing

Outliers

Note

Target Variable

Log-transformation of the target variable

Features engineering

Missing Data

Data Correlation

Imputing missing values

More features engineering

Transforming some numerical variables that are really categorical

Label Encoding some categorical variables that may contain information in their ordering set

Adding one more important feature

Skewed features

Getting dummy categorical features

Getting the new train and test sets

Modelling

Import libraries

Define a cross validation strategy

Base models

LASSO Regression

Elastic Net Regression

Kernel Ridge Regression

Gradient Boosting Regression

XGBoost

LightGBM

Base models scores

Stacking models

Simplest Stacking approach: Averaging base models

Averaged base models class

Averaged base models score

Less simple Stacking: Adding a Meta-model

Stacking averaged Models Class

Stacking Averaged models Score

Ensembling StackedRegressor, XGBoost and LightGBM

Final Training and Prediction

Stacked Regressor

XGBoost

Ensemble prediction

Submission

Comments

Leader Board Ranking

RMSLE score on train data: 0.07658856703780222
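
The closing steps of the outline (log-transforming the target, scoring with RMSLE, and blending the stacked regressor with XGBoost and LightGBM) can be illustrated with a minimal sketch in plain Python. It is not the Kernel's code. The function names, the sample prices, the per-model predictions, and the blend weights are all hypothetical values chosen for illustration, and the sketch assumes that each model predicts on the log(1 + price) scale.

```python
import math


def log1p_target(prices):
    """Map raw sale prices to log(1 + price), the scale assumed for training."""
    return [math.log1p(p) for p in prices]


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("length mismatch")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def rmsle(prices_true, prices_pred):
    """RMSLE on raw prices is the RMSE of the log1p-transformed prices."""
    return rmse(log1p_target(prices_true), log1p_target(prices_pred))


def blend(predictions, weights):
    """Weighted average of several models' predictions (one list per model)."""
    if len(predictions) != len(weights):
        raise ValueError("one weight per model is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return [sum(w * p for w, p in zip(weights, row)) for row in zip(*predictions)]


# Hypothetical numbers for illustration only; not taken from the Kernel.
true_prices = [208500.0, 181500.0, 223500.0]
stacked_log = [12.25, 12.10, 12.30]  # stacked regressor, log scale
xgb_log = [12.20, 12.12, 12.35]      # XGBoost, log scale
lgb_log = [12.27, 12.08, 12.31]      # LightGBM, log scale

# Assumed weights: most of the trust goes to the stacked regressor.
ensemble_log = blend([stacked_log, xgb_log, lgb_log], weights=[0.70, 0.15, 0.15])

# Undo the log1p transform before scoring or writing a submission.
ensemble_prices = [math.expm1(v) for v in ensemble_log]
score = rmsle(true_prices, ensemble_prices)
```

Blending on the log scale and applying `expm1` only at the end keeps the ensemble on the same scale as the RMSLE metric. In practice the weights would be chosen from cross-validation scores rather than fixed by hand.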
