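Contents
1. Import the required packages
2. Load the data needed to build the model
3. Split the dataset
4. Import the training and cross-validation packages (LGBM)
5. Model training
6. Model validation and plotting
7. Plots showing the model validation results
8. Feature importance analysis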
Dataset link
S. Thai, H. Thai, B. Uy, T. Ngo, M. Naser, Test Database on Concrete-Filled Steel
Tubular Columns, 2019, https://doi.org/10.17632/3XKNB3SDB5.1
Import the pandas and numpy packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt  # needed for the plt.* plotting calls further down
import warnings
import joblib
warnings.filterwarnings("ignore")
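This regression model uses 5 features. Because it is a regression model, MSE is used as the initial evaluation metric.
train = pd.read_csv('F:\\database\\cfst\\cacfsttraint.csv')  # load the data; each row holds 5 features and 1 label
dem = train[["D (mm)", "t (mm)", "Le (mm)", "fy (MPa)", "fc (MPa)"]].values
# feature data
object = train[["N Test (kN)"]].values
# label data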
Split the dataset into a training set and a test set: 70% of the rows for training and 30% for testing.
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(dem, object, test_size=0.3, shuffle=True)
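The 70/30 split can be checked on stand-in data. The sketch below uses synthetic arrays rather than the CFST dataset, and it passes `random_state=42`, which is an assumption added here so that the split is reproducible; the original call does not fix a seed.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 100 rows, 5 features, 1 label column (not the CFST data).
rng = np.random.default_rng(0)
features = rng.uniform(1.0, 10.0, size=(100, 5))
labels = features.sum(axis=1, keepdims=True)

# random_state is an added assumption; it makes the shuffle repeatable between runs.
x_tr, x_te, y_tr, y_te = train_test_split(
    features, labels, test_size=0.3, shuffle=True, random_state=42
)

print(x_tr.shape, x_te.shape)  # (70, 5) (30, 5)
```

Without a fixed seed, each run produces a different split, so scores from separate runs are not directly comparable.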
from sklearn.metrics import mean_squared_error
import lightgbm as lgb
from sklearn.model_selection import KFold
mean_squared_error is the evaluation metric.
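# kf and rmse are not defined in the original; the 5-fold shuffled split below is an assumed setting
kf = KFold(n_splits=5, shuffle=True)
rmse = []  # per-fold scores
for train_indices, test_indices in kf.split(dem):
    X_train, X_test = dem[train_indices], dem[test_indices]
    Y_train, Y_test = object[train_indices], object[test_indices]
    lgbmRLR = lgb.LGBMRegressor()
    lgbmRLR.fit(X_train, Y_train.ravel())  # fit expects a 1-D label array
    y_pred = lgbmRLR.predict(X_test)
    # RMSE of the log-transformed values; requires strictly positive targets and predictions
    RSM = np.sqrt(mean_squared_error(np.log(Y_test.ravel()), np.log(y_pred)))
    rmse.append(RSM)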
You can add print(RSM) inside the loop to watch how the loss changes from fold to fold.
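The per-fold score computed in the loop is the root mean squared error of the log-transformed values, sqrt(mean((log y - log y_hat)^2)). A minimal NumPy sketch of that quantity follows; the helper name `log_rmse` and the sample numbers are hypothetical and are not taken from the dataset. Inputs are flattened first so that a label array of shape (n, 1) and a prediction array of shape (n,) do not broadcast into an (n, n) matrix.

```python
import numpy as np

def log_rmse(y_true, y_pred):
    """RMSE between log(y_true) and log(y_pred); hypothetical helper for illustration."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    # The logarithm is only defined for strictly positive values.
    if np.any(y_true <= 0) or np.any(y_pred <= 0):
        raise ValueError("log_rmse requires strictly positive values")
    return float(np.sqrt(np.mean((np.log(y_true) - np.log(y_pred)) ** 2)))

# Made-up values for two folds, only to show how per-fold scores are combined.
fold_scores = [
    log_rmse([[100.0], [200.0]], [110.0, 190.0]),
    log_rmse([[300.0], [400.0]], [290.0, 420.0]),
]
print(np.mean(fold_scores))  # average score across folds
```

Averaging the per-fold scores gives a single cross-validation figure that is easier to compare between model settings than the individual fold values.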
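test = pd.read_csv('F:\\database\\cfst\\test.csv')  # load the test set
# dlgbnmodel is assumed to be an already fitted LGBMRegressor; its definition is not shown here
test_pred = dlgbnmodel.predict(test.values)
result_df = pd.DataFrame(columns=["n_max"])  # collect the predictions for export to CSV
result_df["n_max"] = test_pred
joblib.dump(value=dlgbnmodel, filename='tiaocanLgbm.model2')  # save the fitted model to disk
result_df.to_csv("dllgbr_base.csv", index=False, header=True)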
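A model saved with `joblib.dump` can be restored later with `joblib.load`. The sketch below illustrates the round trip with a small scikit-learn `LinearRegression` as a stand-in estimator (an assumption chosen to keep the example light; it is not the LightGBM model from this walkthrough) and a temporary file path that is likewise hypothetical.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Stand-in estimator and toy data; not the LightGBM model used in this document.
X = np.arange(20, dtype=float).reshape(10, 2)
y = X[:, 0] * 2.0 + X[:, 1]
model = LinearRegression().fit(X, y)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "stand_in.model")  # hypothetical file name
    joblib.dump(model, path)      # serialize the fitted estimator
    restored = joblib.load(path)  # read it back from disk

# The restored estimator should reproduce the original predictions.
same = bool(np.allclose(model.predict(X), restored.predict(X)))
print(same)
```

Comparing predictions before and after reloading is a quick way to confirm that the saved file is usable.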
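ax = result_df["n_max"].plot(figsize=(16, 8))  # line plot of the predicted values
plt.show()
# plot_importance creates its own figure, so the size is passed to it directly
lgb.plot_importance(dlgbnmodel, max_num_features=30, figsize=(12, 6))
plt.title("Feature importances")
plt.show()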
Plot of the model's predictions