Getting Started with sklearn, the Python Machine Learning Library (1): Utilities (Data Splitting, Model Evaluation), from a Kaggle Competition



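from sklearn.metrics import mean_absolute_error  # mean absolute error (MAE) metric
from sklearn.model_selection import train_test_split  # train/validation split helper
from sklearn.tree import DecisionTreeRegressor  # model used below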

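# Split the data into training and validation sets, for both features and target.
# X (features) and y (target) are assumed to be defined earlier.
# The split relies on a random number generator. Passing a fixed integer as
# random_state makes the split identical on every run; if it is omitted,
# the split changes from run to run.
# The size of the validation set can be set with the test_size parameter.
train_X, val_X, train_y, val_y = train_test_split(X, y, random_state=0)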
# Define the model
melbourne_model = DecisionTreeRegressor()
# Fit the model on the training data
melbourne_model.fit(train_X, train_y)

# Predict on the validation set
val_predictions = melbourne_model.predict(val_X)
# Print the mean absolute error to judge how well the model performs (lower is better)
print(mean_absolute_error(val_y, val_predictions))
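
The snippet above depends on X and y, which the post does not show. A minimal self-contained sketch of the same workflow follows. The synthetic data, the explicit test_size=0.25 (which is also the default when test_size is omitted), and the names rng, model, same_split, and manual_mae are assumptions made for illustration, not part of the original Kaggle material. It also checks two claims from the comments: a fixed random_state reproduces the split, and mean_absolute_error equals the average of the absolute differences between targets and predictions.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data (an assumption; the post's real X and y are not shown).
rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(200, 3))
y = 2.0 * X[:, 0] + X[:, 1] + rng.normal(scale=0.5, size=200)

# Hold out 25% of the rows for validation; random_state fixes the shuffle.
train_X, val_X, train_y, val_y = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Splitting again with the same random_state yields the same partition.
train_X2, val_X2, train_y2, val_y2 = train_test_split(
    X, y, test_size=0.25, random_state=0
)
same_split = np.array_equal(val_X, val_X2) and np.array_equal(val_y, val_y2)

# Fit on the training rows only, then predict on the held-out rows.
model = DecisionTreeRegressor(random_state=0)
model.fit(train_X, train_y)
val_predictions = model.predict(val_X)

# MAE from sklearn, and the same quantity computed by hand.
mae = mean_absolute_error(val_y, val_predictions)
manual_mae = np.mean(np.abs(val_y - val_predictions))

print("validation rows:", len(val_X))
print("same split:", same_split)
print("MAE:", mae)
```

Evaluating on rows the model never saw during fitting is the point of the split: a decision tree can reproduce its training targets almost exactly, so an error measured on the training data would look better than the model really is.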
