监督学习,supervised learning

无监督学习,unsupervised learning

分类,classification

回归,regression

降维,dimensionality reduction

聚类,clustering

特征向量,feature vector

编译语言,compiled languages

解释型语言,interpreted languages

解释器,interpreter

布尔值,boolean

元组,tuple

算术运算,arithmetic operators

比较运算,comparison operators

赋值运算,assignment operators

逻辑运算,logical operators

成员运算,membership operators

二分类,binary classification

多分类,multiclass classification

多标签分类,multi-label classification

线性分类器,linear classifier

系数,coefficient

截距,intercept

参数,parameters

随机梯度上升,stochastic gradient ascent(SGA)

预测结果,predicted condition

正确标记,true condition

混淆矩阵,confusion matrix

准确性,accuracy

召回率,recall

精确率,precision
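
The four terms above (confusion matrix, accuracy, recall, precision) are related by simple counting. The sketch below is a minimal pure-Python illustration for binary labels, assuming 1 marks the positive class; the function name `binary_metrics` and the choice to return 0.0 when a denominator is zero are assumptions made for this example, not part of any library.

```python
def binary_metrics(y_true, y_pred):
    """Count confusion-matrix cells for binary labels (1 = positive, 0 = negative)
    and derive accuracy, precision, and recall from them."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    def ratio(num, den):
        # Assumption for this sketch: report 0.0 when the denominator is zero.
        return num / den if den else 0.0

    return {
        "confusion_matrix": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
        "accuracy": ratio(tp + tn, len(pairs)),   # correct predictions / all samples
        "precision": ratio(tp, tp + fp),          # correct positives / predicted positives
        "recall": ratio(tp, tp + fn),             # correct positives / actual positives
    }


if __name__ == "__main__":
    true_condition = [1, 1, 1, 0, 0, 0, 0, 1]
    predicted_condition = [1, 0, 1, 0, 1, 0, 0, 1]
    print(binary_metrics(true_condition, predicted_condition))
```
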

随机梯度下降模型,SGDClassifier

支持向量机分类器,support vector classifier

朴素贝叶斯,naive bayes

K近邻分类器,KNeighborsClassifier

无参数模型,nonparametric model

信息熵,information entropy;信息增益,information gain

基尼不纯性,gini impurity
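
Entropy and Gini impurity are both computed from the class proportions of a set of labels, which is how a decision tree scores a candidate split. The following is a small sketch using the standard definitions (entropy in bits, Gini as one minus the sum of squared proportions); the helper names are hypothetical and chosen only for this example.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Return the fraction of samples belonging to each distinct class."""
    counts = Counter(labels)
    total = len(labels)
    return [count / total for count in counts.values()]


def entropy(labels):
    """Shannon entropy in bits: -sum(p * log2(p)) over the class proportions."""
    return -sum(p * math.log2(p) for p in class_proportions(labels))


def gini_impurity(labels):
    """Gini impurity: 1 - sum(p^2) over the class proportions."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


if __name__ == "__main__":
    mixed = ["a", "a", "b", "b"]   # evenly mixed node: maximally impure for two classes
    pure = ["a", "a", "a", "a"]    # pure node: both measures are zero
    print(entropy(mixed), gini_impurity(mixed))
    print(entropy(pure), gini_impurity(pure))
```

Both measures are zero for a pure node and largest when the classes are evenly mixed, which is why either can serve as a split criterion.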

集成,ensemble

单一决策树,decision tree

随机森林分类器,random forest classifier

梯度提升决策树,gradient tree boosting

平均绝对误差,mean absolute error(MAE)

均方误差,mean squared error(MSE)
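
As a quick illustration of the two error measures above, the sketch below computes MAE and MSE in plain Python. The function names and the sample numbers are made up for this example.

```python
def mean_absolute_error(y_true, y_pred):
    """MAE: the average of |true - predicted| over all samples."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """MSE: the average of (true - predicted)^2 over all samples."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


if __name__ == "__main__":
    y_true = [3, 5, 2, 7]   # illustrative values only
    y_pred = [2, 5, 4, 7]
    print(mean_absolute_error(y_true, y_pred))
    print(mean_squared_error(y_true, y_pred))
```

Because MSE squares each error, a single large miss raises it far more than it raises MAE.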

极端随机森林,extremely randomized trees

随机回归森林,RandomForestRegressor

极端回归森林,ExtraTreesRegressor

核函数,kernel


scikit-learn

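Ranking of regression models on house price prediction, by R-squared (the proportion of the variance in the true values that the model's predictions explain, which also indicates the model's ability at numeric regression)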

1,gradient boosting regressor,0.8426

2,extra trees regressor,0.8195

3,random forest regressor,0.8024

4,SVM regressor(RBF kernel),0.7564

5,KNN regressor(distance-weighted),0.7198

6,decision tree regressor,0.6941

7,KNN regressor(uniform-weighted),0.6903

8,linear regressor,0.6763

9,SGDRegressor,0.6599

10,SVM regressor(linear kernel),0.6517

11,SVM regressor(poly kernel),0.4045
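
The R-squared scores in the ranking above follow the usual coefficient-of-determination formula, 1 - SS_res / SS_tot. The sketch below shows that formula in plain Python; the function name `r_squared` and the sample values are assumptions for illustration and are unrelated to the house-price data behind the table.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    SS_res is the sum of squared prediction errors; SS_tot is the sum of
    squared deviations of the true values from their mean. This sketch
    assumes the true values are not all identical (SS_tot > 0).
    """
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    y_true = [1.0, 2.0, 3.0]
    print(r_squared(y_true, [1.0, 2.0, 3.0]))  # perfect predictions
    print(r_squared(y_true, [2.0, 2.0, 2.0]))  # always predicting the mean
```

A model that always predicts the mean of the true values scores 0, a perfect model scores 1, and a model worse than the mean baseline scores below 0.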


泛化力,generalization

正则化,regularization

过拟合,overfitting


留一验证,leave-one-out cross validation

K折交叉验证,K-fold cross-validation
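
The two validation schemes above differ only in how the sample indices are split: K-fold partitions the data into K test folds, and leave-one-out is the special case where K equals the number of samples. The following is a minimal sketch without shuffling; `k_fold_indices` is a hypothetical helper written for this example, not a library function.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k consecutive folds.

    Returns a list of (train_indices, test_indices) pairs. When n_samples is
    not divisible by k, the first n_samples % k folds get one extra sample.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        stop = start + size
        test = list(range(start, stop))
        train = [j for j in range(n_samples) if j < start or j >= stop]
        folds.append((train, test))
        start = stop
    return folds


if __name__ == "__main__":
    # 5-fold split of 10 samples: each sample is in a test fold exactly once.
    for train, test in k_fold_indices(10, 5):
        print("train:", train, "test:", test)

    # Leave-one-out is the k == n_samples case: every test fold has one sample.
    print(len(k_fold_indices(4, 4)))
```

In practice the data is usually shuffled before splitting; that step is left out here to keep the index logic visible.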