【Practical Machine Learning】3.2 Decision Trees

Notes from my learning process.
Lecture: 3.2 Decision trees, the simplest and most commonly used model (Stanford Fall 2021: Practical Machine Learning, Chinese edition)

Decision trees can be used for both classification and regression tasks (classification trees and regression trees).

Decision Trees

  • Pros
    • Explainable: the learned rules can be read directly (see the sketch after this list)
    • Can handle both numerical features (split on greater/less than) and categorical features
  • Cons
    • Very non-robust: easily affected by noise in the data (ensembles help)
    • Complex trees cause over-fitting (prune the trees)
    • Not easy to parallelize: the tree is grown sequentially, so training performance suffers somewhat
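As a quick illustration of the explainability point, here is a minimal sketch of fitting a single classification tree with scikit-learn and printing its rules; the dataset and hyperparameters are illustrative assumptions, not from the lecture.

```python
# Minimal sketch: a single decision tree for classification (illustrative, not from the lecture).
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Limiting the depth is a simple form of pruning that helps against over-fitting.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X_train, y_train)

print("test accuracy:", tree.score(X_test, y_test))
# The learned if/else rules can be printed directly, which is what makes the model explainable.
print(export_text(tree))
```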

Random Forest

  • Train multiple decision trees to improve robustness
    • Each tree is trained independently
    • Majority voting for classification, averaging for regression
    • The cost is a slightly higher training budget
  • Where does the randomness come from? (two sources; see the sketch after this list)
    • Bagging: randomly sample training examples with replacement, train one tree per bootstrap sample, and repeat to obtain n trees
      • E.g. [1, 2, 3, 4, 5] → [1, 2, 2, 3, 4] (samples may repeat)
    • Randomly select a subset of the features instead of using them all
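To make the two sources of randomness concrete, here is a small from-scratch sketch that assumes scikit-learn's DecisionTreeRegressor as the base learner; the tree count, feature fraction, and depth are illustrative assumptions, not values from the lecture.

```python
# Sketch of a random forest built by hand: bagging + random feature subsets.
# The hyperparameters below are illustrative assumptions.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_random_forest(X, y, n_trees=10, feature_frac=0.5, max_depth=5, seed=0):
    """Train n_trees independent trees, each on a bootstrap sample and a random feature subset."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    k = max(1, int(d * feature_frac))
    forest = []
    for _ in range(n_trees):
        rows = rng.integers(0, n, size=n)             # bagging: sample rows with replacement
        cols = rng.choice(d, size=k, replace=False)   # random subset of features
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X[np.ix_(rows, cols)], y[rows])
        forest.append((tree, cols))
    return forest

def forest_predict(forest, X):
    # Regression: average the trees' predictions (a classifier would use majority voting instead).
    return np.mean([tree.predict(X[:, cols]) for tree, cols in forest], axis=0)
```

In practice one would simply use sklearn.ensemble.RandomForestRegressor / RandomForestClassifier, which implement the same idea.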

Gradient Boosting Decision Trees

  • Train multiple trees sequentially (we still train many trees, but no longer independently; they are trained one after another and together form one larger model; see the sketch after this list)
  • At step $t = 1, \ldots$, denote by $F_t(x)$ the sum of the previously trained trees (each tree is a function, and the model output is the sum of each tree's output)
    • Train a new tree $f_t$ on the residuals $\{(x_i,\, y_i - F_t(x_i))\}_{i=1,\ldots}$, i.e. not on the original labels but on the gap between the true value and the current prediction, the part the model has not yet captured; fitting this gap moves the ensemble closer to the true values
    • $F_{t+1}(x) = F_t(x) + f_t(x)$
  • The residual equals $-\partial L / \partial F$ when using the mean squared error loss: with $L = \frac{1}{2}(y - F(x))^2$ we get $-\partial L/\partial F = y - F(x)$, so each new tree fits the negative gradient, which is why the method is called gradient boosting (gradient descent itself is covered in a later section)
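A minimal from-scratch sketch of the residual-fitting loop above, again assuming scikit-learn's DecisionTreeRegressor as the base learner; the learning rate (shrinkage) is an extra assumption on top of the lecture's $F_{t+1} = F_t + f_t$ update.

```python
# Sketch of gradient boosting with squared loss: each tree fits the current residuals.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_gbdt(X, y, n_trees=50, lr=0.1, max_depth=3):
    F = np.zeros(len(y))              # F_1(x) = 0: prediction of the empty ensemble
    trees = []
    for _ in range(n_trees):
        residual = y - F              # negative gradient of 0.5 * (y - F)^2 w.r.t. F
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, residual)         # train f_t on {(x_i, y_i - F_t(x_i))}
        F = F + lr * tree.predict(X)  # F_{t+1} = F_t + lr * f_t (lr is an assumed shrinkage factor)
        trees.append(tree)
    return trees

def gbdt_predict(trees, X, lr=0.1):
    return lr * np.sum([t.predict(X) for t in trees], axis=0)
```

Libraries such as XGBoost and LightGBM implement this idea with many additional refinements.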

Summary

  • Decision tree: an explainable model for classification/regression
  • Ensemble trees to reduce bias and variance (a single decision tree is very sensitive to noise in the data; bias and variance are defined in a later section)
    • Random forest: trees trained in parallel, with randomness injected via bagging and feature subsampling
    • Gradient boosting trees: trained sequentially on residuals, each new tree fitting the part of the data the previous trees have not predicted well
  • Trees are widely used in industry
    • Simple, easy to tune (few hyperparameters), and often gives satisfactory results
