A Collection of Papers on Optimization Algorithms in Machine Learning

  • Single-machine deterministic optimization algorithms
    • Conjugate gradient method: https://nvlpubs.nist.gov/nistpubs/jres/049/jresv49n6p409_A1b.pdf
    • Coordinate descent: http://www.optimization-online.org/DB_FILE/2014/12/4679.pdf
    • Newton's method: https://www.researchgate.net/publication/221989049_Newton's_method_and_its_use_in_optimization
    • Quasi-Newton methods: http://www.math.unm.edu/~vageli/courses/Ma576/Dennis_More_SIAMRev77.pdf
    • Frank-Wolfe method: http://www.math.udel.edu/~angell/Opt/FW.pdf (lecture notes; the original paper was not found)
    • Nesterov's accelerated method: (link missing; see the sketch after this list)
    • Interior-point method: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.136.1990&rep=rep1&type=pdf
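
To make one entry above concrete, here is a minimal sketch of Nesterov's accelerated gradient method on a smooth convex quadratic. The objective, step size, and momentum schedule below are illustrative assumptions, not taken from any of the linked papers:

```python
import numpy as np

def nesterov_agd(grad, x0, lr, n_iters=200):
    """Nesterov acceleration: take a gradient step at a look-ahead
    point y, then extrapolate with a momentum sequence t."""
    x, y, t = x0.copy(), x0.copy(), 1.0
    for _ in range(n_iters):
        x_next = y - lr * grad(y)                     # gradient step at the look-ahead point
        t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y = x_next + (t - 1) / t_next * (x_next - x)  # momentum extrapolation
        x, t = x_next, t_next
    return x

# toy example: minimize f(x) = 0.5 * ||A x - b||^2
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, -1.0])
x_star = nesterov_agd(lambda x: A.T @ (A @ x - b), np.zeros(2), lr=0.05)
```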
       
  • Single-machine stochastic optimization algorithms
    • Stochastic gradient descent (SGD): https://pdfs.semanticscholar.org/34dd/d8865569c2c32dec9bf7ffc817ff42faaa01.pdf (see the sketch after this list)
    • Stochastic coordinate descent: https://pdfs.semanticscholar.org/ea2d/9bab20d116438409d6dde358db1572f4e547.pdf
    • Stochastic quasi-Newton method: http://users.iems.northwestern.edu/~nocedal/PDFfiles/stochBFGS.pdf
    • Stochastic dual coordinate ascent (SDCA): http://www.jmlr.org/papers/volume14/shalev-shwartz13a/shalev-shwartz13a.pdf
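
As a baseline for this section, a minimal sketch of vanilla SGD on a least-squares problem; the data, step size, and per-example sampling scheme are illustrative assumptions:

```python
import numpy as np

def sgd(X, y, lr=0.01, n_epochs=20, seed=0):
    """Vanilla SGD for least squares: visit the examples in a random
    order and step along each single-example negative gradient."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_epochs):
        for i in rng.permutation(n):
            g = (X[i] @ w - y[i]) * X[i]  # gradient of 0.5 * (x_i^T w - y_i)^2
            w -= lr * g
    return w

# toy data: y = X @ w_true + noise
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
w_true = rng.normal(size=5)
y = X @ w_true + 0.01 * rng.normal(size=200)
w_hat = sgd(X, y)
```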
       
  • Improvements to stochastic optimization algorithms
    • Stochastic variance reduced gradient (SVRG): https://papers.nips.cc/paper/4937-accelerating-stochastic-gradient-descent-using-predictive-variance-reduction.pdf (see the sketch after this list)
    • Stochastic average gradient (SAG): https://papers.nips.cc/paper/4633-a-stochastic-gradient-method-with-an-exponential-convergence-_rate-for-finite-training-sets.pdf
    • SAGA (a fast incremental gradient method building on SAG): https://www.di.ens.fr/~fbach/Defazio_NIPS2014.pdf
    • Mini-batch sampling methods:
      • https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/distr_mini_batch.pdf
      • http://www.optimization-online.org/DB_FILE/2011/11/3226.pdf
    • Weighted (importance) sampling methods:
      • http://tongzhang-ml.org/papers/icml15-sois.pdf
      • https://pdfs.semanticscholar.org/7173/45b6d9cfcb405ee4fd196ce7db147362f5ba.pdf
      • http://proceedings.mlr.press/v48/allen-zhuc16.pdf
    • Methods combining the above techniques:
      • SVRF: https://arxiv.org/pdf/1602.02101.pdf
      • SVRG-ADMM: "Fast-and-Light Stochastic ADMM" (link missing)
      • SVRG-BFGS: http://proceedings.mlr.press/v48/gower16.pdf
      • APCG: https://arxiv.org/pdf/1407.1296.pdf
      • MRBCD: https://www.cs.huji.ac.il/~shais/papers/asdcaNIPS.pdf (the filename suggests this link is actually the accelerated mini-batch SDCA paper rather than MRBCD)
      • Accelerated stochastic gradient descent: https://pdfs.semanticscholar.org/bef1/beb71c80e79e84f5c428fb5a0a527c1d78b4.pdf
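
The variance-reduction idea behind SVRG (and, in spirit, SAG/SAGA) can be sketched as follows. The least-squares objective, epoch length m = n, and step size are illustrative assumptions; see the SVRG paper linked above for the actual method and analysis:

```python
import numpy as np

def svrg(X, y, lr=0.05, n_outer=20, seed=0):
    """SVRG: keep a snapshot w_snap with its full gradient mu; inner
    steps use g_i(w) - g_i(w_snap) + mu, whose variance vanishes as
    w approaches the optimum, so a constant step size can be used."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    grad_i = lambda w, i: (X[i] @ w - y[i]) * X[i]  # per-example gradient
    for _ in range(n_outer):
        w_snap = w.copy()
        mu = X.T @ (X @ w_snap - y) / n             # full gradient at the snapshot
        for _ in range(n):                          # inner loop, epoch length m = n
            i = rng.integers(n)
            g = grad_i(w, i) - grad_i(w_snap, i) + mu
            w -= lr * g
    return w
```

How often to refresh the snapshot and whether to restart the inner loop from the last iterate or from an average are tunable choices discussed in the papers above.
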
  • Non-convex stochastic optimization algorithms (for the non-convex problems arising in deep learning)
    • Momentum-based stochastic gradient descent: http://www.cs.toronto.edu/~fritz/absps/momentum.pdf
    • AdaGrad: http://www.jmlr.org/papers/volume12/duchi11a/duchi11a.pdf
    • RMSProp: "Divide the gradient by a running average of its recent magnitude" (Tieleman & Hinton, Coursera Lecture 6.5; link missing)
    • AdaDelta: https://arxiv.org/pdf/1212.5701.pdf
    • Adam: https://arxiv.org/pdf/1412.6980.pdf (see the sketch after this list)
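
To make the adaptive-gradient family concrete, a minimal sketch of the Adam update with bias correction. The hyperparameter defaults follow the Adam paper linked above, while the toy Rosenbrock objective is an illustrative assumption:

```python
import numpy as np

def adam(grad, x0, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8, n_iters=1000):
    """Adam: exponential moving averages of the gradient (m) and its
    elementwise square (v), with bias correction of both estimates."""
    x = x0.copy()
    m = np.zeros_like(x)
    v = np.zeros_like(x)
    for t in range(1, n_iters + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g      # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g  # second-moment estimate
        m_hat = m / (1 - beta1 ** t)         # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return x

# toy non-convex objective: gradient of the Rosenbrock function
grad = lambda x: np.array([
    -2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0] ** 2),
    200 * (x[1] - x[0] ** 2),
])
x_min = adam(grad, np.array([-1.0, 1.0]), lr=0.02, n_iters=5000)
```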
